Closed SilvanScherrer closed 7 years ago
The current release is FIREFOX_45_5_0esr_RELEASE (4 weeks ago), this is what I will import now.
The vendor branch is updated, I'm merging it to master now. Note that there is a huge amount of changes between 38 and 45 and that's expected. There are about 90 files with conflicts and it will take a while to resolve them even though the conflicts are usually small. I'm also trying to do it right from the first commit, but that requires some code base lookups in the old and the new version to make a proper decision. After all, fixing conflicts the right way now will save a lot of time later.
On 11/29/16 03:17 AM, Dmitriy Kuminov wrote:
The current release is FIREFOX_45_5_0esr_RELEASE (4 weeks ago), this is what I will import now.
45_5_1 should be out today to fix a zero day vulnerability. Probably only affects Windows but sometimes these vulnerabilities still crash the browser on other platforms. https://www.wordfence.com/blog/2016/11/emergency-bulletin-firefox-0-day-wild/
Okay, thanks for the info. Though this will certainly have to wait until I merge anyway.
45.5.0 is merged in the above commit. The amount of changes is somewhat comparable to the merge of 38.1.0 (i.e. switching from 31 to 38). The next step is to build it of course. I'm sure this will also take some time.
I have some problems when pulling the current repo contents to my OS/2 machine. The fetch phase works well, the new commits are successfully downloaded. But the merge (fast-forward) phase doesn't go well: git either crashes or hangs. Our git build is rather old so no TRP files, nothing. It may be a memory issue. Investigating.
Regarding the git crash, I get exactly this: http://trac.netlabs.org/ports/ticket/38. We will have to fix git in order to proceed further.
Git pull seems to have stopped working here. My fallback is to do git fetch followed by git reset --hard origin/master, which will also remove local changes if you don't have them stashed. There are other options for the reset command.
Well git fetch isn't necessary after a failed pull as fetch is already done but yes, doing what you suggest should also bring the local copy up to date. However, we still need to fix git as it's not guaranteed that reset won't break too one day. And btw, there is also http://trac.netlabs.org/ports/ticket/39 which I get here if git doesn't crash or hang.
@dryeo JFYI, new git 2.11.0 has been just released as RPM. It works well here with both the gcc and mozilla repos, feel free to try.
On 12/13/16 04:32 AM, Dmitriy Kuminov wrote:
@dryeo https://github.com/dryeo JFYI, new git 2.11.0 has been just released as RPM. It works well here with both the gcc and mozilla repos, feel free to try.
It works much the same as the previous release. Still no garbage collection :) FYI, my previous problem with pull was actually a sh problem, wrong libcx
@dryeo hmm, that's something different. Could you please open a new ticket or reopen one of these http://trac.netlabs.org/ports/ticket/55, http://trac.netlabs.org/ports/ticket/56 if it applies to you? With more detail so that I could reproduce it locally.
After a few fixes configure runs fine, build time now.
Parts of OS/2 code were adapted to upstream changes in 874756ff0d28b8ba73bbd57cc91c0d4a39f85b11 and df96d00de0604e3dcc928870e9421e44e7afdbf0. There seems to be a lot of refactoring and this results in small and (relatively) big code breaks here and there.
More to come.
First obstacle: it seems that the GL library is now not optional. There is GLContextProviderEGL which is always present on all platforms and it seems to use libGLESv2. I need to investigate how to disable it as I doubt we have any suitable GL implementation on OS/2.
There is a working open source non-accelerated OpenGL implementation, but I suppose FF requires at least OpenGL 2, so it wouldn't work out of the box.
Yes, there is also some IBM work on Hobbes (http://hobbes.nmsu.edu/download/pub/os2/system/patches/opengl_gold_111.wpi, http://hobbes.nmsu.edu/download/pub/os2/system/patches/opengl.zip), I don't know if they are derived or not, but my fair guess is that these are very old implementations not suitable for Firefox (i.e. they need to be ported on their own before they can be used with FF and this is certainly beyond this ticket's task).
I found a way to disable GL completely and that's what I plan to do for now.
It also turns out that our RPM build of the ICU library which we now use for Firefox (instead of its own copy) disables some "internal" defines with #define U_HIDE_INTERNAL_API 1 in uconfig.h, but the new Mozilla code in gfxHarfBuzzShaper.cpp makes use of UTEXT_INITIALIZER which in turn needs UTEXT_MAGIC which is considered part of U_HIDE_INTERNAL_API. Since in the ICU build supplied with FF, U_HIDE_INTERNAL_API isn't set, UTEXT_MAGIC is available and all works. But not in our build. This looks like a defect of ICU headers to me.
Please discard the last comment. It was my local ICU build with some hacks which I don't remember why I did. Reverting it to the official RPM version (56.1-1) solves the UTEXT_MAGIC problem.
Another obstacle: they added a new library, protobuf (toolkit/components/protobuf). It might need some alignment to OS/2.
For the GL issue I created a separate ticket, #192.
Next big obstacle that needs porting is WebRTC. The WebRTC code itself hasn't changed a lot since esr38 so most likely WebRTC was optional back then and hence didn't cause any problems on OS/2.
There are actually two implementations of WebRTC now. The one that wasn't used before is chromium WebRTC. Now it's used for some camera stuff if I get it correctly. I will check more, maybe I will simply disable its usage for now (I see some defines for that).
There was some discussion on OS2World about webrtc, http://www.os2world.com/forum/index.php/topic,1236.msg11725.html#msg11725 Wim pointed out that our "usbecd.sys has by design insufficient isochronous buffering capability to sustain high speed and high bandwidth operation especially on larger image sizes"
@dryeo thanks Dave, though it was a false alarm. It turns out to be a wrong include dir order, fixed in f3b123227513147b22612b179391072acba8c23e (FF didn't use webrtc headers in ipc/glue back then so the problem didn't exist). WebRTC is still disabled on a few platforms (including OS/2) by default so it's something to do in the future. I created a ticket for that (#193).
Hmm, turns out that the GL stuff needs more work. There is a bunch of EGL headers that claim our platform is not supported. This is when compiling the gfx/angle part. (Yes, mozilla source code organization really sucks, the order in which things are built doesn't always match the directory tree).
EGL is done (was easy), now it seems that they updated libvpx and I need to see if it's possible to use the external one or simpler to update the Mozilla one to build on OS/2.
Libvpx 1.6.1 is ported to OS/2 and released as RPM. Firefox requires version 1.5.0 or above so now it can be built with --with-system-libvpx (a libvpx-devel installation is required of course). The Firefox build now goes much further. The next stop is minor build breaks in OS/2-related code regarding MIME and stuff.
Ok, after 59fe307c2e0aca7bc4fcb68687a2802f7a2c0f11 I finally get to the link stage. Some unresolved exports. Kinda expected. We are really close.
With a few more fixes (to be committed) building js.exe fails with this:
weakld: D:\Coding\mozilla\master-build\js\src\shell\Unified_cpp_js_src_shell0.obj - error: Invalid WKEXT record.
This message looks weird, never seen it before.
On 02/10/17 12:22 PM, Dmitriy Kuminov wrote:
With a few more fixes (to be committed) building js.exe fails with this:
weakld: D:\Coding\mozilla\master-build\js\src\shell\Unified_cpp_js_src_shell0.obj - error: Invalid WKEXT record.
This message looks weird, never seen it before.
Google returns a few hits that may be relevant.
I will have to dig into weakld I suppose.
When linking XUL.DLL I get the same error:
weakld: D:\Coding\mozilla\master-build\js\src\js_static.lib(D:\Coding\mozilla\master-build\js\src\Unified_cpp_js_src0.cpp) - error: Invalid WKEXT record.
And it's again the JS code. This may be a hint.
I changed the line, FILES_PER_UNIFIED_FILE = 8 to FILES_PER_UNIFIED_FILE = 1 in js/src/moz.build and now get this error,
In file included from C:/work/cc45esr/mozilla/js/src/jsfun.h:14:0,
253:39.74 from C:/work/cc45esr/mozilla/js/src/vm/Stack.h:15,
253:39.74 from C:/work/cc45esr/mozilla/js/src/vm/Probes.h:14,
253:39.74 from C:/work/cc45esr/mozilla/js/src/builtin/Profilers.cpp:28:
253:39.74 C:/work/cc45esr/mozilla/js/src/jsobj.h: In instantiation of 'bool JSObject::is() const [with T = js::ModuleObject]':
253:39.74 C:/work/cc45esr/mozilla/js/src/frontend/ParseNode.h:1714:58: required from here
253:39.74 C:/work/cc45esr/mozilla/js/src/jsobj.h:543:51: error: incomplete type 'js::ModuleObject' used in nested name specifier
253:39.74 inline bool is() const { return getClass() == &T::class_; }
253:39.74 ^
253:40.00
253:40.00 In the directory C:/work/cc45esr/mozilla/obj-fb/js/src
253:40.00 The following command failed to execute properly:
253:40.00 c++ -o Profilers.obj -c -DEXPORT_JS_API -DJS_HAS_CTYPES -DDLL_PREFIX="" -DDLL_SUFFIX=".dll" -DFFI_BUILDING -IC:/work/cc45esr/mozilla/js/src -I. -Ictypes/libffi/include -I../../dist/include -I/@unixroot/usr/include/nspr4 -DMOZILLA_CLIENT -include ../../js/src/js-confdefs.h -Uunix -U__unix -U__unix__ -MD -MP -MF .deps/Profilers.obj.pp -idirafter g:/OS2TK45/h -Wall -Wsign-compare -Wtype-limits -Wno-invalid-offsetof -Wcast-align -Zomf -fno-rtti -fno-exceptions -fno-math-errno -std=gnu++0x -pthread -DNDEBUG -DTRIMMED -g -mtune=generic -march=i686 -O3 -fomit-frame-pointer C:/work/cc45esr/mozilla/js/src/builtin/Profilers.cpp
253:40.03 make.EXE[5]: *** [Profilers.obj] Error 1
253:40.03 make.EXE[5]: *** Waiting for unfinished jobs....
253:53.51 make.EXE[4]: *** [js/src/target] Error 2
Sure taking a long time to build :)
Turns out that the link failure is due to this def in weakld.c (part of EMXOMFLD.EXE):
#define OMF_GETINDEX() (*u.puch & 0x80 ? ((*u.pch++ & 0x7f) << 8) + *u.pch++ : *u.pch++)
The second byte is read through the signed u.pch pointer, so it becomes a negative int if the first byte has a 0x80 bit and the second byte has it as well. As a result, a weak symbol's index was computed as a negative value out of index table bounds, hence the Invalid WKEXT record error.
With this version of the macro:
#define OMF_GETINDEX() (*u.puch & 0x80 ? ((*u.pch++ & 0x7f) << 8) + *u.puch++ : *u.pch++)
and with EMXOMFLD.EXE built by GCC 4.9.2 I get past this Invalid WKEXT record error. There are now some missing OS/2-specific exports but that's a different story. Will fix that.
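For anyone following along, here is a minimal sketch of the difference between the two macro variants quoted above. The function names are made up for illustration and this is not the weakld code; it only assumes the byte layout visible in the macro (high bit of the first byte set means a two-byte index). The bytes are read in separate statements so the evaluation order is explicit.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of weakld. Only the byte layout
 * (0x80 flag in the first byte selects the two-byte form) is taken
 * from the OMF_GETINDEX macro quoted in the thread. */

/* Mirrors the old macro: the second byte is read as a signed char. */
int omf_index_signed_low(const unsigned char *p, size_t *consumed)
{
    const signed char *sp = (const signed char *)p;
    if (p[0] & 0x80) {
        int hi = (p[0] & 0x7f) << 8;
        int lo = sp[1];   /* 0x80..0xff are sign-extended to negative values */
        *consumed = 2;
        return hi + lo;
    }
    *consumed = 1;
    return p[0];
}

/* Mirrors the changed macro: the second byte is read as unsigned. */
int omf_index_unsigned_low(const unsigned char *p, size_t *consumed)
{
    if (p[0] & 0x80) {
        int hi = (p[0] & 0x7f) << 8;
        int lo = p[1];    /* always 0..255 */
        *consumed = 2;
        return hi + lo;
    }
    *consumed = 1;
    return p[0];
}
```

With the bytes 0x80 0x80 the signed variant yields -128 while the unsigned one yields 128, which would match the negative out-of-bounds index described above. For indexes whose second byte is below 0x80 the two variants agree, which may be why smaller object files never triggered it.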
Got JS.EXE finally built; this means the static JS library (which used to be MOZJS.DLL and now lives in XUL.DLL) fully links now. The related changeset is 1b4fff0041ffbf714855e9e3eeee2d37414dfbef.
Building XUL.DLL now. Got to the end but there are some nasty unresolved symbols, _atomic_load_8 and friends among them. It seems that the new code uses 64-bit atomic primitives which our 32-bit GCC fails to provide (maybe a separate command line option is needed).
Seems that -march=i686 solves the problem with missing 64-bit atomics.
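A small sketch of the kind of code that needs 64-bit atomics, assuming GCC's __atomic builtins (the counter and function names are hypothetical, not taken from the Mozilla tree). My understanding, which I have not verified on OS/2, is that when the 32-bit target CPU selected by -march has no 8-byte atomic instruction GCC can emit calls to external helpers such as __atomic_load_8 instead of inline code, and that would explain both the unresolved symbols and why a newer -march makes them go away.

```c
#include <stdint.h>

/* Hypothetical 64-bit counter used only for illustration. */
static uint64_t demo_counter;

/* Atomically read the 64-bit value. */
uint64_t demo_load(void)
{
    return __atomic_load_n(&demo_counter, __ATOMIC_SEQ_CST);
}

/* Atomically overwrite the 64-bit value. */
void demo_store(uint64_t value)
{
    __atomic_store_n(&demo_counter, value, __ATOMIC_SEQ_CST);
}

/* Atomically add to the value and return what it held before. */
uint64_t demo_fetch_add(uint64_t delta)
{
    return __atomic_fetch_add(&demo_counter, delta, __ATOMIC_SEQ_CST);
}
```

Compiling a file like this with and without -march=i686 and looking at the undefined symbols in the object should show whether the helper calls are generated on a given toolchain.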
Faced another strange issue: EMX <sys/param.h> defines BSD on OS/2 (mimics the OS/2 Toolkit header's behaviour). This fools LIBICU's <unicode/platform.h>, which starts thinking it's running on a BSD system rather than on OS/2. That results in weird linking errors because UChar becomes unsigned short rather than wchar_t, and this is vital for C++ functions since these types have different codes in name mangling. For example, I got an (undefined) reference to __ZN8ICUUtils24AssignUCharArrayToStringEPtiR18nsAString_internal instead of __ZN8ICUUtils24AssignUCharArrayToStringEPwiR18nsAString_internal (go note the difference!).
As this BSD define is not actually used by EMX headers themselves and is only relevant for the version of the TCP/IP API EMX headers define, I guess it's safe to drop it from there. It may affect some old software using this define to decide on the available TCP/IP functionality but I don't think that we are likely to face such a case in real life. I commented out the entire BSD define block in <sys/param.h> and will leave it as is for now; we will see where it gets us. I will release this change as a new kLIBC RPM if nothing bad happens.
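To illustrate the mechanism, here is a stripped-down sketch of how a stray BSD define can flip a character typedef. All DEMO_ names, the detection order and the value of BSD are assumptions made for this example; the real logic in <unicode/platform.h> is more involved and is not reproduced here.

```c
#include <stddef.h>

/* Stand-in for the define coming from EMX <sys/param.h>; the value is made up. */
#ifndef BSD
#define BSD 1
#endif

/* Stand-in for "we are really building for OS/2" (hypothetical macro). */
#define DEMO_TARGET_IS_OS2 1

/* Hypothetical platform IDs. */
#define DEMO_PF_BSD 1
#define DEMO_PF_OS2 2
#define DEMO_PF_UNKNOWN 0

/* The BSD check comes first, so it wins even on the OS/2 target. */
#if defined(BSD)
#  define DEMO_PLATFORM DEMO_PF_BSD
#elif defined(DEMO_TARGET_IS_OS2)
#  define DEMO_PLATFORM DEMO_PF_OS2
#else
#  define DEMO_PLATFORM DEMO_PF_UNKNOWN
#endif

/* The character type depends on the detected platform. In C++ name
 * mangling the two choices get different codes, which is the 't' versus
 * 'w' difference in the two symbol names quoted above. */
#if DEMO_PLATFORM == DEMO_PF_OS2
typedef wchar_t demo_uchar;
#  define DEMO_UCHAR_IS_WCHAR 1
#else
typedef unsigned short demo_uchar;
#  define DEMO_UCHAR_IS_WCHAR 0
#endif

/* Report which typedef was selected. */
int demo_uchar_is_wchar(void)
{
    return DEMO_UCHAR_IS_WCHAR;
}
```

Removing the BSD define (or checking the OS/2 macro first) makes the sketch pick the wchar_t branch, which is the effect commenting out the block in <sys/param.h> is meant to have on the real headers.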
With eca966a1b515c9990caf206b11eb12c9a3cc3921 XUL.DLL finally builds. Now I need a full rebuild due to header and march changes.
Everything is built. However, I get an assertion at startup:
[5909] ###!!! ABORT: Could not initialize gfxPlatformFontList: file D:/Coding/mozilla/master/gfx/thebes/gfxPlatform.cpp, line 578
There was a significant change, they renamed gfxPangoFonts that we used on OS/2 to gfxFontconfigFonts. I ported it when merging and compiling but apparently some bits are still missing.
What intl parameters are you passing to configure? Here, with --with-intl-api and --with-system-icu my build (SM and FF) dies with
303:04.38 weakld: error: Unresolved symbol (UNDEF) '__ZN8ICUUtils24AssignUCharArrayToStringEPtiR18nsAString_internal'.
303:04.38 weakld: info: The symbol is referenced by:
303:04.38 C:\work\cc45esr\mozilla\obj-fb\netwerk\dns\Unified_cpp_netwerk_dns0.obj
Seems that intl\unicharutil\util\ICUUtils.cpp isn't getting built though objs.mozbuild in the same subdirectory has it under if CONFIG['ENABLE_INTL_API'] and ENABLE_INTL_API seems to be defined.
Dave, looks like you haven't read https://github.com/bitwiseworks/mozilla-os2/issues/188#issuecomment-294180407. Please do.
On 04/16/17 03:35 AM, Dmitriy Kuminov wrote:
Dave, looks like you haven't read #188 (comment) https://github.com/bitwiseworks/mozilla-os2/issues/188#issuecomment-294180407. Please do.
Actually I'd #if 0 that section and then forgot about it :) Now commented out.
New "big" problem: I desperately need the debug build (as it provides a lot of diagnostics and with the debug build I would have known about what exactly is broken in fonts much earlier) but WLINK fails to link XUL.DLL in debug mode with Error! E3009: dynamic memory exhausted. My current machine only has 2GB of RAM, maybe that's the reason. I need to see if the old behaviour can be enabled, when there were many separate DLLs.
You are not running out of memory because you only have 2GB of RAM. You are running out of memory because of the combination of your virtual address limit setting and the size of the debug data. Can you get by with something like -d1 or maybe -d2t?
The best long term solution would be to enhance the build system so that we could control which files get compiled with debug data. While it's true we never know beforehand which exactly debug data we will need, the reality is that we will never need the vast majority of the debug data. I think given our knowledge of what tends to fail in our ports, we could select a subset of debug data that would be good enough to handle the vast majority of failures we see in the field.
Another option is to rework wlink to spill debug data to disk as needed. However, this is a non-trivial task and we may still run up against the OS/2 executable size limits.
On 04/17/17 08:16 AM, Dmitriy Kuminov wrote:
My current machine only has 2GB of RAM, may be that's the reason. I need to see if the old behaviour can be enabled, when there were many separate DLLs.
I've been building with only 2GBs of RAM, just meant a lot of swapping at times. The problem is address space, hopefully splitting out as much as possible from xul will work.
Hey Steven, thanks for the quick reaction. Address space limit. Ok, I see. I will try to play around with VAC. Just recalled that we already faced this issue back then. What is -d1 -d2t? If command line switches then to what? WL.EXE doesn't seem to understand them. Regarding your idea to compile only a subset of sources with debug info enabled. Well, this isn't practically possible with Mozilla due to the complexity of C++ classes, templates and so on. And yes, we never know where it traps upfront. So we need all debug info.
Dave, yes. I recalled now. And no it doesn't swap at all here.
BTW, the link time of XUL.DLL in debug mode is about 25 minutes here. This is the longest link operation I've ever seen. And this is given that all objects and libraries are already in OMF format. If I read it right, it's trying to link about 1000 object files and a hundred libraries. We should split XUL.DLL in pieces in debug mode only because of the build time. It's just ridiculous.
Another possible solution is increasing the size of the unified CPP sources, FILES_PER_UNIFIED_FILE = 8 in various moz.build files. I've seen comments in mozilla.dev.platform that even the 8 limit helps with linking as it works sorta like a prelinker.
We need to update the source base to ESR 45 first and then continue to add printing.