dmik closed this issue 4 years ago
It looks like GCC 4 tries to use MMX/SSE instructions on a static array whose entries are (of course) not 16-byte aligned, as it's a byte array. I need to figure out how to fix that; it's a post-0.1.0 task. For now I will release an "i686" build only (which will in fact use -mtune=pentium, as it always has). Once it's fixed, the builds will use -march=i686 and -march=pentium4, as they are supposed to.
Note that it might also be a compiler bug (or at least something related), see here https://github.com/psmedley/gcc/issues/28.
I've just tried building it with -O2 -march=pentium4 and it seems to work. The resulting DLL is another 200K smaller (it's just 987K now). I'm testing it locally now.
Some more info: -O2 may actually generate MMX/SSE instructions as well, so they will trap as shown below. So it's not a solution. The only solution when using -march=pentium4 is to also completely disable MMX/SSE with -mno-sse. But in this case the code is very similar to -march=i686, so I'm not sure it makes much sense. It seems the only way to fix this properly is to build a newer GCC version.
______________________________________________________________________
Exception Report - created 2019/02/19 12:23:05
______________________________________________________________________
OS2/eCS Version: 2.45
# of Processors: 4
Physical Memory: 3260 mb
Virt Addr Limit: 1536 mb
Exceptq Version: 7.11.3-shl (Jul 5 2016)
______________________________________________________________________
Exception C0000005 - Access Violation
______________________________________________________________________
Process: C:\USR\BIN\FIREFOX.EXE (05/21/2018 19:17:07 51,129)
PID: 58 (88)
TID: 0B (11)
Priority: 200
Filename: C:\USR\LIB\LIBCN0.DLL (02/15/2019 19:02:07 986,622)
Address: 005B:1FD199E5 (0001:000299E5)
Cause: Unknown access fault
______________________________________________________________________
Failing Instruction
______________________________________________________________________
1FD199CF MOV [EBP-0x2024], EAX (8985 dcdfffff)
1FD199D5 LEA EAX, [EBP-0x818] (8d85 e8f7ffff)
1FD199DB MOV [EBP-0x201c], EAX (8985 e4dfffff)
1FD199E1 PXOR XMM0, XMM0 (660fefc0)
1FD199E5 >MOVDQA DQWORD [EBP-0x2048], XMM0 (660f7f85 b8dfffff)
1FD199ED FLD TBYTE [EBP+0x10] (db6d 10)
1FD199F0 FLD ST(0) (d9c0)
1FD199F2 FSTP TBYTE [EBP-0x2048] (dbbd b8dfffff)
______________________________________________________________________
Registers
______________________________________________________________________
EAX : 0368DB3C EBX : 0368E480 ECX : 00000065 EDX : 00000000
ESI : 0368E480 EDI : 0368E395
ESP : 0368C2CC EBP : 0368E354 EIP : 1FD199E5 EFLG : 00010216
CS : 005B CSLIM: FFFFFFFF SS : 0053 SSLIM: FFFFFFFF
EAX : read/write memory on this thread's stack
EBX : read/write memory on this thread's stack
ECX : not a valid address
EDX : not a valid address
ESI : read/write memory on this thread's stack
EDI : read/write memory on this thread's stack
______________________________________________________________________
Stack Info for Thread 0B
______________________________________________________________________
Size Base ESP Max Top
00200000 03690000 -> 0368C2CC -> 0368A000 -> 03490000
______________________________________________________________________
Call Stack
______________________________________________________________________
EBP Address Module Obj:Offset Nearest Public Symbol
-------- --------- -------- ------------- -----------------------
Trap -> 1FD199E5 LIBCN0 0001:000299E5 legacy-dtoa.c#48 ___legacy_dtoa + 49 0001:0002999C (legacy-dtoa.obj)
Offset Name Type Hex Value
────── ──────────────────── ──────────────────────────── ─────────
8 buffer pointer to 8 bit unsigned 368E395
12 p_exp pointer to 32 bit signed 368E390
16 x 80 bit real -0.000000
28 ndigits 32 bit signed 6
32 fmt 32 bit signed 2
36 dig 32 bit signed F
-8200 r__v 0x202 8247C89
-8232 r 0x203 FE07DBAA
-6152 s__v 0x202 64
-8224 s 0x203 E8000000
-4104 m__v 0x202 B
-8216 m 0x203 848BC389
-2056 tmp__v 0x202 2D793458
-8208 tmp 0x203 8508D00
0368E354 1FD3E578 LIBCN0 0001:0004E578 _output.c#718 __output - 3D4 0001:0004E94C (_output.obj)
0368E3C4 1FD3E834 LIBCN0 0001:0004E834 _output.c#796 __output - 118 0001:0004E94C (_output.obj)
0368E434 1FD3F07F LIBCN0 0001:0004F07F _output.c#1109 __output + 733 0001:0004E94C (_output.obj)
Offset Name Type Hex Value
────── ──────────────────── ──────────────────────────── ─────────
8 stream pointer to type 0x214 368E4D8
12 format 8 bit unsigned 368E55C
16 arg_ptr pointer to 8 bit unsigned 368E554
-36 v 0x22C 6
-60 wc 16 bit unsigned DFFA
-60 wc 16 bit unsigned DFFA
-60 buf 0x238 1625DFFA
-60 c 8 bit unsigned FA
-60 buf 0x238 1625DFFA
-60 buf 0x239 1625DFFA
-60 buf 0x23A 1625DFFA
-60 buf 0x210 1625DFFA
0368E4B4 1FD5BF9E LIBCN0 0001:0006BF9E vsnprint.c#32 __std_vsnprintf + CE 0001:0006BED0 (vsnprint.obj)
Offset Name Type Hex Value
────── ──────────────────── ──────────────────────────── ─────────
8 buffer pointer to 8 bit unsigned 368E570
12 n 32 bit unsigned 12C
16 format 8 bit unsigned 368E55C
20 arg_ptr pointer to 8 bit unsigned 368E554
-76 trick 0x203 6000044
0368E524 16075A44 XUL 0001:024A5A44
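As a cross-check of the report above, the faulting MOVDQA stores to [EBP-0x2048], and MOVDQA requires a 16-byte-aligned memory operand. A minimal sketch of the arithmetic, using only the EBP and ESP values from the Registers section (the function names are hypothetical, chosen for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a frame-relative operand such as [EBP-0x2048]. */
static uint32_t frame_operand_address(uint32_t ebp, uint32_t bytes_below_ebp)
{
    return ebp - bytes_below_ebp;
}

/* Bytes by which addr overshoots the previous 16-byte boundary (0 = aligned). */
static uint32_t misalignment16(uint32_t addr)
{
    return addr & 0xFu;
}

/* Values taken from the trap report: EBP = 0368E354, ESP = 0368C2CC. */
static void check_trap_report(void)
{
    uint32_t target = frame_operand_address(0x0368E354u, 0x2048u);

    assert(target == 0x0368C30Cu);              /* MOVDQA destination */
    assert(misalignment16(target) == 12u);      /* not 16-byte aligned */
    assert(misalignment16(0x0368C2CCu) == 12u); /* ESP is not either */
}
```

The store target works out to 0368C30C, which is 12 bytes past a 16-byte boundary, so an aligned SSE store to it faults. This is consistent with a stack frame that is only 4-byte aligned.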
On 02/19/19 09:42 AM, Dmitriy Kuminov wrote:
> But in this case the code is very similar to -march=i686 so I'm not sure it makes much sense.
Well, the NetBurst architecture does prefer a different instruction ordering compared to all the other i386 architectures, to keep the pipeline full.
> The only way to properly fix is to build a newer GCC version it seems.
I've had fairly good results with Paul's build of GCC 5.5.0, though -O3 is still problematic with -march=pentium-m (i686+mmx+sse2). It also seems to have better support for some of the wchar stuff. IIRC, some of the Mozilla developers considered that the GCC 6.x branch optimizes too aggressively and causes problems compared to the GCC 7.x branch, if you are moving beyond the 5.x branch.
Yes, I noticed some different instruction ordering too. The question is still whether it makes any significant difference at runtime. I guess a good test case is needed to answer this question.
Regarding GCC, our plans are to move right to GCC 8. Did you hear anything about this version from the Moz devs?
Haven't heard anything about GCC 8.
I've tried rebuilding LIBC with GCC 9, with -march=pentium4 and without -mno-sse, and I don't see emxomf.exe crashes any more (which isn't a surprise, since GCC 9 uses -mstackrealign by default on OS/2 and the emxomf problems seem to be related to misaligned stack variables). I will give LIBC a run in this mode to see whether there are any other problems in this build.
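For checking a given build, a runtime probe along these lines could confirm that a local variable requested at 16-byte alignment really ends up aligned. This is only a sketch: the function names are hypothetical, it relies on GCC's `aligned` variable attribute, and it is not part of LIBC or emxomf.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 0xFu) == 0;
}

/* Hypothetical probe: request a 16-byte-aligned local and inspect where
 * it actually landed. The address goes through a volatile variable so the
 * check is done on the real value rather than folded away at compile time. */
static int local_buffer_is_aligned(void)
{
    unsigned char buf[32] __attribute__((aligned(16)));
    volatile uintptr_t addr = (uintptr_t)buf;

    buf[0] = 0; /* keep the buffer live */
    return (addr & 0xFu) == 0;
}

/* Self-check that a test program could call at startup. */
static void require_aligned_locals(void)
{
    assert(local_buffer_is_aligned());
}
```

If the compiler does not realign the stack and the caller's stack is only 4-byte aligned, the probe can return 0, which is the situation the trap report points to.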
One unpleasant thing with how GCC fixes system headers (including LIBC ones) is that you have to rebuild GCC each time any of these headers change. Not smart.
I don't see any crashes of EMXOMF with GCC 9, so I'm closing this. Note that it also means that the next LIBC RPM release will be dual-platform like all other RPMs (i686/pentium4).
When doing #24, I discovered that emxomf.exe built with -march=pentium4 crashes on some .o files like this:
The .TRP file is as follows (with irrelevant parts removed):