Closed dmik closed 5 years ago
There was a tiny bug in the code (false negative) that caused the loop to spin forever. Fixed by the above commit. What puzzles me now is how it could ever have worked with GCC3. I've checked the assembly, and it's essentially the same as the one generated by GCC4.
Note that this (previously faulty) code gets executed only when _atexit and/or on_exit run out of the initial room in the callback array (64 entries) and want to register some more. So the only thing that comes to my mind is that for some reason GCC3 builds of LIBC register fewer atexit callbacks than those built with GCC4. Crazy, but who knows. I'll try to check that with logging. I will also give Knut a link to this. Maybe something will pop up in his head.
Okay, now I know exactly what's going on. Compare the GCC3 assembly (faulty code, w/o my fix):
call __hcalloc
movl %eax, %ecx
addl $16, %esp
xorl %eax, %eax
testl %ecx, %ecx
je L19
movl $1, 4(%ecx)
movl %ebx, 12(%ecx)
movl $3, 8(%ecx)
movl ___libc_gAtExitHead, %edx
L33:
movl %edx, (%ecx)
movl (%ecx), %eax
lock; cmpxchgl %ecx, ___libc_gAtExitHead
setz %al
movzx %al, %eax
testl %eax, %eax
jne L33
and the GCC4 assembly (same faulty code):
call __hcalloc
movl %eax, %edx
testl %eax, %eax
je L23
movl $1, 4(%eax)
movl %ebx, 12(%eax)
movl $3, 8(%eax)
L10:
movl ___libc_gAtExitHead, %eax
movl %eax, (%edx)
movl (%edx), %eax
lock; cmpxchgl %edx, ___libc_gAtExitHead
setz %al
movzx %al, %eax
testl %eax, %eax
jne L10
GCC3 doesn't reload ___libc_gAtExitHead in the loop because it (mistakenly) thinks its value never changes within the loop. As a result, the loop exits on the second iteration: __atomic_cmpxchg32 returns false because the new value of ___libc_gAtExitHead doesn't match the initial one (stored in EDX).
GCC4, however, properly guesses that ___libc_gAtExitHead might be changed because it is involved in assembly marked with volatile, and reloads it at the beginning of each iteration. Since on the second iteration the reloaded value always matches what was set on the previous iteration, __atomic_cmpxchg32 always returns true and the loop never ends. Hence the hang at startup.
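To make the two behaviours easier to compare, here is a rough single-threaded C sketch of the loop as each compiler built it. It is not the LIBC source: the node type, the cmpxchg helper (a plain stand-in for __atomic_cmpxchg32) and the max_iter cap are all made up for illustration, and the cap only exists so that the "endless" variant can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; the real LIBC chunk layout is not shown here. */
typedef struct node { struct node *next; } node;

/* Plain, non-atomic stand-in for __atomic_cmpxchg32 (assumed semantics):
 * store new_val into *target if *target == expected, return true on success. */
static bool cmpxchg(node **target, node *new_val, node *expected)
{
    if (*target != expected)
        return false;
    *target = new_val;
    return true;
}

/* Faulty loop (repeats on SUCCESS) the way GCC3 built it:
 * the head is read once, before the loop. Returns the iteration count. */
static int push_faulty_hoisted(node **head, node *n, int max_iter)
{
    assert(max_iter > 0);
    node *snapshot = *head;          /* hoisted load, never refreshed */
    int i = 0;
    bool ok;
    do {
        n->next = snapshot;
        ok = cmpxchg(head, n, n->next);
        i++;
    } while (ok && i < max_iter);    /* inverted condition */
    return i;
}

/* The same faulty loop the way GCC4 built it: the head is reloaded
 * on every iteration, so every compare succeeds after the first one. */
static int push_faulty_reloaded(node **head, node *n, int max_iter)
{
    assert(max_iter > 0);
    int i = 0;
    bool ok;
    do {
        n->next = *head;             /* reload; from iteration 2 on this is n itself */
        ok = cmpxchg(head, n, n->next);
        i++;
    } while (ok && i < max_iter);    /* inverted condition */
    return i;
}

/* Fixed loop: retry only when the exchange FAILED. */
static int push_fixed(node **head, node *n, int max_iter)
{
    assert(max_iter > 0);
    int i = 0;
    bool ok;
    do {
        n->next = *head;
        ok = cmpxchg(head, n, n->next);
        i++;
    } while (!ok && i < max_iter);
    return i;
}
```

In this sketch the hoisted variant stops after two iterations with the node linked correctly, the reloaded variant runs until the cap and leaves the node pointing at itself, and the fixed variant finishes in one iteration.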
So it's a really weird combination of a program bug and a compiler bug that canceled each other out, so everything accidentally worked. Once the compiler bug went away, it broke. Cool, I like such things.
Anyway, case closed. The fix makes it work with GCC4. As for GCC3, this fix actually creates the opposite potential problem: if the thread is not able to set ___libc_gAtExitHead on the first attempt because some other thread was faster, it will hang forever. This never happened in the past because it's a very rare case: atexit handlers are usually installed at startup when there are not many threads.
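For illustration only, the GCC3 scenario described above can be sketched with C11 atomics. This is not what LIBC uses (it has its own __atomic_cmpxchg32); the node type, the function names, the stale parameter (standing for the head value read before another thread got in first) and the max_iter cap are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
typedef struct node { struct node *next; } node;

/* Fixed loop (retry on failure) with the head read hoisted out of the loop,
 * as GCC3 would do. The compare always uses the same stale value, so once
 * the first attempt fails, every later attempt fails too. */
static int push_fixed_hoisted(_Atomic(node *) *head, node *n, node *stale, int max_iter)
{
    assert(max_iter > 0);
    int i = 0;
    bool ok;
    do {
        node *expected = stale;      /* never refreshed from *head */
        n->next = expected;
        ok = atomic_compare_exchange_strong(head, &expected, n);
        i++;
    } while (!ok && i < max_iter);
    return i;
}

/* C11 variant: a failed compare-exchange writes the current head into
 * `expected`, so the next attempt works with a fresh value. */
static int push_c11(_Atomic(node *) *head, node *n, node *stale, int max_iter)
{
    assert(max_iter > 0);
    node *expected = stale;
    int i = 0;
    bool ok;
    do {
        n->next = expected;
        ok = atomic_compare_exchange_strong(head, &expected, n);
        i++;
    } while (!ok && i < max_iter);
    return i;
}
```

With a head that another thread has already moved on from the stale value, the hoisted variant in this sketch spins until the cap without ever linking the node, while the C11 variant succeeds on its second attempt.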
BTW, JFTR, I measured: a Qt4 application installs ~90 atexit callbacks (i.e. more than the initial room for 64) regardless of GCC3 or GCC4. Regular applications install fewer than 64. And this explains why only Qt4 apps would hang with GCC4 builds of LIBC prior to the fix.
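A quick way to exercise the same path without Qt4 might be a helper like the one below, which registers more callbacks than the initial 64 slots. It is only a sketch: the names count_exit and register_many are invented, and it relies on the standard atexit, which the C standard only requires to accept at least 32 registrations, so 90 succeeding depends on the C library.

```c
#include <assert.h>
#include <stdlib.h>

static int g_calls;                  /* bumped by each callback at process exit */

static void count_exit(void)
{
    g_calls++;
}

/* Try to register the same callback n times.
 * Returns how many registrations atexit accepted (it returns 0 on success). */
static int register_many(int n)
{
    assert(n >= 0);
    int accepted = 0;
    for (int i = 0; i < n; i++) {
        if (atexit(count_exit) == 0)
            accepted++;
    }
    return accepted;
}
```

Calling register_many(90) early in main should go past the 64 initial entries mentioned above, so a LIBC build with the faulty loop would be expected to hang somewhere in the middle of that call.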
From https://github.com/bitwiseworks/libc/issues/4#issuecomment-456932815: