Open hauke opened 3 years ago
That sounds like dlsym() failed for some reason? I wonder if you could rebuild with the COMPILER_RT_DEBUG cmake option?
I am using the sanitizers with GCC, not with LLVM. OpenWrt, which I am using, currently supports only GCC and not LLVM. What would be the equivalent option for GCC? I can also add some other debug output to GCC.
Which GCC version supports ASan for MIPS64?
I'm seeing a similar issue on Ubuntu 22.04 with the gcc-12 and clang-17 packages when using ASan on multilib (32-bit builds): sem_timedwait never wakes up. 32-bit builds without ASan are fine, and building natively for x86_64 with ASan also works.
A couple of traces showing a mix of old and new:
frame #0: 0xf7fc5129 [vdso]`__kernel_vsyscall + 9
frame #1: 0xf7a23366 libc.so.6`__libc_do_syscall at libc-do-syscall.S:41
frame #2: 0xf7990e81 libc.so.6`__futex_abstimed_wait_common at futex-internal.c:40:12
frame #3: 0xf7990e40 libc.so.6`__futex_abstimed_wait_common(futex_word=0xf6a043a4, expected=1, clockid=<unavailable>, abstime=0xf72e8cbc, private=0, cancel=true) at futex-internal.c:99:11
frame #4: 0xf7990fff libc.so.6`__GI___futex_abstimed_wait_cancelable64(futex_word=<unavailable>, expected=<unavailable>, clockid=<unavailable>, abstime=<no summary available>, private=<no summary available>) at futex-internal.c:139:10 [artificial]
frame #5: 0xf799d031 libc.so.6`do_futex_wait(sem=<unavailable>, abstime=<unavailable>, clockid=<unavailable>) at sem_waitcommon.c:116:9
frame #6: 0xf799d0d9 libc.so.6`__new_sem_wait_slow64(sem=0xf6a043a4, abstime=0xf72e8cbc, clockid=<unavailable>) at sem_waitcommon.c:284:14
frame #7: 0xf799d213 libc.so.6`___sem_timedwait [inlined] ___sem_timedwait64(abstime=0xf72e8cbc, sem=0xf6a043a4) at sem_timedwait.c:40:12
frame #8: 0xf799d208 libc.so.6`___sem_timedwait [inlined] ___sem_timedwait64(abstime=0xf72e8cbc, sem=0xf6a043a4) at sem_timedwait.c:26:1
frame #9: 0xf799d208 libc.so.6`___sem_timedwait(sem=0xf6a043a4, abstime=0xf7230890) at sem_timedwait.c:55:10
frame #10: 0x56aee191 ut_orca_mt`__interceptor_sem_timedwait + 145
and:
* frame #0: 0xf7fc5129 [vdso]`__kernel_vsyscall + 9
frame #1: 0xf7a8249c libc.so.6`__old_sem_wait(sem=0xf6a04054) at sem_wait.c:65:13
frame #2: 0x56aee0b2 ut_orca_mt`__interceptor_sem_wait + 50
See also this other report on SO: https://stackoverflow.com/questions/75005217/linux-32bit-compiled-sem-timedwait-example-with-small-mod-fails-on-64-bit-wh
When I use the address sanitizer from GCC 10.2.0 with glibc 2.33 on MIPS32 BE on the example application from the sem_wait man page (https://man7.org/linux/man-pages/man3/sem_wait.3.html), it does not work correctly.
The glibc for MIPS32 BE exports both an old and a new version of the sem_post() API call, but only the new implementation of the sem_timedwait() call.
On x86_64 only one version is exported.
This is normally not a problem, because my application uses the new versions when I link it against a recent glibc. Without the address sanitizer it works fine. When I link it with the address sanitizer, the sanitizer intercepts the calls and calls the old implementation of sem_post(), because there are two versions, and the new implementation of sem_timedwait(), because there is no other.
When I run this application linked with the address sanitizer under strace, I see that sem_timedwait() uses the FUTEX_..._PRIVATE (private, not shared) call, while the sem_wait() call is done with the normal FUTEX (shared, not private) call. Mixing the two is not supported by the Linux kernel, because the private/shared distinction is part of the key used to find the threads that have to be woken up.
In this simple application the waiting thread does wake up, though not because of the futex wake but because of the SIGCHLD from the dying child process. When I run this without the address sanitizer, both system calls use the FUTEX_*_PRIVATE call.
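To illustrate the private/shared mismatch described above, here is a minimal sketch. It is not taken from glibc or libsanitizer; the names futex(), waiter() and run_demo() are made up for illustration, and it assumes Linux with the raw SYS_futex syscall available. One thread sleeps with FUTEX_WAIT_PRIVATE. Shared FUTEX_WAKE calls on the same word should find no waiter, while FUTEX_WAKE_PRIVATE should.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static uint32_t futex_word = 0;      /* the waiter sleeps while this is 0 */
static atomic_int waiter_done = 0;   /* set once the waiter has returned */

/* Thin wrapper around the raw syscall; glibc exports no futex() function. */
static long futex(uint32_t *addr, int op, uint32_t val,
                  const struct timespec *timeout)
{
    return syscall(SYS_futex, addr, op, val, timeout, NULL, 0);
}

static void *waiter(void *arg)
{
    (void)arg;
    /* Relative timeout so the demo cannot hang if no wake ever matches. */
    struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
    futex(&futex_word, FUTEX_WAIT_PRIVATE, 0, &timeout);
    atomic_store(&waiter_done, 1);
    return NULL;
}

/*
 * Counts how many waiters the shared wakes and the private wakes report.
 * The shared wakes are issued first, for roughly 200 ms, then private
 * wakes are retried until one succeeds or the waiter times out.
 */
static void run_demo(long *shared_woken, long *private_woken)
{
    pthread_t thread;
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

    *shared_woken = 0;
    *private_woken = 0;
    atomic_store(&waiter_done, 0);

    int rc = pthread_create(&thread, NULL, waiter, NULL);
    assert(rc == 0);
    (void)rc;

    for (int i = 0; i < 200; i++) {
        /* Shared wake: computes a different futex key than the waiter. */
        *shared_woken += futex(&futex_word, FUTEX_WAKE, 1, NULL);
        nanosleep(&pause, NULL);
    }

    while (!atomic_load(&waiter_done) && *private_woken == 0) {
        /* Private wake: same key as the FUTEX_WAIT_PRIVATE above. */
        *private_woken += futex(&futex_word, FUTEX_WAKE_PRIVATE, 1, NULL);
        nanosleep(&pause, NULL);
    }

    pthread_join(thread, NULL);
}
```

Calling run_demo() from a small main() and printing the two counters should show the shared wakes reporting no woken thread. That matches the strace observation: a waiter parked by the new sem_timedwait() is invisible to a wake issued through the old, shared code path.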
This is more or less the same problem as reported by @arichardson in #1371, just for a different glibc API function. The sanitizer uses the wrong symbol version for the sem_* functions. This often more or less works when all of them are from the same version, but when some are new (sem_timedwait()) and some are old (sem_post()) this causes problems because they interpret the data structure differently.
I tried the patches from #1371 and added this in addition:
But this didn't work for me; the application fails like this:
When it wants to call the original sem_init(), it jumps to address 0x0. In gdb I only see one call to dlvsym(), for the pthread_create function. With my patch I would expect to see this call for the sem_* functions as well.
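For reference, the difference between an unversioned and a versioned lookup, and why a failed lookup ends in a jump to 0x0, can be sketched as follows. This is only an illustration under assumptions, not the libsanitizer interception code: resolve_symbol() is a hypothetical helper, and the version strings are supplied by the caller because they differ per architecture and glibc release. It assumes glibc, since dlvsym() is a GNU extension.

```c
#define _GNU_SOURCE   /* for dlvsym() and RTLD_NEXT */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: try each candidate symbol version in order with
 * dlvsym(), then fall back to an unversioned dlsym(). Returns NULL if
 * nothing resolves, so the caller must check the result before calling
 * through it. An unchecked NULL here is a call to address 0x0.
 */
static void *resolve_symbol(const char *name, const char *const *versions)
{
    assert(name != NULL);

    for (size_t i = 0; versions != NULL && versions[i] != NULL; i++) {
        void *addr = dlvsym(RTLD_NEXT, name, versions[i]);
        if (addr != NULL)
            return addr;   /* found the explicitly requested version */
    }

    /* Unversioned lookup: the dynamic linker picks the version itself. */
    return dlsym(RTLD_NEXT, name);
}
```

A caller would pass the versions it prefers, newest first, for example a NULL-terminated array such as {"GLIBC_2.34", "GLIBC_2.2", NULL} (example strings only; the right values have to be checked for each target, e.g. with readelf on the target's libc), and refuse to install the interceptor if the result is NULL.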
I want libsanitizer to use the recent version of this API by default. Could someone fix this, or tell me what could be wrong with my change?