sem_* interception broken with mips glibc versioning

hauke commented 3 years ago

When I build the example application from the sem_wait man page (https://man7.org/linux/man-pages/man3/sem_wait.3.html) with the address sanitizer from GCC 10.2.0 and glibc 2.33 on MIPS32 BE, it does not work correctly.
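For readers who want a compact test case, here is a minimal sketch of a sem_timedwait()/sem_post() round trip. It is not the man page program: it posts from a second thread instead of a signal handler, and the names `poster` and `sem_roundtrip` as well as the 100 ms delay and 5 s deadline are assumptions chosen for illustration.

```c
#define _GNU_SOURCE   /* makes usleep() and the POSIX declarations visible */
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <time.h>
#include <unistd.h>

static sem_t sem;

/* Poster thread: give the main thread time to block, then post once. */
static void *poster(void *arg)
{
    (void)arg;
    usleep(100 * 1000);   /* 100 ms, arbitrary */
    sem_post(&sem);
    return NULL;
}

/* Hypothetical helper. Returns 0 if sem_timedwait() was woken by sem_post(),
 * the errno value it failed with otherwise (ETIMEDOUT if the wake-up never
 * arrived), or -1 if the setup failed. */
static int sem_roundtrip(void)
{
    pthread_t thread;
    struct timespec deadline;
    int rc, err = 0;

    if (sem_init(&sem, 0, 0) != 0)
        return -1;
    if (pthread_create(&thread, NULL, poster, NULL) != 0) {
        sem_destroy(&sem);
        return -1;
    }

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += 5;

    do {
        rc = sem_timedwait(&sem, &deadline);
    } while (rc == -1 && errno == EINTR);   /* retry after a signal */
    if (rc == -1)
        err = errno;

    pthread_join(thread, NULL);
    sem_destroy(&sem);
    return err;
}
```

Calling `sem_roundtrip()` from `main()` and printing the result, once built with `-fsanitize=address` and once without, should make a difference between the two builds visible if the wake-up is lost.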

The glibc for MIPS32 BE exports an old and a new version of the sem_post() API call, but only the new implementation of the sem_timedwait() call.

$ nm ../openwrt/build_dir/toolchain-mips_24kc_gcc-10.2.0_glibc/glibc-2.33-final/nptl/libpthread.so | grep sem_post
00013b00 t __new_sem_post
00013be0 t __old_sem_post
00013be0 T sem_post@GLIBC_2.0
00013b00 T sem_post@@GLIBC_2.2

$ nm ../openwrt/build_dir/toolchain-mips_24kc_gcc-10.2.0_glibc/glibc-2.33-final/nptl/libpthread.so | grep sem_timedwait
000134bc t __GI___sem_timedwait64
000135a0 t __sem_timedwait
000135a0 W sem_timedwait
000134bc t __sem_timedwait64

On x86_64 only one version is exported.

This is normally not a problem, because my application uses the new versions when I link it against a recent glibc; without the address sanitizer it works fine. When I link it with the address sanitizer, the sanitizer intercepts the calls and forwards sem_post() to the old implementation, because two versions exist, and sem_timedwait() to the new one, because there is no other.
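To see which implementation a runtime lookup returns, the unversioned and the versioned lookup can be compared directly. The following is a rough sketch, assuming glibc (dlvsym() and RTLD_NEXT are GNU extensions); `lookup_versioned` and `report_symbol` are hypothetical helper names, and the version strings to try depend on the architecture.

```c
#define _GNU_SOURCE   /* for dlvsym() and RTLD_NEXT (glibc extensions) */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: ask for one specific symbol version; if that
 * version is not exported, fall back to an unversioned lookup. */
static void *lookup_versioned(const char *name, const char *version)
{
    void *sym = dlvsym(RTLD_NEXT, name, version);
    return sym != NULL ? sym : dlsym(RTLD_NEXT, name);
}

/* Print what the unversioned and the versioned lookup return for a symbol.
 * A NULL result means that lookup did not find the symbol. */
static void report_symbol(const char *name, const char *version)
{
    printf("%s: dlsym=%p dlvsym(%s)=%p\n",
           name,
           dlsym(RTLD_NEXT, name),
           version,
           dlvsym(RTLD_NEXT, name, version));
}
```

With the toolchain above, calling `report_symbol("sem_post", "GLIBC_2.0")` and `report_symbol("sem_post", "GLIBC_2.2")` would show which of the two exported versions the plain dlsym() result matches. On older glibc releases the program may need to be linked with `-ldl`.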

When I run this application linked with the address sanitizer under strace, I see that sem_timedwait() uses the FUTEX_..._PRIVATE (private, not shared) operations, while the wake-up from sem_post() is done with a plain FUTEX_WAKE (shared, not private). Mixing the two is not supported by the Linux kernel, because the private/shared flag is part of the key that is used to find the threads which have to be woken up.

$ strace /semtest-mips-asan  1 10
....
futex_time64(0x411b80, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {tv_sec=1614724290, tv_nsec=757333762}, FUTEX_BITSET_MATCH_ANY) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
....
futex(0x411b80, FUTEX_WAKE, 1)          = 0
....

In this simple application the waiting thread does get woken, not by the futex wake but by the SIGCHLD from the dying child process. When I run this without the address sanitizer, both system calls use the FUTEX_*_PRIVATE operations.
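The private/shared mismatch can be reproduced without glibc's semaphore code at all. The sketch below uses raw futex system calls and assumes 64-bit Linux, where `struct timespec` matches what `SYS_futex` expects; `futex_op`, `waiter` and `demo_mismatch` are hypothetical names, and the sleep intervals are arbitrary.

```c
#define _GNU_SOURCE   /* for syscall() and usleep() */
#include <linux/futex.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static uint32_t futex_word = 1;   /* the waiter sleeps while this stays 1 */

/* Thin wrapper around the raw futex system call. */
static long futex_op(uint32_t *addr, int op, uint32_t val,
                     const struct timespec *timeout)
{
    return syscall(SYS_futex, addr, op, val, timeout, NULL, 0);
}

/* Waiter: block with the PRIVATE flavour, as the new sem_timedwait() does.
 * The 5 s relative timeout is only a safety net against hanging forever. */
static void *waiter(void *arg)
{
    struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
    (void)arg;
    futex_op(&futex_word, FUTEX_WAIT_PRIVATE, 1, &timeout);
    return NULL;
}

/* Counts how many waiters each wake flavour reports as woken.
 * Returns 0 on success, -1 if the waiter thread could not be started. */
static int demo_mismatch(long *woken_shared, long *woken_private)
{
    pthread_t thread;

    *woken_shared = 0;
    *woken_private = 0;
    if (pthread_create(&thread, NULL, waiter, NULL) != 0)
        return -1;

    /* Shared wake, as in the strace output above: tried repeatedly while
     * the waiter has plenty of time to block. */
    for (int i = 0; i < 20; i++) {
        usleep(10 * 1000);
        *woken_shared += futex_op(&futex_word, FUTEX_WAKE, 1, NULL);
    }

    /* Private wake: retry until it reports a woken waiter. */
    for (int i = 0; i < 100 && *woken_private == 0; i++) {
        *woken_private += futex_op(&futex_word, FUTEX_WAKE_PRIVATE, 1, NULL);
        if (*woken_private == 0)
            usleep(10 * 1000);
    }

    pthread_join(thread, NULL);
    return 0;
}
```

If the kernel treats the flag as part of the futex key, as described above, `demo_mismatch()` should leave `*woken_shared` at 0 and set `*woken_private` to 1.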

This is more or less the same problem as reported by @arichardson in #1371, just for a different glibc API function. The sanitizer uses the wrong symbol version for the sem_* functions. This often more or less works when all of them are from the same version, but when some are new (sem_timedwait()) and some are old (sem_post()), it causes problems because they interpret the data structure differently.

I tried the patches from #1371 and added this in addition:

--- a/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
+++ b/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc
@@ -6164,13 +6164,13 @@ INTERCEPTOR(int, sem_getvalue, __sanitizer_sem_t *s, int *sval) {
   return res;
 }
 #define INIT_SEM                                                               \
-  COMMON_INTERCEPT_FUNCTION(sem_init);                                         \
-  COMMON_INTERCEPT_FUNCTION(sem_destroy);                                      \
-  COMMON_INTERCEPT_FUNCTION(sem_wait);                                         \
-  COMMON_INTERCEPT_FUNCTION(sem_trywait);                                      \
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_init, "GLIBC_2.2");              \
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_destroy, "GLIBC_2.2");           \
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_wait, "GLIBC_2.2");              \
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_trywait, "GLIBC_2.2");           \
   COMMON_INTERCEPT_FUNCTION(sem_timedwait);                                    \
-  COMMON_INTERCEPT_FUNCTION(sem_post);                                         \
-  COMMON_INTERCEPT_FUNCTION(sem_getvalue);
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_post, "GLIBC_2.2");              \
+  COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(sem_getvalue, "GLIBC_2.2");          \
 #else
 #define INIT_SEM
 #endif // SANITIZER_INTERCEPT_SEM

But this didn't work for me; the application fails like this:

$ ./semtest-mips-asan 1 10
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2547==ERROR: AddressSanitizer: SEGV on unknown address 0x00000000 (pc 0x779294ec bp 0x7fc1dd80 sp 0x772a3ca0 T0)
==2547==The signal is caused by a READ memory access.
==2547==Hint: address points to the zero page.
    #0 0x779294ec  (/lib/libasan.so.6+0xbd4ec)
    #1 0x77913e4c  (/lib/libasan.so.6+0xa7e4c)
    #2 0x7ff9b958  (linux-vdso.so.1+0x958)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/libasan.so.6+0xbd4ec)
==2547==ABORTING

When it wants to call the original sem_init(), it jumps to address 0x0. In gdb I only see one call to dlvsym(), for the pthread_create function; with my patch I would expect to see this call for the sem_* functions as well.

I want libsanitizer to use the most recent version of this API by default. Could someone fix this, or tell me what could be wrong with my change?

arichardson commented 3 years ago

That sounds like dlsym() failed for some reason? I wonder if you could rebuild with the COMPILER_RT_DEBUG cmake option?

hauke commented 3 years ago

I am using the sanitizers with GCC, not with LLVM. OpenWrt, which I am using, currently only supports GCC and not LLVM. What would be the equivalent option for GCC? I can also add some other debug output to GCC.

TXTT2016 commented 1 year ago

Which GCC version supports asan for MIPS64?

GeekOffTheStreet commented 11 months ago

I'm seeing a similar issue on Ubuntu 22.04 with the gcc-12 and clang-17 packages when using asan on multilib (32-bit) builds: sem_timedwait never wakes up. 32-bit builds without asan are fine, and building natively for x86_64 with asan also works.

A couple of traces showing the mix of old and new:

    frame #0: 0xf7fc5129 [vdso]`__kernel_vsyscall + 9
    frame #1: 0xf7a23366 libc.so.6`__libc_do_syscall at libc-do-syscall.S:41
    frame #2: 0xf7990e81 libc.so.6`__futex_abstimed_wait_common at futex-internal.c:40:12
    frame #3: 0xf7990e40 libc.so.6`__futex_abstimed_wait_common(futex_word=0xf6a043a4, expected=1, clockid=<unavailable>, abstime=0xf72e8cbc, private=0, cancel=true) at futex-internal.c:99:11
    frame #4: 0xf7990fff libc.so.6`__GI___futex_abstimed_wait_cancelable64(futex_word=<unavailable>, expected=<unavailable>, clockid=<unavailable>, abstime=<no summary available>, private=<no summary available>) at futex-internal.c:139:10 [artificial]
    frame #5: 0xf799d031 libc.so.6`do_futex_wait(sem=<unavailable>, abstime=<unavailable>, clockid=<unavailable>) at sem_waitcommon.c:116:9
    frame #6: 0xf799d0d9 libc.so.6`__new_sem_wait_slow64(sem=0xf6a043a4, abstime=0xf72e8cbc, clockid=<unavailable>) at sem_waitcommon.c:284:14
    frame #7: 0xf799d213 libc.so.6`___sem_timedwait [inlined] ___sem_timedwait64(abstime=0xf72e8cbc, sem=0xf6a043a4) at sem_timedwait.c:40:12
    frame #8: 0xf799d208 libc.so.6`___sem_timedwait [inlined] ___sem_timedwait64(abstime=0xf72e8cbc, sem=0xf6a043a4) at sem_timedwait.c:26:1
    frame #9: 0xf799d208 libc.so.6`___sem_timedwait(sem=0xf6a043a4, abstime=0xf7230890) at sem_timedwait.c:55:10
    frame #10: 0x56aee191 ut_orca_mt`__interceptor_sem_timedwait + 145

and:

  * frame #0: 0xf7fc5129 [vdso]`__kernel_vsyscall + 9
    frame #1: 0xf7a8249c libc.so.6`__old_sem_wait(sem=0xf6a04054) at sem_wait.c:65:13
    frame #2: 0x56aee0b2 ut_orca_mt`__interceptor_sem_wait + 50

google/sanitizers issue #1380