theRockLiu / thread-sanitizer

Automatically exported from code.google.com/p/thread-sanitizer

TSAN does not handle siglongjmp(3) jumping out of signal handler #75

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
short story:

TSAN crashes on my program, which involves a synchronous SIGSEGV handler and 
siglongjmp() jumping out of it, with the following message:

    FATAL: ThreadSanitizer CHECK failed: .../tsan_interceptors.cc:1644 "((thr->in_signal_handler)) == ((false))" (0x1, 0x0)

details below:

What steps will reproduce the problem?

1. Compile attached tsansig.c with -fsanitize=thread, e.g.
    $ gcc -g -Wall -fsanitize=thread -pie -fPIC tsansig.c
    $ clang -g -Wall -fsanitize=thread tsansig.c

2. Run the program
    $ ./a.out

What is the expected output? What do you see instead?

The expected output (which I do get without any sanitizer, and with ASAN) is:

    ((int *)NULL)[0] = 0 faulted ok
    ((int *)NULL)[1] = 1 faulted ok
    ((int *)NULL)[3] = 1 faulted ok

with TSAN I get:

# gcc
((int *)NULL)[0] = 0 faulted ok
FATAL: ThreadSanitizer CHECK failed: ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:1644 "((thr->in_signal_handler)) == ((false))" (0x1, 0x0)
    #0 <null> <null>:0 (libtsan.so.0+0x000000064c1c)
    #1 <null> <null>:0 (libtsan.so.0+0x000000064cf2)
    #2 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) <null>:0 (libtsan.so.0+0x000000069323)
    #3 <null> <null>:0 (libtsan.so.0+0x0000000330f2)
    #4 <null> <null>:0 (libc.so.6+0x0000000350ef)
    #5 __tsan_write4 <null>:0 (libtsan.so.0+0x00000002ec92)
    #6 main /home/kirr/tmp/trashme/tsansig.c:61 (a.out+0x000000000e5a)
    #7 __libc_start_main <null>:0 (libc.so.6+0x000000021b44)
    #8 <null> <null>:0 (a.out+0x000000000b38)
    #9 <null> <null>:0 (0x000000000000)

# clang
((int *)NULL)[0] = 0 faulted ok
FATAL: ThreadSanitizer CHECK failed: /tmp/buildd/llvm-toolchain-3.4-3.4.2/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1644 "((thr->in_signal_handler)) == ((false))" (0x1, 0x0)
    #0 __tsan::PrintCurrentStackSlow() ??:0 (exe+0x00000009f9df)
    #1 __tsan::TsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ??:0 (exe+0x00000009f9b3)
    #2 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) ??:0 (exe+0x0000000386a1)
    #3 rtl_sighandler(int) tsan_interceptors.o:0 (exe+0x000000055712)
    #4 _L_unlock_13 ??:0 (libpthread.so.0+0x00000000f8cf)
    #5 __tsan_write4 ??:0 (exe+0x00000009bb47)
    #6 main /home/kirr/tmp/trashme/tsansig.c:61 (exe+0x0000000a5539)
    #7 __libc_start_main /home/aurel32/glibc/glibc-2.19/csu/libc-start.c:287 (libc.so.6+0x000000021b44)
    #8 _start ??:0 (exe+0x0000000a51bc)
    #9 <null> <null>:0 (0x000000000000)

What version of the product are you using? On what operating system?

OS is Debian GNU/Linux testing on x86_64 with current gcc and clang:

$ uname -a
Linux teco 3.14-2-amd64 #1 SMP Debian 3.14.15-2 (2014-08-09) x86_64 GNU/Linux

$ gcc --version
gcc (Debian 4.9.1-11) 4.9.1

$ clang --version
Debian clang version 3.4.2-8 (tags/RELEASE_34/dot2-final) (based on LLVM 3.4.2)
Target: x86_64-pc-linux-gnu
Thread model: posix

Please provide any additional information below.

The problem, it seems, is that the siglongjmp interceptor does not reset 
thr->in_signal_handler to a state that could be saved by the sigsetjmp 
interceptor.

As a result, rtl_generic_sighandler() first increments thr->in_signal_handler, 
then calls the handler, and only afterwards decrements thr->in_signal_handler. 
If we jump out of the signal handler, the decrement never runs and 
thr->in_signal_handler stays incremented.

On the next entry to the TSAN signal handler, that check fires.
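
To make the imbalance concrete, here is a small model of the bookkeeping using plain setjmp()/longjmp() and no real signals. Everything in it is hypothetical (`struct thread_state`, `runtime_sighandler`, `counter_after_escape`); it is not TSAN's code, and the `restore_on_jump` branch only illustrates the save-and-restore idea suggested above, not necessarily the fix that ends up in the runtime.

```c
#include <setjmp.h>

/* Hypothetical stand-in for the per-thread state; not the real struct. */
struct thread_state {
    int in_signal_handler;
};

static struct thread_state thr;
static jmp_buf env;

/* Models a user handler that escapes with longjmp() instead of returning. */
static void user_handler(void) {
    longjmp(env, 1);
}

/* Models the runtime wrapper: increment, call the handler, decrement. */
static void runtime_sighandler(void) {
    thr.in_signal_handler++;
    user_handler();
    thr.in_signal_handler--;   /* skipped when the handler jumps out */
}

/* Returns the counter value after one "fault" that is recovered by
 * jumping out of the handler. A nonzero restore_on_jump models saving
 * the counter at setjmp() time and restoring it after the jump. */
int counter_after_escape(int restore_on_jump) {
    volatile int saved;

    thr.in_signal_handler = 0;
    saved = thr.in_signal_handler;      /* state at setjmp() time */

    if (setjmp(env) == 0) {
        runtime_sighandler();           /* does not return normally */
    } else if (restore_on_jump) {
        thr.in_signal_handler = saved;  /* undo the stale increment */
    }
    return thr.in_signal_handler;
}
```

`counter_after_escape(0)` leaves the counter at 1, which matches the `(0x1, 0x0)` in the CHECK message; `counter_after_escape(1)` brings it back to 0.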

I can't say for sure whether git rev 4e992b94 (tsan: restructure signal 
handling to allow recursive handling, dvyukov, 2014-09-02) for compiler-rt 
fixed it, but it seems not, because it deals with an orthogonal issue and the 
LongJmp() code has not been changed.

Thanks beforehand,
Kirill

Original issue reported on code.google.com by kirill.s...@gmail.com on 10 Sep 2014 at 10:57

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks for reporting the issue.
I will test on tip and fix it if it is still broken.

Original comment by dvyu...@google.com on 10 Sep 2014 at 8:09

GoogleCodeExporter commented 9 years ago
Revision 217908 fixes this. I've added your program as a test case for tsan.

Original comment by dvyu...@google.com on 16 Sep 2014 at 9:58

GoogleCodeExporter commented 9 years ago
Thanks. I've rebuilt libtsan.so from today's compiler-rt and confirm that the 
test program now works (with my gcc-4.9).

Please note that the same approach should be taken with respect to 
setcontext()/getcontext() when/if they are intercepted.
Also, please find attached a minor patch that corrects the reference to this 
issue in the test.

Thanks again,
Kirill

Original comment by kirill.s...@gmail.com on 17 Sep 2014 at 8:40

Attachments:

GoogleCodeExporter commented 9 years ago
Issue id is fixed in rev 217992.
setcontext()/getcontext() can be messy to support; I would prefer to delay it 
until there is a real need.

Original comment by dvyu...@google.com on 17 Sep 2014 at 11:08

GoogleCodeExporter commented 9 years ago
Thanks.
I agree about setcontext() support.

Original comment by kirill.s...@gmail.com on 18 Sep 2014 at 7:40