Closed bszente closed 2 years ago
This is an issue with musl, not with libbacktrace. For this operation libbacktrace relies on the _Unwind_Backtrace function provided by the compiler support library. When using GCC and (probably) LLVM, that function is able to unwind through a signal handler when using glibc. It appears that it is not able to unwind through a signal handler when using musl.
The relevant code on x86_64 is https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/config/i386/linux-unwind.h;h=6170a773f5f6602bdcd97c407ac4cf225b9b705c;hb=HEAD#l47 . It looks for a specific instruction sequence to recognize the signal handler. That is the exact instruction sequence used by glibc. Perhaps musl uses a different instruction sequence. If so, perhaps it could be changed so that this works.
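As a rough illustration of what "looking for a specific instruction sequence" means, the sketch below compares the bytes at a return address against the x86_64 encoding of `mov $0xf,%rax; syscall` (the rt_sigreturn trampoline). The byte values are my reading of that encoding and should be checked against the linked header; the function name looks_like_rt_sigreturn is hypothetical and this is not the libgcc implementation.

```c
#include <stddef.h>
#include <string.h>

/* Assumed encoding of: mov $0xf,%rax ; syscall  (0xf = rt_sigreturn on x86_64) */
static const unsigned char rt_sigreturn_seq[] = {
    0x48, 0xc7, 0xc0, 0x0f, 0x00, 0x00, 0x00,  /* mov $0xf,%rax */
    0x0f, 0x05                                 /* syscall       */
};

/* Returns 1 if the `avail` readable bytes at `pc` start with the
   trampoline sequence, 0 otherwise. */
int looks_like_rt_sigreturn(const unsigned char *pc, size_t avail)
{
    if (avail < sizeof rt_sigreturn_seq)
        return 0;
    return memcmp(pc, rt_sigreturn_seq, sizeof rt_sigreturn_seq) == 0;
}
```

An unwinder that finds this pattern at the return address can treat the frame as a signal frame and recover the interrupted registers from the saved context instead of from ordinary call frame information.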
There is no way to pass the ucontext to backtrace_simple, sorry.
Thank you very much for the useful hint!
Just a follow-up. I managed to obtain the callstack with musl as well. It was much simpler than I expected.
musl has exactly the same signal return trampoline as GLIBC or uClibc. The issue comes from the following line in libgcc/config/i386/linux-unwind.h:
#if defined __GLIBC__ && !(__GLIBC__ == 2 && __GLIBC_MINOR__ == 0)
The signal frame decoding code path is enabled only for GLIBC. uClibc-ng works because it defines the above macros, pretending to be GLIBC.
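The effect of that guard can be checked at compile time. The sketch below repeats the same preprocessor condition in a small function (the name signal_frame_path_enabled is hypothetical), so it reports whether a toolchain's libc headers would take the guarded branch. It only mirrors the condition quoted above and says nothing about how a given libgcc was actually built.

```c
/* Including any libc header makes __GLIBC__ visible where the libc defines it
   (glibc, and per this thread uClibc-ng). */
#include <stdlib.h>

/* Returns 1 when the quoted libgcc condition is true for the current
   libc headers, 0 otherwise (for example with musl headers). */
int signal_frame_path_enabled(void)
{
#if defined __GLIBC__ && !(__GLIBC__ == 2 && __GLIBC_MINOR__ == 0)
    return 1;
#else
    return 0;
#endif
}
```

Compiling this with both toolchains is a quick way to confirm which side of the #if each one lands on.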
musl does not define any __MUSL__ macro, so it is not possible to enable this code path conditionally. For this reason, I personally removed the above #if line from libgcc for my use case, to force the signal frame unwinding to work for musl as well:
Run the program:
$ ./test
Inside main
Inside function1
Inside function2
Inside do_invalid_access
SIGSEGV: addr=0 code=1
ret=0
Decode the addresses:
$ x86_64-buildroot-linux-musl-addr2line -aipfC -e ./test ./test | grep ^# | cut -d ' ' -f2
0x00000000004011d2: crash_handler at /home/user/libbacktrace-test/test.c:27
0x000000000040655d: sigemptyset at /home/user/build-x86_64-2021.02.2/build/musl-1.2.2/src/signal/x86_64/restore.s:1
0x000000000040127f: do_invalid_access at /home/user/libbacktrace-test/test.c:48
0x00000000004012a4: function2 at /home/user/libbacktrace-test/test.c:53
0x00000000004012ba: function1 at /home/user/libbacktrace-test/test.c:58
0x0000000000401387: main at /home/user/libbacktrace-test/test.c:68
0x0000000000404a2e: libc_start_main_stage2 at /home/user/build-x86_64-2021.02.2/build/musl-1.2.2/src/env/__libc_start_main.c:94
0x0000000000401044: _start at ??:?
@ianlancetaylor thank you again for the link to the relevant code part.
Please consider the following test code:
For the sake of example, please ignore that printf is not safe to be called from a signal handler.
Executing the following steps:
Compile and statically link the test.c file with a musl-based toolchain built using Buildroot 2021.02.2, with debug symbols enabled:
$ x86_64-buildroot-linux-musl-gcc -static test.c -o test -g2 -lbacktrace
Run the program:
$ ./test
Inside main
Inside function1
Inside function2
Inside do_invalid_access
SIGSEGV: addr=0 code=1
0x4011d2
0x4062a9
ret=0
Decode the addresses:
$ x86_64-buildroot-linux-musl-addr2line -aipfC -e ./test 0x4011d2 0x4062a9
0x00000000004011d2: crash_handler at /home/user/libbacktrace-test/test.c:27
0x00000000004062a9: sigemptyset at /home/user/build-x86_64-2021.02.2/build/musl-1.2.2/src/signal/x86_64/restore.s:1
As can be seen, the backtrace stops in crash_handler. There are no addresses above the signal frame.
The very same binary in GDB has the following call stack at that point:
Inside main
Inside function1
Inside function2
Inside do_invalid_access
Program received signal SIGSEGV, Segmentation fault.
0x000000000040127f in do_invalid_access (v=0x0) at test.c:48
48          v = v + 1;
(gdb) cont
Continuing.
SIGSEGV: addr=0 code=1
Breakpoint 1, crash_handler (sig=11, info=0x7fffffffd130, ucontext=0x7fffffffd000) at test.c:27
27          ret = backtrace_simple(bstate, 0, bt_cb_simple, NULL, NULL);
(gdb) bt
#0  crash_handler (sig=11, info=0x7fffffffd130, ucontext=0x7fffffffd000) at test.c:27
#1  <signal handler called>
#2  0x000000000040127f in do_invalid_access (v=0x0) at test.c:48
#3  0x00000000004012a5 in function2 () at test.c:53
#4  0x00000000004012bb in function1 () at test.c:58
#5  0x000000000040131e in main (argc=1, argv=0x7fffffffd728) at test.c:68
On the other hand, when the test application is compiled with GLIBC, the backtrace_simple call works properly:
Compile and link fully static with GLIBC:
$ gcc -static test.c -o test -g2 -lbacktrace
Run the binary:
$ ./test
Inside main
Inside function1
Inside function2
Inside do_invalid_access
SIGSEGV: addr=(nil) code=1
0x40184b
0x40e05f
0x40192d
0x401957
0x401972
0x4019da
0x408a47
0x4016c9
ret=0
Decode the addresses:
$ addr2line -aipfC -e ./test 0x40184b 0x40e05f 0x40192d 0x401957 0x401972 0x4019da 0x408a47 0x4016c9
0x000000000040184b: crash_handler at /home/user/libbacktrace-test/test.c:27
0x000000000040e05f: gsignal at ??:?
0x000000000040192d: do_invalid_access at /home/user/libbacktrace-test/test.c:48
0x0000000000401957: function2 at /home/user/libbacktrace-test/test.c:53
0x0000000000401972: function1 at /home/user/libbacktrace-test/test.c:58
0x00000000004019da: main at /home/user/libbacktrace-test/test.c:68
0x0000000000408a47: __libc_start_main at /var/tmp/portage/sys-libs/glibc-2.33-r7/work/glibc-2.33/csu/../csu/libc-start.c:332
0x00000000004016c9: _start at ??:?
Questions:
Is there a way to pass the ucontext to backtrace_simple somehow so it would unwind directly on the received context? I'm interested in this even if it is not a portable solution.
Thank you!