bombela / backward-cpp

A beautiful stack trace pretty printer for C++
MIT License
3.81k stars 479 forks source link

Undefined behavior in signal handler? #297

Open linusmartensson opened 1 year ago

linusmartensson commented 1 year ago

Hi @bombela and maintainers in backward-cpp!

I'm working with an issue in a bit of a complicated setup where I'm trying to handle error printing in a multithreaded app using backward-cpp. This errors triggers here: https://github.com/bombela/backward-cpp/blob/dc8b8c76822dcbc8918f032171050ad90e11eb7f/backward.hpp#L4298

Specfically, it looks like this at the tip of the stack:

#11   Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7fffdbe4251f, in 
#10   Source "../lib/backward.hpp", line 4299, in sig_handler [0x5555557118cf]
       4296:             handleSignal(signo, info, _ctx);
       4297: 
       4298:             // try to forward the signal.
      >4299:             raise(info->si_signo);
       4300: 
       4301:             // terminate the process immediately.
       4302:             puts("watf? exit");
#9    Source "../sysdeps/posix/raise.c", line 26, in raise [0x7fffdbe42475]
#8  | Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
    | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
      Source "./nptl/pthread_kill.c", line 43, in __pthread_kill [0x7fffdbe96a7b]
#7    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7fffdbe4251f, in 
#6    Source "../lib/backward.hpp", line 4299, in sig_handler [0x5555557118cf]
       4296:             handleSignal(signo, info, _ctx);
       4297: 
       4298:             // try to forward the signal.
      >4299:             raise(info->si_signo);
       4300: 
       4301:             // terminate the process immediately.
       4302:             puts("watf? exit");
#5    Source "../sysdeps/posix/raise.c", line 26, in raise [0x7fffdbe42475]
#4  | Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
    | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
      Source "./nptl/pthread_kill.c", line 43, in __pthread_kill [0x7fffdbe96a7b]
#3    Object "/usr/lib/x86_64-linux-gnu/libc.so.6", at 0x7fffdbe4251f, in 
#2    Source "../lib/backward.hpp", line 4299, in sig_handler [0x5555557118cf]
       4296:             handleSignal(signo, info, _ctx);
       4297: 
       4298:             // try to forward the signal.
      >4299:             raise(info->si_signo);
       4300: 
       4301:             // terminate the process immediately.
       4302:             puts("watf? exit");
#1    Source "../sysdeps/posix/raise.c", line 26, in raise [0x7fffdbe42475]
#0  | Source "./nptl/pthread_kill.c", line 89, in __pthread_kill_internal
    | Source "./nptl/pthread_kill.c", line 78, in __pthread_kill_implementation
      Source "./nptl/pthread_kill.c", line 44, in __pthread_kill [0x7fffdbe96a7c]

si_signo in this specific case is SIGSEGV, and I'm expecting the program to terminate. Instead it's infinitely recursing in a signal handler, where it looks like a segfault is occurring in __pthread_kill_internal, as a result of the exception being raised from within the exception handler. While this is SIGSEGV, I'm pretty sure the same behavior occurs with other signals. The original exception source is a background thread: Thread 26 "[core] bkg" received signal SIGSEGV, Segmentation fault.

The initial occurrence of SIGSEGV is correct behavior, as I am committing an access violation to trigger this behavior. The issue at hand is the infinite loop occurring after it (and I've seen it in other circumstances before.)

Looking at cppreference.com (https://en.cppreference.com/w/c/program/signal), I'm finding this quote: If the signal handler is called as a result of [abort] or [raise], the behavior is undefined if the signal handler calls [raise].

Is it viable to remove this raise() call, or should there be some special consideration when the exception is occurring off the main thread?

linusmartensson commented 1 year ago

Actually, this might be that another library, our embedded javascript engine, is causing the issue by registering a secondary signal handler and calling it when the engine doesn't have an active context. I'm a bit uncertain - but the issue almost looks like there's some kind of pingpong going on between a valid and invalid error handler:

    at /lib/x86_64-linux-gnu/libc.so.6
#12 __pthread_kill_implementation (no_tid=0, signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:44
#13 __pthread_kill_internal (signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:78
#14 __GI___pthread_kill (threadid=140733503758336, signo=signo@entry=11)
    at ./nptl/pthread_kill.c:89
#15 0x00007fffdbe42476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#16 0x00005555557118d0 in backward::SignalHandling::sig_handler(int, siginfo_t*, void*) (signo=<optimized out>, info=0x7fff127d1c30, _ctx=<optimized out>)
    at ../lib/backward.hpp:4300
#17 0x00007fffdbe42520 in <signal handler called> ()
    at /lib/x86_64-linux-gnu/libc.so.6
#18 __pthread_kill_implementation (no_tid=0, signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:44
#19 __pthread_kill_internal (signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:78
#20 __GI___pthread_kill (threadid=140733503758336, signo=signo@entry=11)
    at ./nptl/pthread_kill.c:89
#21 0x00007fffdbe42476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#22 0x00005555557118d0 in backward::SignalHandling::sig_handler(int, siginfo_t*, void*) (signo=<optimized out>, info=0x7fff127d29f0, _ctx=<optimized out>)
    at ../lib/backward.hpp:4300
#23 0x00007fffdbe42520 in <signal handler called> ()
    at /lib/x86_64-linux-gnu/libc.so.6
#24 __pthread_kill_implementation (no_tid=0, signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:44
#25 __pthread_kill_internal (signo=11, threadid=140733503758336)
    at ./nptl/pthread_kill.c:78
#26 __GI___pthread_kill (threadid=140733503758336, signo=signo@entry=11)
    at ./nptl/pthread_kill.c:89
#27 0x00007fffdbe42476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#28 0x00005555558d3435 in ...::call_raise_sigsegv() (this=0x555557292ca0)
    at ../src/...

Would it be worthwhile (or even possible considering exception handling rules) to detect when sig_handler is already in the stack and if so not do this? :smile:

bombela commented 1 year ago

The ping pong is likely because you do have multiple signal handlers. raise works only once I think? You can subclass and override the behavior. Or just write your own signal handler class of course. This one is merely a reasonable default to start with.

It would be possible to modify the code to not raise on SIGABRT. Then, not calling raise recursively is much harder, if not downright impossible.

The reason for calling raise btw, is so that if you have your own signal handler, you could recover.

For example, consider a segfault, in a thread, and you want to catch it and do something about it. backward-cpp will print a trace, then your very own handler can recover the situation.

Otherwise, SignalHandling would terminate your application without your control. With that said, again, it is mostly a default/example to get started, and you shouldn't be afraid of writing your own or subclassing it.

linusmartensson commented 1 year ago

Great input, thanks! I'll set up a subclass and solve it that way. 😁

Den tis 16 maj 2023 07:09François-Xavier Bourlet @.***> skrev:

The ping pong is likely because you do have multiple signal handlers. raise works only once I think? You can subclass and override the behavior. Or just write your own signal handler class of course. This one is merely a reasonable default to start with.

It would be possible to modify the code to not raise on SIGABRT. Then, not calling raise recursively is much harder, if not downright impossible.

The reason for calling raise btw, is so that if you have your own signal handler, you could recover.

For example, consider a segfault, in a thread, and you want to catch it and do something about it. backward-cpp will print a trace, then your very own handler can recover the situation.

Otherwise, SignalHandling would terminate your application without your control. With that said, again, it is mostly a default/example to get started, and you shouldn't be afraid of writing your own or subclassing it.

— Reply to this email directly, view it on GitHub https://github.com/bombela/backward-cpp/issues/297#issuecomment-1548999032, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADOMSLN7F3SVHANESTWPK3XGMDY5ANCNFSM6AAAAAAYCLKIZY . You are receiving this because you authored the thread.Message ID: @.***>