Crash protection prevents host crashes, causing application to freeze

danielytics commented 3 years ago

I encountered a situation where the host crashed, but the cr crash handler caught it, causing the host to simply sit there doing nothing. From the outside, it looked like it had simply frozen.

If I enable trace logging, then I see that the signal handler is called in an infinite loop.

This is on Ubuntu Linux using Clang 11.0.0-2.

Is there a way to avoid this?

fungos commented 3 years ago

It would probably require checking which module crashed and, if the crash is not in a cr module, letting it crash normally, but I think that can easily get involved. Do you have the stack trace from when it is frozen?
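
As a rough illustration of the "let it crash normally" idea, here is a minimal sketch that is not taken from cr. Instead of validating the faulting module, it uses a simpler stand-in: a flag that is set only while guest code is running. All names (`in_guest`, `run_guest`, `on_fault`, `host_fault_signal`) are hypothetical, and faults are simulated with `raise(SIGSEGV)` so the example stays well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static sigjmp_buf recover_point;            /* valid only while in_guest == 1 */
static volatile sig_atomic_t in_guest = 0;  /* 1 only while guest code runs */

static void on_fault(int sig) {
    if (in_guest) {
        in_guest = 0;
        siglongjmp(recover_point, 1);       /* guest fault: roll back */
    }
    /* Host fault: restore the default action and re-raise. The signal is
       blocked while this handler runs, so it is delivered on return and
       terminates the process as an unhandled SIGSEGV would. */
    signal(sig, SIG_DFL);
    raise(sig);
}

static void install_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Runs guest code under protection: 0 on success, -1 if it faulted. */
static int run_guest(void (*guest)(void)) {
    if (sigsetjmp(recover_point, 1) != 0) {
        return -1;                          /* arrived here via siglongjmp */
    }
    in_guest = 1;
    guest();
    in_guest = 0;                           /* disarm before returning */
    return 0;
}

static void faulty_guest(void) { raise(SIGSEGV); }  /* simulated guest crash */
static void healthy_guest(void) {}

/* Forks a child that faults in host code (outside run_guest). Returns the
   signal that killed the child, 0 if it exited normally, -1 on error. */
static int host_fault_signal(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        install_handler();
        raise(SIGSEGV);                     /* in_guest == 0: no recovery */
        _exit(0);                           /* reached only if swallowed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With this arrangement, `run_guest(faulty_guest)` returns -1 and the process keeps running, while a fault raised outside `run_guest` ends the process with SIGSEGV. A flag cannot tell which shared object actually faulted, so it is only an approximation of real module validation.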

danielytics commented 3 years ago

I'm not sure how useful it is, but when it first segfaults, the stack trace is:

cr_signal_handler(int sig, siginfo_t * si, void * uap) (/path-to-cr/cr.h:1563)
libpthread.so.0!<signal handler called> (Unknown Source:0)
...my code here...

Then cr_signal_handler calls siglongjmp back into cr_plugin_main, which returns -1. At this point, cr's logs are:

CR_TRACE: cr_signal_handler
1 FAILURE: 11 (CR: 1)

After cr_plugin_main returns, execution continues from "somewhere". I don't quite understand why it resumes in that particular place: it's deep in standard library code, in logic related to a mutex my code uses, and it is not where my code originally segfaulted. That code then segfaults again at some point. cr_signal_handler is called again and jumps into cr_plugin_main again, and at this point ctx seems to be invalid, because this line now segfaults:

ctx.version = ctx.last_working_version;

And now we're in an infinite cycle of cr_signal_handler -> cr_plugin_main -> ctx.version = ctx.last_working_version; -> segfault.
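
This cycle is consistent with the handler jumping into a cr_plugin_main frame that has already returned (POSIX leaves siglongjmp into a returned function undefined), though the thread does not confirm that this is the cause. Independent of the cause, one way to keep such a loop from running forever is to give the handler a fault budget. The sketch below is not cr's code; `MAX_FAULTS`, `on_fault_limited` and `limited_fault_signal` are hypothetical names, faults are simulated with `raise(SIGSEGV)`, and "recovery" is reduced to returning from the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MAX_FAULTS = 3 };                      /* hypothetical budget */
static volatile sig_atomic_t fault_count = 0;

static void on_fault_limited(int sig) {
    if (++fault_count >= MAX_FAULTS) {
        /* Budget exhausted: stop recovering and let the default action
           terminate the process once this handler returns. */
        signal(sig, SIG_DFL);
        raise(sig);
        return;
    }
    /* A real handler would attempt recovery here (cr uses siglongjmp).
       Returning is enough for a fault simulated with raise(). */
}

/* Forks a child that faults `attempts` times under the limited handler.
   Returns the signal that killed the child, 0 if it exited normally,
   -1 on error. */
static int limited_fault_signal(int attempts) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_fault_limited;
        sigemptyset(&sa.sa_mask);
        int rc = sigaction(SIGSEGV, &sa, NULL);
        assert(rc == 0);
        (void)rc;
        for (int i = 0; i < attempts; ++i) {
            raise(SIGSEGV);                   /* simulated repeated crash */
        }
        _exit(0);                             /* budget never exhausted */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Here a child that faults ten times is terminated by SIGSEGV on its third fault, while a child that faults only twice exits normally. A real integration would also need to reset the counter after a successful update, which this sketch leaves out.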

fungos / cr, issue #61