Open umanwizard opened 4 years ago
Yes. I have mentioned this in the README (oops, it seems it's not clear enough).
Unfortunately, there is no 100% robust stack tracing method. The gperftools project has done some related research. pprof-rs uses backtrace-rs, which ultimately uses the libunwind provided by libgcc.
WARN: as described in earlier gperftools documentation, the libunwind provided by libgcc is not signal safe:
libgcc's unwind method is not safe to use from signal handlers. One particular cause of deadlock is when profiling tick happens when the program is propagating thrown exception.
If the signal arrives while the program is already getting a backtrace through libgcc (for sampling, profiling, error handling...), the result is hard to predict (sometimes it crashes directly). A possible solution (in my imagination :smile_cat: ) is to scan for the address range of libgcc. In the signal handler, we can then check whether the interrupted context (register rip) is inside libgcc. If it is, pprof-rs can skip this sample. But as I am busy with other projects, I have no time to try this method these days :disappointed: .
It's also not 100% perfect: libgcc's unwinder can call into other libraries, so it's hard to tell whether the current context is somewhere inside an unwind call without getting a backtrace.
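For illustration only, the address-range check suggested above could look roughly like the sketch below. It is not something pprof-rs implements. It assumes a Linux-style `/proc/self/maps` text as input, and `AddrRange`, `executable_ranges_of` and `should_skip_sample` are hypothetical names made up for this example. How the handler reads rip from the signal context is platform-specific and not shown.

```rust
/// Half-open address range `[start, end)` of one executable mapping.
/// (Hypothetical type, used only in this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AddrRange {
    pub start: usize,
    pub end: usize,
}

/// Collect the executable mappings whose path contains `lib_name`
/// (e.g. "libgcc_s") from text in `/proc/<pid>/maps` format.
/// This allocates, so it must run before the handler is installed,
/// never inside the signal handler itself.
pub fn executable_ranges_of(maps: &str, lib_name: &str) -> Vec<AddrRange> {
    let mut ranges = Vec::new();
    for line in maps.lines() {
        let mut fields = line.split_whitespace();
        let (range, perms) = match (fields.next(), fields.next()) {
            (Some(r), Some(p)) => (r, p),
            _ => continue,
        };
        // Skip offset, device and inode; the next field is the path.
        // Anonymous mappings have no path and are ignored.
        let path = match fields.nth(3) {
            Some(p) => p,
            None => continue,
        };
        if !perms.contains('x') || !path.contains(lib_name) {
            continue;
        }
        let (start, end) = match range.split_once('-') {
            Some(pair) => pair,
            None => continue,
        };
        if let (Ok(start), Ok(end)) = (
            usize::from_str_radix(start, 16),
            usize::from_str_radix(end, 16),
        ) {
            ranges.push(AddrRange { start, end });
        }
    }
    ranges
}

/// Returns true when the interrupted program counter lies inside one of
/// the precomputed ranges, meaning the sample should be dropped.
/// Only reads a slice, so it does no allocation or locking.
pub fn should_skip_sample(pc: usize, ranges: &[AddrRange]) -> bool {
    ranges.iter().any(|r| r.start <= pc && pc < r.end)
}
```

As noted above, this only catches the case where the interrupted instruction itself is in libgcc. If the unwinder has called into another library when the signal arrives, the check passes and the sample is still taken.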
Thank you for the detailed response. I think the best solution is just to turn off anything else that might be getting a backtrace (e.g. jemalloc) while using pprof-rs.
In `perf_signal_handler`, `backtrace::trace_unsynchronized` is called. This will not produce any bugs if the user is just using pprof-rs, since a lock is taken, so the main body of `perf_signal_handler` cannot be executed more than once at a time.

However, if the user is calling `backtrace::trace` from any other part of the code at the same time, this will result in UB.

I suspect (but I'm not sure) that this is why we are seeing deadlocks in https://github.com/MaterializeInc/materialize when using both jemalloc heap profiling and pprof-rs profiling at the same time.
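To make the locking argument concrete, here is a minimal std-only sketch of a try-lock guard around an unwind call. It is an illustration under assumptions, not pprof-rs's actual code: `UNWINDING`, `ResetOnDrop` and `with_unwind_guard` are hypothetical names, and the closure stands in for the real call to the unwinder.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical flag: true while a guarded caller is unwinding.
static UNWINDING: AtomicBool = AtomicBool::new(false);

/// Clears the flag when dropped, so it is released even if `unwind` panics.
struct ResetOnDrop;

impl Drop for ResetOnDrop {
    fn drop(&mut self) {
        UNWINDING.store(false, Ordering::Release);
    }
}

/// Run `unwind` only if no other guarded caller is currently unwinding.
/// Returns false (the sample is skipped) when the guard is already held.
/// The guard never blocks; it uses a single atomic compare-and-swap.
pub fn with_unwind_guard<F: FnOnce()>(unwind: F) -> bool {
    if UNWINDING
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        return false;
    }
    let _reset = ResetOnDrop;
    unwind();
    true
}
```

A guard like this only excludes callers that go through it. Code that unwinds via `backtrace::trace`, or via libgcc directly as a heap profiler might, does not see this flag, which is the gap the issue describes.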