Closed hrishikeshSuresh closed 5 years ago
Thanks! I'll take a look.
One thing I can tell right now:
print_backtrace_emergency() is called from a signal handler (e.g. when we receive SIGSEGV). This means that this function must be signal-safe.
You can read about signal-safety here: http://man7.org/linux/man-pages/man7/signal-safety.7.html
In particular, a signal-safe function is allowed to call only other signal-safe functions. malloc and fprintf, and all other functions that work with FILE* or perform allocations, are not signal-safe, so we can't use them in print_backtrace_emergency().
Thus, print_backtrace_emergency() is intended to be a simplified version of print_backtrace() that may have limited functionality but can be called from a signal handler. For example, our implementation of print_backtrace_emergency() for glibc does not perform demangling or formatting at all; it just calls backtrace_symbols_fd(), which is signal-safe.
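To make the glibc case concrete, here is a minimal sketch of such an emergency printer. It is not the actual roc implementation: the function name print_backtrace_emergency_fd() and the frame limit are hypothetical, and it assumes glibc's <execinfo.h> is available. Note that the backtrace(3) manual page warns that backtrace() may load libgcc dynamically on first use, which can allocate, so a real implementation may want to call it once at startup.

```cpp
#include <execinfo.h>  // backtrace(), backtrace_symbols_fd() (glibc-specific)
#include <unistd.h>    // write(), STDERR_FILENO

// Hypothetical helper (not from the roc sources): dumps the current call
// stack to the file descriptor fd and returns the number of captured frames.
// It uses no FILE* functions and performs no explicit allocations.
int print_backtrace_emergency_fd(int fd) {
    enum { MaxFrames = 64 };  // assumed limit, chosen for illustration
    void* frames[MaxFrames];

    const int n_frames = backtrace(frames, MaxFrames);
    if (n_frames <= 0) {
        static const char msg[] = "no backtrace available\n";
        ssize_t ret = write(fd, msg, sizeof(msg) - 1);
        (void)ret;  // nothing useful can be done on failure here
        return 0;
    }

    // Writes one line per frame directly to fd, without demangling.
    backtrace_symbols_fd(frames, n_frames, fd);
    return n_frames;
}
```

A signal handler would then call print_backtrace_emergency_fd(STDERR_FILENO).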
The libunwind documentation (https://www.nongnu.org/libunwind/man/libunwind(3).html#section_5) says:
The manual page for each libunwind routine identifies whether or not it is signal-safe, but as a general rule, any routine that may be needed for local unwinding is signal-safe (e.g., unw_step() for local unwinding is signal-safe). For remote-unwinding, none of the libunwind routines are guaranteed to be signal-safe.
So for libunwind, we should:
check the manual page for every libunwind function, and ensure that we perform only local unwinding and call only its signal-safe functions
in our print_backtrace_emergency() implementation, also avoid calling any standard library functions that are not signal-safe, e.g. fprintf() and malloc()
On bionic, we should also find a signal-safe way to print a backtrace, or we can leave print_backtrace_emergency() unimplemented for now.
I couldn't add the optional dependency on libunwind.h, so can any hints/ideas be given as to how this has to be done?
This should be done in the SConstruct file. Basically you can just duplicate what we're doing for sox and adjust it for libunwind, specifically:
add --disable-libunwind (like --disable-sox)
automatically append target_libunwind to ROC_TARGETS if we use musl and if --disable-libunwind is not specified (like we do for target_sox)
check whether libunwind is available on the system (like we do in the if 'target_sox' in system_dependecies: branch for sox)
download and build libunwind if it's present in the --build-3rdparty option (like we do in the if 'target_sox' in download_dependencies: branch for sox)
add a libunwind branch to our 3rdparty.py script
If you have any trouble, feel free to ask here or in the IRC chat, and I'll help you with this.
If you don't find a good way to print a backtrace in a signal-safe manner on Bionic, there is another option. If libunwind works well on Android (could you check it please?), we can use it on Bionic too.
In this case we can have three backtrace implementations:
target_glibc - selected if we're using glibc; prints backtrace using glibc functions
target_libunwind - selected if we're using musl or bionic and libunwind was not disabled by user; prints backtrace using libunwind functions
target_nobacktrace - selected if we're using musl or bionic and libunwind was disabled by user; no-op implementation
/link #242
I am printing the backtrace in two different ways for each backtrace.cpp files. Which one should I stick to?
So the answer to this question is that we need both versions: print_backtrace() is the full-featured version that can perform allocations, demangling, use fprintf, etc.; and print_backtrace_emergency() is a limited version that can be called from a signal handler.
Since print_backtrace_emergency() can't use fprintf(), it should write to the stderr file descriptor directly. Wrapping the descriptor with fdopen() doesn't make sense either, because fdopen() is not signal-safe.
I made print_backtrace_emergency() signal-safe by using write() instead of fprintf() in target_musl/roc_core/backtrace.cpp. The demangling operation for print_backtrace() was missed in the earlier commit for musl, so that has been added now.
Thanks, I've reviewed the libunwind version. Apart from the comments above, it looks good.
What are your plans for the bionic version?
BTW, if you want, we could split the PR into two parts: one PR for libunwind and another PR for bionic.
I was checking whether libunwind can be used on Android, and this file says that the Android libunwind API is compatible with the non-gnu libunwind API. So can we use libunwind with bionic, and go ahead with three implementations (target_glibc, target_libunwind, target_nobacktrace) as you mentioned earlier?
Interesting. I've googled about this version of libunwind a bit.
I've found this repo: https://github.com/alexeikh/android-ndk-backtrace-test
Here is a summary.
We can use libunwind from Android NDK to print backtrace from a signal handler. But the code will be architecture-specific.
We can use _Unwind_Backtrace() from Bionic to print backtrace from a signal handler as well. The code will be architecture-specific too.
We can use both libunwind and _Unwind_Backtrace() in an architecture-independent manner, but in this case it will not be possible to use it from a signal handler (we will not see the pre-signal stack).
Adding architecture-specific code and maintaining different versions for different CPUs for such a minor feature would be overkill. So we're throwing away the first two options.
Then, Android NDK ships with pre-compiled libunwind.a (not sure if it's present on all architectures though), but it doesn't provide libunwind headers. So we will need to ship them manually if we want to use libunwind. The problem with this approach is that these headers may be specific to the NDK version, so I'd prefer to avoid this.
Thus, it seems that the best option is to implement print_backtrace() using _Unwind_Backtrace() and to leave print_backtrace_emergency() empty.
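For reference, collecting frames with _Unwind_Backtrace() could look roughly like the sketch below. It is an illustration and not the code from this PR: the BacktraceState struct and the unwind_callback() and capture_backtrace() names are hypothetical. It only relies on _Unwind_Backtrace() and _Unwind_GetIP() from <unwind.h>, and, as noted above, it is meant for print_backtrace() and not for use inside a signal handler.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <unwind.h>  // _Unwind_Backtrace(), _Unwind_GetIP()

namespace {

// Hypothetical state passed to the unwinder callback.
struct BacktraceState {
    uintptr_t* frames;  // output array for instruction pointers
    size_t capacity;    // size of the output array
    size_t count;       // number of frames stored so far
};

// Invoked by the unwinder once per stack frame.
_Unwind_Reason_Code unwind_callback(struct _Unwind_Context* context, void* arg) {
    BacktraceState* state = static_cast<BacktraceState*>(arg);
    if (state->count == state->capacity) {
        return _URC_END_OF_STACK;  // output array is full, stop unwinding
    }
    const uintptr_t ip = _Unwind_GetIP(context);
    if (ip != 0) {
        state->frames[state->count++] = ip;
    }
    return _URC_NO_REASON;  // continue with the next frame
}

} // namespace

// Hypothetical helper: fills frames with up to capacity instruction
// pointers of the current call stack and returns how many were stored.
size_t capture_backtrace(uintptr_t* frames, size_t capacity) {
    BacktraceState state = { frames, capacity, 0 };
    _Unwind_Backtrace(unwind_callback, &state);
    return state.count;
}
```

The captured addresses can then be resolved to symbol names (for example with dladdr()) and demangled before printing.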
But then I've found this page: https://source.android.com/devices/tech/debug
It says that if an Android application crashes, you can find its backtrace in the tombstone file. We should check whether the backtrace is present there in case of roc_panic(). If it is, we can avoid implementing backtrace on Android at all and just rely on this feature. Unfortunately, I can't test it right now because my Android phone is broken :)
So if you can test it, please do it, and we will see if we need backtrace on Android. If we need it, we should use _Unwind_Backtrace(). If we don't need it, we can leave the bionic version empty.
If you can't test it, you can leave the bionic version as is (empty) until someone has time for it.
On the other hand, after some thought, it seems that it would be handy to see the backtrace of a panic in stderr (even if it's also present in the tombstone), because the rest of the panic message is printed to stderr too, and the backtrace is actually a part of this message. And since we already have the code to print it (in this PR), I think we can keep it.
So to sum up, I think we need to do the following:
As a result, we will have four backtrace implementations:
I've added libunwind to our alpine environment for CI: https://github.com/roc-project/dockerfiles/commit/4394200fb3a9f83b98a8d522d87fe42b7ac45104
(alpine image is the only one that uses musl and so the only one where we should enable libunwind by default)
In this commit, I haven't replaced write() with print_emergency_message(), which I will do later, but I think everything else is done for libunwind. Also, I will start working on the SConstruct file.
About checking the backtrace for Android, I can't test it right now. Maybe I can do it later, and if we need to implement some code, we'll open a separate PR for that. For now, print_backtrace_emergency() is unimplemented for bionic.
Great. One more round :)
OK, so the remaining issues are:
In the SConstruct file, I am not sure what exactly should be done in the part where we have to build libunwind (the if 'target_libunwind' in download_dependencies branch), because for target_sox we check for a lot of dependent libraries, while libunwind has no such dependencies.
Please re-target the pull request to the develop branch (see https://roc-project.github.io/roc/docs/development/version_control.html#pull-requests)
Problem Summary: Implement backtrace printing for non-glibc targets (bionic and musl)
Solution: In ./src/modules/roc_core/target_bionic/roc_core/backtrace.cpp, I have implemented captureBacktrace() to find the size of the backtrace stack. In captureBacktrace(), we call _Unwind_Backtrace() to perform stack unwinding through unwind data, using the unwindCallback() function. dumpBacktrace() first checks whether a backtrace is available and then prints each frame in the format "#index: address_of_the_cursor function_name". Here, the demangled name is printed if the demangling operation succeeds. dumpBacktrace_fd() is the same as dumpBacktrace() but prints the output to the file passed as a parameter.
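The "print the demangled name if demangling succeeds" step can be sketched as follows. This is an illustrative helper and not the code from the PR: the name demangle_symbol() is hypothetical, and it assumes a toolchain that provides abi::__cxa_demangle() in <cxxabi.h> (GCC and Clang do). Because it allocates, it is only suitable for print_backtrace(), not for print_backtrace_emergency().

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle()
#include <cstdlib>   // free()
#include <string>

// Hypothetical helper: returns the demangled form of a C++ symbol name,
// or the original name unchanged if demangling fails (e.g. for C symbols).
std::string demangle_symbol(const char* mangled) {
    int status = 0;
    // With a NULL output buffer, __cxa_demangle() allocates the result
    // with malloc(), so the caller has to free() it.
    char* demangled = abi::__cxa_demangle(mangled, NULL, NULL, &status);
    if (status != 0 || demangled == NULL) {
        return mangled;
    }
    std::string result(demangled);
    free(demangled);
    return result;
}
```

For instance, demangle_symbol("_Z3foov") yields "foo()", while a plain C name such as "main" is returned as is.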
In ./src/modules/roc_core/target_musl/roc_core/backtrace.cpp, backtrace() finds the size of the backtrace stack by iterating to the end of the stack. backtrace_symbols() does the same thing but also prints each frame to stderr in the format "#index : function_name address_of_cursor offset instruction_pointer". backtrace_symbols_fd() does the same but writes the output to the given file descriptor.
Queries:
1. I am printing the backtrace in two different ways for each backtrace.cpp files. Which one should I stick to?
2. I couldn't add the optional dependency on libunwind.h, so can any hints/ideas be given as to how this has to be done?