jeremy-rifkin / cpptrace

Simple, portable, and self-contained stacktrace library for C++11 and newer
MIT License

Support for musl/uclibc-ng #128

Closed xokdvium closed 3 months ago

xokdvium commented 3 months ago

Currently cpptrace fails to build against C standard libraries other than glibc because it uses dladdr1, which isn't implemented by musl-libc or uclibc-ng.

For both of those libs it should be possible to use the implementation for Apple, which uses dladdr. I see that there's now a fallback from _dl_find_object (guarded by CPPTRACE_HAS_DL_FIND_OBJECT) to dladdr1. It should be possible to implement a more robust fallback scheme on Linux: _dl_find_object -> dladdr1 -> dladdr. This should be enough to support all implementations.
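
To make the last tier of that fallback chain concrete, here is a minimal sketch of an object lookup that uses only plain dladdr. The object_info struct and the lookup_with_dladdr name are hypothetical and exist only for this illustration; this is not cpptrace's actual implementation.

```cpp
#include <dlfcn.h>   // dladdr, Dl_info
#include <cstdint>
#include <string>

// Hypothetical result type for this sketch; it is not cpptrace's object_frame.
struct object_info {
    std::string path;           // from dli_fname: pathname of the containing object
    std::uintptr_t base = 0;    // from dli_fbase: address the object is loaded at
    std::uintptr_t offset = 0;  // address minus base
    bool ok = false;            // false if no loaded object contains the address
};

// Hypothetical helper: resolve an address to its containing object with dladdr only.
inline object_info lookup_with_dladdr(const void* address) {
    object_info result;
    Dl_info info;
    // dladdr returns 0 when the address is not inside any loaded object
    if(dladdr(address, &info) == 0 || info.dli_fbase == nullptr) {
        return result;
    }
    result.path = info.dli_fname != nullptr ? info.dli_fname : "";
    result.base = reinterpret_cast<std::uintptr_t>(info.dli_fbase);
    result.offset = reinterpret_cast<std::uintptr_t>(address) - result.base;
    result.ok = true;
    return result;
}
```

Whether dli_fname can be trusted for the main executable depends on the C library, so the path field should be treated with that caveat in mind.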

I've tested that the apple implementation works fine for musl by applying the following (very crude) patch:

diff --git a/src/binary/object.hpp b/src/binary/object.hpp
index 2c812e5..3e988fa 100644
--- a/src/binary/object.hpp
+++ b/src/binary/object.hpp
@@ -22,7 +22,7 @@

 namespace cpptrace {
 namespace detail {
-    #if IS_LINUX
+    #if IS_LINUX && !IS_MUSL
     inline std::string resolve_l_name(const char* l_name) {
         if(l_name != nullptr && l_name[0] != 0) {
             return l_name;
@@ -78,7 +78,7 @@ namespace detail {
         return frame;
     }
     #endif
-    #elif IS_APPLE
+    #elif IS_APPLE || IS_MUSL
     // macos doesn't have dladdr1 but it seems its dli_fname behaves more sensibly?
     // dladdr queries are needed to get pre-ASLR addresses and targets to run addr2line on
     inline object_frame get_frame_object_info(frame_ptr address) {
diff --git a/src/utils/common.hpp b/src/utils/common.hpp
index ad4c191..a2c9dfd 100644
--- a/src/utils/common.hpp
+++ b/src/utils/common.hpp
@@ -18,6 +18,21 @@
  #error "Unexpected platform"
 #endif

+#define IS_MUSL 0
+#ifndef _GNU_SOURCE
+  #define _GNU_SOURCE
+  #include <features.h>
+  #ifndef __USE_GNU
+    #define IS_MUSL 1
+  #endif
+  #undef _GNU_SOURCE
+#else
+  #include <features.h>
+  #ifndef __USE_GNU
+    #define IS_MUSL 1
+  #endif
+#endif
+
 #define IS_CLANG 0
 #define IS_GCC 0
 #define IS_MSVC 0
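
As an alternative to probing __USE_GNU, the check could be phrased in terms of the C library's own identification macros. The sketch below assumes that glibc defines __GLIBC__, that uClibc-based libraries define __UCLIBC__ (in addition to __GLIBC__), and that musl defines neither. The EXAMPLE_HAS_DLADDR1 macro is a hypothetical name, not one used by cpptrace.

```cpp
#include <climits>  // any libc header works; it pulls in the libc's feature macros

// Hypothetical macro for illustration: 1 only when the C library is genuine glibc.
// uClibc also defines __GLIBC__, so it has to be excluded explicitly.
#if defined(__GLIBC__) && !defined(__UCLIBC__)
  #define EXAMPLE_HAS_DLADDR1 1
#else
  #define EXAMPLE_HAS_DLADDR1 0
#endif

// Same information as a constant, usable in ordinary C++ conditions.
constexpr bool example_has_dladdr1 = (EXAMPLE_HAS_DLADDR1 != 0);
```

A build-system check that tries to compile a call to dladdr1 would be more direct than macro sniffing; the macro version is shown only because it stays within the header.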

jeremy-rifkin commented 3 months ago

Hi, thanks for opening this. This sounds reasonable to me, though there is a subtle reason why dladdr1 is needed on Linux, and I need to check whether dladdr on those other C library implementations behaves in an acceptable way.

I switched to using dladdr1 in https://github.com/jeremy-rifkin/cpptrace/commit/b125248b321aeef8d639f83ab3eab7b5af36dc0c due to dli_fname from glibc being unreliable for the current executable as it relies on argv[0]. dladdr is still used on apple because apple doesn't have dladdr1 and dli_fname ended up being reliable.
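
For readers unfamiliar with the difference, a rough sketch of the dladdr1 approach on glibc follows. It reads the object name from the link map instead of dli_fname. The function name object_path_via_link_map is hypothetical, and falling back to /proc/self/exe when l_name is empty is an assumption made for this illustration, not a statement of what cpptrace does.

```cpp
#include <dlfcn.h>    // dladdr1, RTLD_DL_LINKMAP (glibc extensions)
#include <link.h>     // struct link_map
#include <unistd.h>   // readlink
#include <climits>    // PATH_MAX
#include <string>

// Hypothetical helper: name of the object containing `address`, taken from the
// link map rather than dli_fname. Assumes glibc. Returns "" on failure.
inline std::string object_path_via_link_map(const void* address) {
    Dl_info info;
    link_map* map = nullptr;
    if(dladdr1(address, &info, reinterpret_cast<void**>(&map), RTLD_DL_LINKMAP) == 0
       || map == nullptr) {
        return "";
    }
    if(map->l_name != nullptr && map->l_name[0] != 0) {
        return map->l_name;
    }
    // An empty l_name typically denotes the main executable on glibc.
    // Assumption for this sketch: ask the kernel for its path instead of using argv[0].
    char buffer[PATH_MAX + 1];
    ssize_t size = readlink("/proc/self/exe", buffer, PATH_MAX);
    return size > 0 ? std::string(buffer, static_cast<std::string::size_type>(size)) : "";
}
```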

xokdvium commented 3 months ago

I've taken a quick look at musl's implementation of dladdr and it seems quite reasonable. I don't see any argv[0] quirks either.

The implementation can be found here: https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c?h=v1.2.5&id=0784374d561435f7c787a555aeab8ede699ed298#n2294.

From my experiments, the stack trace for an exception thrown from boost::asio::thread_pool looks exactly the same as when building with glibc. I'm not sure if the corner case you mentioned would show up in such an example, but either way I think it's quite promising.

....
#18 0x55df4f8ff3600000 in boost::asio::thread_pool::thread_function::operator()() at /build/source/build/include/boost/asio/impl/thread_pool.ipp:39:19
#19 0x55df4f8ff3280000 in boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() at /build/source/build/include/boost/asio/detail/posix_thread.hpp:86:7
#20 0x55df4f8fd8400000 in boost_asio_detail_posix_thread_function at /build/source/build/include/boost/asio/detail/impl/posix_thread.ipp:74:13
#21 0x7fbec18864430000 at /nix/store/y4jf3af835js0hgk2795arv580jdx3v8-musl-1.2.3/lib/ld-musl-x86_64.so.1
#22 0x7fbec188898e0000 at /nix/store/y4jf3af835js0hgk2795arv580jdx3v8-musl-1.2.3/lib/ld-musl-x86_64.so.1

jeremy-rifkin commented 3 months ago

Thanks! I've pushed a change to dev that falls back to normal dladdr. I've verified with a musl build that an abnormal argv[0] doesn't cause a problem: running exec -a demo build/demo still generated traces correctly. I'm going to do some more research to try to figure out how best to do this fallback safely.

jeremy-rifkin commented 3 months ago

I have discovered dladdr1 was added to glibc in 2003 and I don't think I have to worry about supporting anyone using a glibc more than two decades old.

xokdvium commented 3 months ago

I'll hijack this issue a bit. It'd be nice to see this fix and https://github.com/jeremy-rifkin/cpptrace/commit/d7c19a5544fb9de405794b4d07b99d0c6e30f579 in the upcoming release, since I'd like to package this library for https://github.com/nixos/nixpkgs. With all the portability issues resolved in a stable release tag it would be a trivial packaging task. I can go ahead and pick from the latest trunk, but I hope this can warrant a minor version bump?

Thanks a lot!

jeremy-rifkin commented 3 months ago

I'm hoping to do a release soon, basically as soon as I can implement #129!

jeremy-rifkin commented 3 months ago

Thanks for your patience, I've released v0.6.0