jeremy-rifkin / cpptrace

Simple, portable, and self-contained stacktrace library for C++11 and newer
MIT License

Stacktrace empty when catching exception on arm #134

Closed. Silex closed this issue 2 months ago

Silex commented 3 months ago

Hello,

Thanks for the lib, I like the design.

I have a bug with the following snippet:

  try
  {
    throw cpptrace::runtime_error("oh noes");
  }
  catch(cpptrace::exception& e)
  {
    std::cout << e.message() << std::endl;
    e.trace().print(std::cerr, false);
    std::clog << "-- what" << std::endl;
    std::cout << e.what() << std::endl;
  }

  std::clog << "-- callback" << std::endl;
  std::clog << cpptrace::generate_trace().to_string(false) << std::endl;

It produces the following output

2024-06-05T10:37:15.945+00:00 axis-accc8e6929f3 [ ERR     ] test[25257]: Exception: oh noes
2024-06-05T10:37:16.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: Stack trace (most recent call first):
2024-06-05T10:37:16.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: <empty trace>
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: -- what
2024-06-05T10:37:18.945+00:00 axis-accc8e6929f3 [ ERR     ] test[25257]: Exception: oh noes: Stack trace (most recent call first): <empty trace>
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: -- callback
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #0 0x004828d1 in storage::subscribe(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /opt/app/storage.cpp:61:40
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #1 0x0047f541 in application::setup_storage() at /opt/app/application.cpp:62:22
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #2 0x0047f6d9 in application::run() at /opt/app/application.cpp:27:16
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #3 0x0047ed0b in main at /opt/app/main.cpp:55:12
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #4 0x767da549 at /usr/lib/libc.so.6
2024-06-05T10:37:19.945+00:00 axis-accc8e6929f3 [ INFO    ] test[25257]: #5 0x767da5e5 at /usr/lib/libc.so.6

As you can see, generating a trace directly works, but the trace is empty when catching exceptions.

This is a special environment: the code is cross-compiled for armv7hf/aarch64 (it runs on AXIS cameras).

Here is the command line that builds the application:

arm-linux-gnueabihf-g++  -mthumb -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a9 -fstack-protector-strong  -O2 -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security -Werror=format-security --sysroot=/opt/axis/acapsdk/sysroots/armv7hf -L/opt/axis/acapsdk/sysroots/armv7hf/usr/lib -O2 -g -pipe -std=c++23 -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/axsdk -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/vdo -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/gio-unix-2.0 -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/glibmm-2.68 -I/opt/axis/acapsdk/sysroots/armv7hf/usr/lib/glibmm-2.68/include -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/glib-2.0 -I/opt/axis/acapsdk/sysroots/armv7hf/usr/lib/glib-2.0/include -I/opt/axis/acapsdk/sysroots/armv7hf/usr/include/sigc++-3.0 -I/opt/axis/acapsdk/sysroots/armv7hf/usr/lib/sigc++-3.0/include -L./lib -Wl,--no-as-needed,-rpath,'$ORIGIN/lib' -lcpptrace -laxstorage -laxparameter -lvdostream -lgio-2.0 -lsystemd -lglibmm-2.68 -lgobject-2.0 -lglib-2.0 -lsigc-3.0 application.cpp main.cpp parameters.cpp storage.cpp stream.cpp -o test; \

And attached is the CMakeCache.txt that was used to build cpptrace, so you can look at the CXXFLAGS etc.

CMakeCache.txt

jeremy-rifkin commented 3 months ago

Hello, thanks for the bug report! I tried the code locally and unfortunately haven't been able to reproduce the problem, though this definitely looks like a bug. It seems to be an issue with how the cpptrace exception objects handle traces internally rather than a problem with tracing itself. There's also a small chance of unexpected behavior from arm-linux-gnueabihf-g++ in how an exception object is constructed on a throw.

What version of the library are you using?

Silex commented 3 months ago

Latest stable from git, here's how I build it:

# Build cpptrace
RUN git clone --branch v0.6.0 https://github.com/jeremy-rifkin/cpptrace.git && \
    mkdir cpptrace/build && \
    cd cpptrace/build && \
    . /opt/axis/acapsdk/environment-setup* && \
    cmake .. \
    -D BUILD_SHARED_LIBS=1 \
    -D CMAKE_BUILD_TYPE=RelWithDebInfo \
    -D CMAKE_INSTALL_PREFIX=$PREFIX \
    -D CMAKE_CXX_COMPILER=${CXX%-g++*}-g++ \
    -D CMAKE_CXX_FLAGS="${CXX#*-g++}" \
    -D CMAKE_C_COMPILER=${CC%-gcc*}-gcc \
    -D CMAKE_C_FLAGS="${CC#*-gcc}" && \
    make -j$(nproc) && \
    make install

Without BUILD_SHARED_LIBS it would only build the static version (.a), which resulted in undefined references at link time. I didn't understand why, so I built the .so instead.

Silex commented 2 months ago

@jeremy-rifkin: given that the normal trace works, is there a way to use that when throwing an exception? I mean, by disabling the "lazy-exception" mechanism?

If not, can you point me toward what I should modify if I want that?

jeremy-rifkin commented 2 months ago

Hey, thanks for your patience. I'd still like to solve this but I haven't been able to look into this further or reproduce yet.

As for modifying the exception behavior: the way the library is set up, the cpptrace::exception/error classes are meant to serve as a reference implementation for how something like this can be done. It should be easy to roll your own that skips the raw trace and just generates a full trace right off the bat, e.g.

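#include <string>
#include <cpptrace/cpptrace.hpp>

class my_exception {
    std::string message;
public:
    // The default argument is evaluated where the exception is constructed,
    // so the full trace is generated eagerly instead of being resolved lazily.
    my_exception(
        std::string message,
        cpptrace::stacktrace trace = cpptrace::generate_trace()
    ) : message(message + "\n" + trace.to_string()) {}
    const char* what() const noexcept {
        return message.c_str();
    }
};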

(haven't tested but that should work)
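
A variant of the same idea, sketched so that it can stand in for std::runtime_error in existing catch blocks. This is only an illustration: the myapp namespace, the message() accessor, and capture_trace_text() are hypothetical names, and capture_trace_text() is a stub standing in for cpptrace::generate_trace().to_string() so the sketch compiles without the library.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

namespace myapp {

// Hypothetical stub: stands in for cpptrace::generate_trace().to_string()
// so this sketch builds without cpptrace. Replace it with the real call.
inline std::string capture_trace_text() {
    return "Stack trace (most recent call first):\n<frames would appear here>";
}

// Derives from std::runtime_error, so handlers written for
// std::runtime_error or std::exception keep working unchanged.
class runtime_error : public std::runtime_error {
    std::string user_message_;
public:
    // The default argument is evaluated when the exception is constructed,
    // so the trace text is captured eagerly at the throw site.
    explicit runtime_error(
        std::string message,
        std::string trace = capture_trace_text()
    ) : std::runtime_error(message + "\n" + trace),  // base is initialized first
        user_message_(std::move(message)) {}

    // The message alone, without the appended trace.
    const std::string& message() const noexcept { return user_message_; }
};

} // namespace myapp
```

Here what() returns the message followed by the trace text, while message() returns only the original message. With cpptrace linked, the default argument would presumably become cpptrace::generate_trace().to_string() in place of the stub.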

Silex commented 2 months ago

Thanks. What I really liked in your lib is just replacing std::runtime_error with cpptrace::runtime_error (etc.) and having the workflow behave the way you'd expect.

Of course I could quickly redo them like myapp::runtime_error but the idea was to do some patching to cpptrace::exception prior to building the lib so it'd behave like your snippet above.

jeremy-rifkin commented 2 months ago

I think I might have a fix which I've just pushed to jr/try-fix-134. Do you think you could try 2e981c89a5af9d91e1367df3d4f141508d443a72 locally and see if this fixes things for you?

Silex commented 2 months ago

@jeremy-rifkin: thanks!

So, I discovered that cameras running on aarch64 don't have this problem (v0.6.0 works).

On armv7hf, your fix works!

v0.6.0:

2024-06-13T07:49:28.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: Application: run
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: Stack trace (most recent call first):
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #0 0x0048774b in storage::subscribe(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /opt/app/storage.cpp:46:40
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #1 0x004844f1 in application::setup_storage() at /opt/app/application.cpp:62:22
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #2 0x00484683 in application::run() at /opt/app/application.cpp:27:16
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #3 0x00483ce5 in main at /opt/app/main.cpp:54:12
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #4 0x7682a549 at /usr/lib/libc.so.6
2024-06-13T07:49:30.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: #5 0x7682a5e5 at /usr/lib/libc.so.6

2024-06-13T07:49:31.324+00:00 axis-accc8ee2384f [ ERR     ] arqivis[18940]: Exception: oh noes
2024-06-13T07:49:31.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: Stack trace (most recent call first):
2024-06-13T07:49:31.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[18940]: <empty trace>

jr/try-fix-134:

2024-06-13T07:12:21.324+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: Application: run
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: Stack trace (most recent call first):
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #0 0x00427749 in storage::subscribe(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /opt/app/storage.cpp:46:40
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #1 0x004244f1 in application::setup_storage() at /opt/app/application.cpp:62:22
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #2 0x00424683 in application::run() at /opt/app/application.cpp:27:16
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #3 0x00423ce5 in main at /opt/app/main.cpp:54:12
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #4 0x7685a549 at /usr/lib/libc.so.6
2024-06-13T07:12:23.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #5 0x7685a5e5 at /usr/lib/libc.so.6

2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ ERR     ] arqivis[15220]: Exception: oh noes
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: Stack trace (most recent call first):
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #0 0x004277c9 in storage::subscribe(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /opt/app/storage.cpp:48:3
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #1 0x004244f1 in application::setup_storage() at /opt/app/application.cpp:62:22
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #2 0x00424683 in application::run() at /opt/app/application.cpp:27:16
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #3 0x00423ce5 in main at /opt/app/main.cpp:54:12
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #4 0x7685a549 at /usr/lib/libc.so.6
2024-06-13T07:12:24.323+00:00 axis-accc8ee2384f [ INFO    ] arqivis[15220]: #5 0x7685a5e5 at /usr/lib/libc.so.6

To be honest I really don't get why your fix works. Can you shed some light?

The only diff I can see is that armv7hf is built with:

arm-linux-gnueabihf-g++ -mthumb -mfpu=neon -mfloat-abi=hard -mcpu=cortex-a9

And that aarch64 is built with:

aarch64-linux-gnu-g++ -mcpu=cortex-a53 -march=armv8-a+crc+crypto -mbranch-protection=standard

jeremy-rifkin commented 2 months ago

Awesome! I am glad it works. The fix is very surprising. My best understanding is that on armv7hf the unwinder needs unwind tables to walk the stack, and when get_raw_trace_and_absorb is noexcept, unwind tables aren't generated for that function. Credit to @easyaspi314 for this fix :)
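
To make the noexcept distinction concrete, here is a minimal sketch. It is not the cpptrace code or the actual fix, and both function names are hypothetical. It only shows that the compiler knows statically which functions cannot throw; as far as I understand it, that knowledge is what can let GCC on 32-bit ARM skip unwind information for such a function unless something like -funwind-tables is passed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper that may throw: the compiler must assume an exception
// can propagate through it, so unwind information has to be kept.
inline int capture_may_throw(int depth) {
    if (depth < 0) {
        throw std::invalid_argument("depth must be non-negative");
    }
    return depth;
}

// Hypothetical helper declared noexcept: no exception can leave it, so the
// compiler is free to treat it as a function that never needs unwinding.
inline int capture_nothrow(int depth) noexcept {
    return depth < 0 ? 0 : depth;
}

// The noexcept operator exposes what the compiler knows at compile time.
static_assert(!noexcept(capture_may_throw(0)), "may throw");
static_assert(noexcept(capture_nothrow(0)), "cannot throw");
```

Whether unwind information is actually emitted depends on the target and flags, so this only illustrates the language-level difference between the two declarations.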

I’ll go ahead and merge this into dev and it’ll be fixed in the next release.

Silex commented 2 months ago

Great! Yeah, for the explanation I don't fully get it but https://stackoverflow.com/questions/26079903/noexcept-stack-unwinding-and-performance helped me understand what it was about.

Silex commented 2 months ago

@jeremy-rifkin: any ETA for the next release?

jeremy-rifkin commented 2 months ago

I'll aim to do a patch release this weekend!

Silex commented 2 months ago

Thanks, ping me when it's done :-)

jeremy-rifkin commented 2 months ago

Apologies for the delay, release will be today

jeremy-rifkin commented 2 months ago

Release is now out, thanks for your patience!

Silex commented 2 months ago

Thanks!