Closed tsvstar closed 1 month ago
Can you try with version 0.6.2? If you’re on centos 7 dladdr1 should exist but there was an issue with it not being selected properly in 0.6.1.
On 0.6.2 the second problem (Unable to read object file main) is gone. The correct path to the binary is now used.
But the first problem (the core dump) still occurs in the same place.
Thanks, glad the other issue was resolved. A segfault is concerning, unfortunately I probably need to be able to repro to debug. I can spin up a centos 7 box later to try. Can you send the exact code you’re using just so I can be sure I am using the same setup? Another helpful data point, if it’s not too much work, would be to turn on address sanitizer (both for the library and binary) and see if it catches anything.
To answer the question at the end of the issue:
I'm not interested in object file information, but I'm interested in the line information. I'm not sure why this info is needed and if the library will still be workable without loading object files.
Object information has to be resolved in order to get line info. For a given instruction pointer in runtime address space, cpptrace must figure out what object it came from (either the main executable or a library .so), then any runtime address randomization must be reversed, then any offsets from the ELF must be reversed, and only then can the appropriate debug information be looked up in the executable/library object.
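The lookup described above can be sketched roughly as follows. This is a minimal illustration for glibc/ELF systems built on dl_iterate_phdr, not cpptrace's actual implementation; the object_address struct and the to_object_address function are hypothetical names chosen for this example.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info, ElfW (glibc)
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type: which loaded object an address belongs to and
// the address with the runtime load bias removed.
struct object_address {
    bool found = false;
    std::string object_path;       // empty for the main executable on glibc
    std::uintptr_t load_bias = 0;  // where the loader placed the object
    std::uintptr_t object_addr = 0;
};

namespace {
struct lookup_state {
    std::uintptr_t target;
    object_address result;
};

// Called once per loaded object (the executable and every shared library).
int visit_object(dl_phdr_info* info, std::size_t, void* data) {
    auto* state = static_cast<lookup_state*>(data);
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr)& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD) {
            continue; // only loadable segments are mapped into memory
        }
        const std::uintptr_t begin = info->dlpi_addr + ph.p_vaddr;
        const std::uintptr_t end = begin + ph.p_memsz;
        if (state->target >= begin && state->target < end) {
            state->result.found = true;
            state->result.object_path = info->dlpi_name ? info->dlpi_name : "";
            state->result.load_bias = info->dlpi_addr;
            // Undo the load bias to get an address that can be matched
            // against the object's own tables and debug information.
            state->result.object_addr = state->target - info->dlpi_addr;
            return 1; // non-zero stops the iteration
        }
    }
    return 0; // keep going
}
} // namespace

object_address to_object_address(const void* runtime_addr) {
    lookup_state state{reinterpret_cast<std::uintptr_t>(runtime_addr), {}};
    dl_iterate_phdr(visit_object, &state);
    return state.result;
}
```

Passing an instruction pointer from a captured trace to to_object_address gives the owning object and an object-relative address. The debug information for that address then has to be read from the object file itself, which is why the object file must be readable.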
I set up a CentOS 7 VM but was unable to reproduce. I used devtoolset-10 while testing.
Here are my results.
1) I created a small, simple project that adds the library and makes basic calls. It works fine for a "Debug" build.
A RelWithDebInfo build doesn't resolve symbols (symbol in the frames made by cpptrace::generate_trace() is empty).
The .print() output looks like "#0 0x0000000000406e24 at /devel/cpptrace/my/MyProject".
2) I made a RelWithDebInfo+sanitizer build of the library and added it to my project as an external lib. It cores with the same trace as in the first message. The sanitizer wasn't triggered.
3) I added a lot of debugging output (in particular, instrumenting every ctor/dtor/assignment of stacktrace_frame to track them).
This revealed a strange thing: first, multiple null_frame objects were created, then std::vector trace at symbols_with_libdwarf.cpp:92 created numerous copies of them, and surprisingly at that point both string values have a c_str() that points to inaccessible memory.
4) I embedded the library into the project (copied src, include and CMakeLists.txt, and added it as a subdirectory).
After a few fixes (paths, configuration), I was able to run it in pure Debug+sanitizer.
First, no CPPTRACE_GET_SYMBOLS_WITH_ and CPPTRACE_CPPTRACE_DEMANGLE_WITH_ symbols were defined, so details::resolve_frames did nothing. But immediately after that, it cored in cpptrace::detail::demangle because frame.symbol refers to inaccessible memory. Still no sanitizer report.
5) Then I explicitly defined CPPTRACE_GET_SYMBOLS_WITH_LIBDWARF and CPPTRACE_CPPTRACE_DEMANGLE_WITH_CXXABI, and now the library almost works. It doesn't core; it shows the correct address but doesn't show the symbol.
#0 cpptrace::detail::libdwarf::dwarf_resolver::retrieve_symbol (this=0x613000032380, cu_die=..., pc=187963276, dwversion=4, frame=..., inlines=std::vector of length 0) at src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:595
#1 in cpptrace::detail::libdwarf::dwarf_resolver::resolve_frame_core (this=0x613000032380, object_frame_info=..., frame=..., inlines=std::vector of length 0) at src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:1017
#2 in cpptrace::detail::libdwarf::dwarf_resolver::resolve_frame (this=0x613000032380, frame_info=...) at /src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:1053
#3 in cpptrace::detail::libdwarf::dwarf_resolver::perform_dwarf_fission_resolution (this=0x613000034840, cu_die=..., dwo_name=..., object_frame_info=..., frame=..., inlines=std::vector of length 0) at src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:990
#4 in cpptrace::detail::libdwarf::dwarf_resolver::resolve_frame_core (this=0x613000034840, object_frame_info=..., frame=..., inlines=std::vector of length 0) at src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:1014
#5 in cpptrace::detail::libdwarf::dwarf_resolver::resolve_frame (this=0x613000034840, frame_info=...) at src/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:1053
#6 in cpptrace::detail::libdwarf::resolve_frames (frames=std::vector of length 17 = {...}) at src/cpptrace/src/symbols/symbols_with_libdwarf.cpp:103
#7 in cpptrace::detail::resolve_frames (frames=std::vector of length 17 = {...}) at src/cpptrace/src/symbols/symbols_core.cpp:134
#8 in cpptrace::raw_trace::resolve (this=0x7fbdf228e280) at src/cpptrace/src/cpptrace.cpp:48
preprocess_subprograms always returns an empty vector.
UPD: I added the same options as in prod to the simple project, and now the sanitizer triggers on heap-use-after-free. Attachments: MyProject.zip, sanitizer_report.txt
This information is immensely helpful, thanks so much for taking the time to look into this. It's especially helpful to know it might be an issue in the split dwarf code path. Something does definitely seem wrong so I'll try to understand what's going on here.
I was able to reproduce a sanitizer error in your setup with a RelWithDebInfo build. I built cpptrace with sanitizers and then used a trimmed CMake file:
cmake_minimum_required(VERSION 3.10)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-omit-frame-pointer -fno-optimize-sibling-calls -gdwarf-4 -gsplit-dwarf")
project(repro CXX)
list(APPEND CMAKE_PREFIX_PATH "path/to/projects/cpptrace/build/foo")
find_package(cpptrace REQUIRED)
add_executable(repro main.cpp)
target_link_libraries(repro PRIVATE cpptrace::cpptrace)
target_link_options(repro PRIVATE -gsplit-dwarf -fsanitize=address)
==3794==ERROR: AddressSanitizer: heap-use-after-free on address 0x6060000e3da0 at pc 0x55d33537e687 bp 0x7ffdfdd0d870 sp 0x7ffdfdd0d860
READ of size 8 at 0x6060000e3da0 thread T0
#0 0x55d33537e686 in dwarf_dealloc_die /mnt/c/Users/rifkin/home/projects/cpptrace/build/_deps/libdwarf-src/src/lib/libdwarf/dwarf_alloc.c:786
#1 0x55d33531cdaf in cpptrace::detail::libdwarf::die_object::~die_object() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/../../utils/dwarf.hpp:71
#2 0x55d33532fda3 in cpptrace::detail::libdwarf::skeleton_info::~skeleton_info() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:74
#3 0x55d335346a4c in cpptrace::detail::optional<cpptrace::detail::libdwarf::skeleton_info, 0>::reset() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/../../binary/../utils/utils.hpp:293
#4 0x55d335337085 in cpptrace::detail::optional<cpptrace::detail::libdwarf::skeleton_info, 0>::~optional() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/../../binary/../utils/utils.hpp:212
#5 0x55d33532387d in cpptrace::detail::libdwarf::dwarf_resolver::~dwarf_resolver() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:216
#6 0x55d3353239ad in cpptrace::detail::libdwarf::dwarf_resolver::~dwarf_resolver() /mnt/c/Users/rifkin/home/projects/cpptrace/src/symbols/dwarf/dwarf_resolver.cpp:216
#7 0x55d33534bee6 in std::default_delete<cpptrace::detail::libdwarf::dwarf_resolver>::operator()(cpptrace::detail::libdwarf::dwarf_resolver*) const /usr/include/c++/11/bits/unique_ptr.h:85
#8 0x55d33533ef60 in std::unique_ptr<cpptrace::detail::libdwarf::dwarf_resolver, std::default_delete<cpptrace::detail::libdwarf::dwarf_resolver> >::~unique_ptr() /usr/include/c++/11/bits/unique_ptr.h:361
#9 0x55d335367323 in std::pair<unsigned long long const, std::unique_ptr<cpptrace::detail::libdwarf::dwarf_resolver, std::default_delete<cpptrace::detail::libdwarf::dwarf_resolver> > >::~pair() /usr/include/c++/11/bits/stl_pair.h:211
I found the issue: it's a very subtle problem with destructor ordering that I'd run into previously but hadn't handled for split DWARF. The fix should be the following; I'll commit it later.
diff --git a/src/symbols/dwarf/dwarf_resolver.cpp b/src/symbols/dwarf/dwarf_resolver.cpp
index 7627cd6..1c47a17 100644
--- a/src/symbols/dwarf/dwarf_resolver.cpp
+++ b/src/symbols/dwarf/dwarf_resolver.cpp
@@ -208,6 +208,8 @@ namespace libdwarf {
}
// subprograms_cache needs to be destroyed before dbg otherwise there will be another use after free
subprograms_cache.clear();
+ split_full_cu_resolvers.clear();
+ skeleton.reset();
if(aranges) {
dwarf_dealloc(dbg, aranges, DW_DLA_LIST);
}
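The ordering problem behind this fix can be illustrated with a small standalone sketch. The context, handle, and resolver types below are hypothetical stand-ins (roughly a Dwarf_Debug, an object tied to it, and the resolver that owns both), not cpptrace's real classes. The global events vector exists only so the teardown order can be observed.

```cpp
#include <string>
#include <vector>

// Records teardown events so the destruction order can be inspected.
std::vector<std::string> events;

// Hypothetical stand-in for a library context such as a Dwarf_Debug.
struct context {
    ~context() { events.push_back("context freed"); }
};

// Hypothetical stand-in for an object that must be released while its
// context is still alive.
struct handle {
    context* owner;
    explicit handle(context* c) : owner(c) {}
    handle(handle&& other) noexcept : owner(other.owner) { other.owner = nullptr; }
    handle(const handle&) = delete;
    handle& operator=(const handle&) = delete;
    ~handle() {
        if (owner) {
            events.push_back("handle released");
        }
    }
};

class resolver {
    context* ctx;
    std::vector<handle> cache;
public:
    resolver() : ctx(new context) { cache.emplace_back(ctx); }
    resolver(const resolver&) = delete;
    resolver& operator=(const resolver&) = delete;
    ~resolver() {
        // Members are destroyed only after this body finishes, so anything
        // that depends on ctx has to be released explicitly first.
        cache.clear();
        delete ctx;
    }
};
```

Without the cache.clear() call, the vector's destructor would run after delete ctx, and in a real library each handle's cleanup would then touch memory owned by an already freed context. That is the same shape of bug as the heap-use-after-free reported above.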
It seems that fixes the heap-use-after-free issue.
But in production I'm still not able to get symbols. I mean, I correctly see file, line and column, but the symbol is empty (please take a look at item 5 in my earlier message).
Turning on the dump_dwarf and trace_dwarf flags gives nothing valuable.
Starting resolution for ..../src/file.c.dwo 0b34178c
..../src/file.c.dwo
b34178c
End walk_dbg
and then hundreds of End walk_die_list lines.
The given path and the file file.c.dwo exist.
Hi, there’s a chance this could be related to an upstream libdwarf issue regarding how rangelists are handled. Does item 5 happen when you haven’t pulled cpptrace into your project directly and instead link to a copy built elsewhere?
Two issues occurred when I tried to use the library for a big project on Linux. Maybe they depend on each other, maybe not. The project is built on CentOS Linux 7 using gcc-11.2 and is statically linked.
1. Coredump
Nothing special is done. I just try to get a stacktrace.
2. Fail to read object file.
The same generate_trace call reports multiple errors "Cpptrace internal error: Unable to read object file main" before the core. Although I'm not able to step into cpptrace with a debugger (no matter whether it was built in RelWithDebInfo, Debug or Release mode), modifying it reveals that this happens inside the #else branch of object.cpp (!defined(CPPTRACE_HAS_DL_FIND_OBJECT) && !defined(HAS_DLADDR1)).
"main" is what is mentioned in CMakeLists.txt in target_link_libraries(main ....) and is also the name of the directory. It is also mentioned in the boost::stacktrace output (... in main) for that stacktrace. But no "main" file exists at all; moreover, in the current directory the executable has a different name (cserver).
I'm not interested in object file information, but I'm interested in the line information. I'm not sure why this info is needed and if the library will still be workable without loading object files.