Open gdh1995 opened 1 month ago
Probably what we need to do is change gimli to treat 0 as a tombstone address, so that it skips rows until the next valid address.
The ELF file is private, so sorry, I cannot upload it. How can I help test treating 0 as a tombstone address?
You can try changing this line to `self.tombstone = address == tombstone_address || address == 0;`.
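To illustrate what the suggested one-line change does, here is a minimal, self-contained sketch. It is not gimli's actual code: the `Op` enum, the `rows` function, and the sample line numbers 10 and 20 are hypothetical stand-ins. Only the condition `address == tombstone_address || address == 0` comes from the comment above.

```rust
/// Simplified stand-ins for line-program instructions (not gimli's real types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    /// Plays the role of DW_LNE_set_address.
    SetAddress(u64),
    /// Advances the address, then emits a row for `line`.
    Row { advance: u64, line: u32 },
}

/// Runs `ops` and returns the (address, line) rows that are kept.
/// Rows following a SetAddress to `tombstone_address` or to 0 are skipped
/// until the next SetAddress with a valid address.
fn rows(ops: &[Op], tombstone_address: u64) -> Vec<(u64, u32)> {
    let mut address = 0u64;
    let mut tombstone = false;
    let mut out = Vec::new();
    for op in ops {
        match *op {
            Op::SetAddress(a) => {
                // The condition suggested in the thread.
                tombstone = a == tombstone_address || a == 0;
                address = a;
            }
            Op::Row { advance, line } => {
                address = address.wrapping_add(advance);
                if !tombstone {
                    out.push((address, line));
                }
            }
        }
    }
    out
}

fn main() {
    let ops = [
        Op::SetAddress(0x4e31530),
        Op::Row { advance: 0, line: 1263 },
        // A removed function whose address was left as 0.
        Op::SetAddress(0),
        Op::Row { advance: 8, line: 10 },
        Op::SetAddress(0x4e31660),
        Op::Row { advance: 0, line: 20 },
    ];
    for (addr, line) in rows(&ops, u64::MAX) {
        println!("{addr:#x} line {line}");
    }
}
```

In this model, the row at address 8 is dropped, so a sequence never mixes a near-zero address with real addresses, which is the kind of mix that could otherwise produce a range whose end lies before its beginning.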
Great! It works well and all 50K+ addresses are parsed successfully.
BTW, how can I output a full function name like `google::LogMessage::LogStream::LogStream(char*, int, long)` instead of a plain `LogStream`? My arguments are: `cat /tmp/pprof2702424.sym | time ~/github/addr2line/target/release/addr2line -f -C -e /path/to/main -a -i`.
Um, it's a bit strange that sometimes this `addr2line` succeeds in demangling, but for quite a number of other addresses it doesn't.
I'm interested in finding out which symbols it can't demangle. Can you give a few of the mangled symbols that it fails on (obtained without the `-C`)?
You could try changing this code to:
demangle(name.as_ref(), gimli::DW_LANG_Rust)
.or_else(|| demangle(name.as_ref(), gimli::DW_LANG_C_plus_plus))
.map(Cow::from)
.unwrap_or(name)
That is, ignore the language and always try demangling Rust then C++ (or do only C++ if you prefer).
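A minimal sketch of how that fallback chain behaves, assuming toy demanglers. The `Lang` enum and this `demangle` function are hypothetical stand-ins that only look at the symbol prefix; they are not the crate's real demangling functions. The point is the `Option` chaining and the `Cow` fallback to the original name.

```rust
use std::borrow::Cow;

/// Hypothetical language tag standing in for the DWARF language constants.
#[derive(Clone, Copy)]
enum Lang {
    Rust,
    Cpp,
}

/// Toy demangler: it only recognizes a prefix and tags the name.
/// A real demangler would decode the symbol instead.
fn demangle(name: &str, lang: Lang) -> Option<String> {
    match lang {
        Lang::Rust if name.starts_with("_R") => Some(format!("rust:{name}")),
        Lang::Cpp if name.starts_with("_Z") => Some(format!("c++:{name}")),
        _ => None,
    }
}

/// Ignores the recorded language: tries Rust, then C++, and returns the
/// input unchanged when neither demangler accepts it.
fn demangle_any(name: Cow<'_, str>) -> Cow<'_, str> {
    demangle(name.as_ref(), Lang::Rust)
        .or_else(|| demangle(name.as_ref(), Lang::Cpp))
        .map(Cow::from)
        .unwrap_or(name)
}

fn main() {
    for name in ["_ZN6google10LogMessage9LogStreamC1EPcil", "LogStream"] {
        println!("{name} -> {}", demangle_any(Cow::from(name)));
    }
}
```

Note that a bare name such as `LogStream` passes through unchanged: no demangler can recover the namespace or arguments if the mangled name was never available.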
Um, the `name: Cow<'_, str>` is `demangle: "LogStream"` in `println!("demangle: {:?}", name);`.
`llvm-addr2line-10 -f -e /path/to/main -a -i`:
0x4e31559
basic_ios
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/basic_ios.h:461
_ZN6google10LogMessage9LogStreamC1EPcil
/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h:1262
This project's `~/github/addr2line/target/debug/addr2line -f -e /path/to/main -a -i`:
0x0000000004e31559
basic_ios
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/basic_ios.h:461
LogStream
/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h:1262
Ah I see, it's not failing to demangle, it's failing to get the mangled name in the first place. Can you run `llvm-dwarfdump` on the file and give the output around where the string `_ZN6google10LogMessage9LogStreamC1EPcil` occurs?
Another address reports only a partial name:
This project vs. LLVM-18:
# this project; NOTE that I ran `target/debug/addr2line -f -e -a -i` without `-C`
0x0000000004e48cd0
_M_insert<const std::basic_string<char> &, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::basic_string<char>, true> > > >
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/hashtable.h:1843
# llvm-18 without `-C`:
0x4e48cd0
_ZNSt10_HashtableINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_SaIS5_ENSt8__detail9_IdentityESt8equal_toIS5_ESt4hashIS5_ENS7_18_Mod_range_hashingENS7_20_Default_ranged_hashENS7_20_Prime_rehash_policyENS7_17_Hashtable_traitsILb1ELb1ELb1EEEE9_M_insertIRKS5_NS7_10_AllocNodeISaINS7_10_Hash_nodeIS5_Lb1EEEEEEEESt4pairINS7_14_Node_iteratorIS5_Lb1ELb1EEEbEOT_RKT0_St17integral_constantIbLb1EEm
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/hashtable.h:1843
I think both of those are because we are failing to find `DW_AT_linkage_name` and falling back to use `DW_AT_name` instead. So I need to see the `llvm-dwarfdump` output where the linkage name occurs in order to determine what the problem is.
Another possibility is that the linkage name isn't in the DWARF at all, and `llvm-addr2line` is getting it from the symbol table instead.
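The lookup order described here can be sketched as follows. This is a simplified model, not the crate's code: `DwarfNames` and `pick_name` are hypothetical names, and the symbol-table fallback is only reached when DWARF supplies no name at all.

```rust
/// Names that may be attached to a function's DWARF entry (hypothetical type).
#[derive(Debug, Clone, Copy)]
struct DwarfNames<'a> {
    /// DW_AT_linkage_name: the mangled name, if the compiler emitted it.
    linkage_name: Option<&'a str>,
    /// DW_AT_name: the short source-level name.
    name: Option<&'a str>,
}

/// Prefers the linkage name, falls back to the plain name, and consults
/// the symbol table only when DWARF has neither.
fn pick_name<'a>(dwarf: DwarfNames<'a>, symbol: Option<&'a str>) -> Option<&'a str> {
    dwarf.linkage_name.or(dwarf.name).or(symbol)
}

fn main() {
    let symbol = Some("_ZN6google10LogMessage9LogStreamC1EPcil");

    // The situation in this issue: only DW_AT_name is present.
    let without_linkage = DwarfNames { linkage_name: None, name: Some("LogStream") };
    println!("{:?}", pick_name(without_linkage, symbol));

    // With DW_AT_linkage_name present, the mangled name wins.
    let with_linkage = DwarfNames { linkage_name: symbol, ..without_linkage };
    println!("{:?}", pick_name(with_linkage, symbol));
}
```

With this order, a short `DW_AT_name` shadows the mangled symbol-table name, which matches the bare `LogStream` output shown earlier in the thread.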
Sorry, but `llvm-dwarfdump -a` crashes ... I'm trying to test some other arguments.
It should be enough to do `--debug-info`; you don't need `-a`. Or you could try gimli's dwarfdump example instead.
➜ shm nm /path/to/main | grep -w _ZN6google10LogMessage9LogStreamC1EPcil
0000000004e31530 t _ZN6google10LogMessage9LogStreamC1EPcil
➜ shm <pprof-debug-rs-demangling.sym ~/github/addr2line/target/debug/addr2line -f -e /path/to/main -a -i
0x0000000004e31530
LogStream
/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h:1263
➜ shm LANG=C.UTF-8 objdump -W /home/gdh1995/sync/walle3/.cache/68cce18e346283f1d454605f04e495f3/execroot/autocar/bazel-out/k8-opt/bin/common/tools/simulation/simulation_main | grep -B4 -A12 4e31530
DW_CFA_offset: r16 (rip) at cfa-8
DW_CFA_nop
DW_CFA_nop
007405f0 0000000000000034 00000024 FDE cie=007405d0 pc=0000000004e31530..0000000004e3165e
Augmentation data: 53 84 04 ff
DW_CFA_advance_loc: 1 to 0000000004e31531
DW_CFA_def_cfa_offset: 16
DW_CFA_offset: r6 (rbp) at cfa-16
DW_CFA_advance_loc: 3 to 0000000004e31534
DW_CFA_def_cfa_register: r6 (rbp)
DW_CFA_advance_loc: 13 to 0000000004e31541
DW_CFA_offset: r3 (rbx) at cfa-56
DW_CFA_offset: r12 (r12) at cfa-48
DW_CFA_offset: r13 (r13) at cfa-40
DW_CFA_offset: r14 (r14) at cfa-32
DW_CFA_offset: r15 (r15) at cfa-24
--
<c0> DW_AT_name : (indirect string, offset: 0x16a5cfe): basic_ios
<1><c4>: Abbrev Number: 2 (DW_TAG_subprogram)
<c5> DW_AT_name : (indirect string, offset: 0xeb54f): ~basic_ios
<1><c9>: Abbrev Number: 3 (DW_TAG_subprogram)
<ca> DW_AT_low_pc : 0x4e31530
<d2> DW_AT_high_pc : 0x12e
<d6> DW_AT_GNU_all_call_sites: 1
<d6> DW_AT_name : (indirect string, offset: 0xbced95): LogStream
<2><da>: Abbrev Number: 6 (DW_TAG_inlined_subroutine)
<db> DW_AT_abstract_origin: <0xbf>
<df> DW_AT_ranges : 0
<e3> DW_AT_call_file : 1
<e4> DW_AT_call_line : 1262
<e6> DW_AT_call_column : 5
<2><e7>: Abbrev Number: 4 (DW_TAG_inlined_subroutine)
<e8> DW_AT_abstract_origin: <0x2a>
<ec> DW_AT_low_pc : 0x4e3157a
--
<f4e2270> DW_AT_call_line : 680
<f4e2272> DW_AT_call_column : 7
<8><f4e2273>: Abbrev Number: 7 (DW_TAG_inlined_subroutine)
<f4e2274> DW_AT_abstract_origin: <0xf4dc54d>
<f4e2278> DW_AT_ranges : 0x4e31530
<f4e227c> DW_AT_call_file : 5
<f4e227d> DW_AT_call_line : 332
<f4e227f> DW_AT_call_column : 2
<9><f4e2280>: Abbrev Number: 5 (DW_TAG_inlined_subroutine)
<f4e2281> DW_AT_abstract_origin: <0xf4dc557>
<f4e2285> DW_AT_low_pc : 0
<f4e228d> DW_AT_high_pc : 0x8
<f4e2291> DW_AT_call_file : 5
<f4e2292> DW_AT_call_line : 351
<f4e2294> DW_AT_call_column : 4
<10><f4e2295>: Abbrev Number: 4 (DW_TAG_inlined_subroutine)
<f4e2296> DW_AT_abstract_origin: <0xf4dc552>
--
000175a0 0000000004e58dbb 0000000004e58dc2
000175a0 0000000004e58dcc 0000000004e58dd3
000175a0 <End of list>
000175d0 0000000000000001 0000000000000001 (start == end)
000175d0 0000000004e31530 0000000004e3165e
000175d0 0000000004e31660 0000000004e316f8
000175d0 0000000004e31700 0000000004e317a0
000175d0 0000000004e317a0 0000000004e318bb
000175d0 0000000004e318c0 0000000004e318e8
000175d0 0000000004e31900 0000000004e31980
000175d0 0000000004e31980 0000000004e31a92
000175d0 0000000004e31aa0 0000000004e31bb2
000175d0 0000000004e31bc0 0000000004e31cd2
000175d0 0000000004e31ce0 0000000004e31df2
000175d0 0000000004e31e00 0000000004e31f12
000175d0 0000000004e31f20 0000000004e32032
000175d0 0000000004e32040 0000000004e32152
--
04e314d0 <End of list>
04e31500 0000000000000001 0000000000000001 (start == end)
04e31500 0000000000000001 0000000000000001 (start == end)
04e31500 <End of list>
04e31530 0000000000000001 0000000000000001 (start == end)
04e31530 0000000000000001 0000000000000001 (start == end)
04e31530 <End of list>
04e31560 0000000000000001 0000000000000001 (start == end)
04e31560 0000000000000001 0000000000000001 (start == end)
04e31560 <End of list>
04e31590 0000000000000001 0000000000000001 (start == end)
04e31590 0000000000000001 0000000000000001 (start == end)
04e31590 <End of list>
04e315c0 0000000000000001 0000000000000001 (start == end)
04e315c0 0000000000000001 0000000000000001 (start == end)
04e315c0 <End of list>
04e315f0 0000000000000001 0000000000000001 (start == end)
04e315f0 0000000000000001 0000000000000001 (start == end)
04e315f0 <End of list>
--
[0x00000a8a] Special opcode 117: advance Address by 8 to 0xe9 and Line by 0 to 0
[0x00000a8b] Advance PC by 8 to 0xf1
[0x00000a8d] Extended opcode 1: End of Sequence
[0x00000a90] Extended opcode 2: set Address to 0x4e31530
[0x00000a9b] Advance Line by 1262 to 1263
[0x00000a9e] Copy
[0x00000a9f] Set column to 79
[0x00000aa1] Set prologue_end to true
[0x00000aa2] Advance PC by constant 17 to 0x4e31541
[0x00000aa3] Special opcode 187: advance Address by 13 to 0x4e3154e and Line by 0 to 1263
[0x00000aa4] Set File Name to entry 4 in the File Name Table
[0x00000aa6] Set column to 9
[0x00000aa8] Advance Line by -802 to 461
[0x00000aab] Special opcode 61: advance Address by 4 to 0x4e31552 and Line by 0 to 461
[0x00000aac] Set column to 21
[0x00000aae] Set is_stmt to 0
0x000000c9: DW_TAG_subprogram
DW_AT_low_pc (0x0000000004e31530)
DW_AT_high_pc (0x0000000004e3165e)
DW_AT_GNU_all_call_sites (true)
DW_AT_name ("LogStream")
0x000000da: DW_TAG_inlined_subroutine
DW_AT_abstract_origin (0x000000bf "basic_ios")
DW_AT_ranges (0x00000000
[0x0000000004e31552, 0x0000000004e3157a)
[0x0000000004e315ce, 0x0000000004e315d2))
DW_AT_call_file ("/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h")
DW_AT_call_line (1262)
DW_AT_call_column (0x05)
Thanks, that's very helpful. So the problem is that `DW_AT_linkage_name` doesn't exist. You could try deleting this code but I'm not sure if that is enough. It'll be a day or two before I can work on a proper fix.
There seems to be no `DW_AT_linkage_name`, while `DW_AT_name` does exist and is not enough.
➜ shm nm /path/to/main | grep 4e48cd0
0000000004e48cd0 W _ZNSt10_HashtableINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_SaIS5_ENSt8__detail9_IdentityESt8equal_toIS5_ESt4hashIS5_ENS7_18_Mod_range_hashingENS7_20_Default_ranged_hashENS7_20_Prime_rehash_policyENS7_17_Hashtable_traitsILb1ELb1ELb1EEEE9_M_insertIRKS5_NS7_10_AllocNodeISaINS7_10_Hash_nodeIS5_Lb1EEEEEEEESt4pairINS7_14_Node_iteratorIS5_Lb1ELb1EEEbEOT_RKT0_St17integral_constantIbLb1EEm
➜ shm nm -C /path/to/main | grep 4e48cd0
0000000004e48cd0 W std::pair<std::__detail::_Node_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > const&, std::integral_constant<bool, true>, unsigned long)
➜ shm LANG=C.UTF-8 objdump -W /home/gdh1995/sync/walle3/.cache/68cce18e346283f1d454605f04e495f3/execroot/autocar/bazel-out/k8-opt/bin/common/tools/simulation/simulation_main | grep -B4 -A12 4e48cd0
DW_CFA_nop
DW_CFA_nop
DW_CFA_nop
00742598 0000000000000034 00001fcc FDE cie=007405d0 pc=0000000004e48cd0..0000000004e48e16
Augmentation data: fb 83 04 ff
DW_CFA_advance_loc: 1 to 0000000004e48cd1
DW_CFA_def_cfa_offset: 16
DW_CFA_offset: r6 (rbp) at cfa-16
DW_CFA_advance_loc: 3 to 0000000004e48cd4
DW_CFA_def_cfa_register: r6 (rbp)
DW_CFA_advance_loc: 13 to 0000000004e48ce1
DW_CFA_offset: r3 (rbx) at cfa-56
DW_CFA_offset: r12 (r12) at cfa-48
DW_CFA_offset: r13 (r13) at cfa-40
DW_CFA_offset: r14 (r14) at cfa-32
DW_CFA_offset: r15 (r15) at cfa-24
--
<3288f> DW_AT_abstract_origin: <0x3351>
<32893> DW_AT_low_pc : 0x4e48cc5
<2><3289b>: Abbrev Number: 0
<1><3289c>: Abbrev Number: 3 (DW_TAG_subprogram)
<3289d> DW_AT_low_pc : 0x4e48cd0
<328a5> DW_AT_high_pc : 0x146
<328a9> DW_AT_GNU_all_call_sites: 1
<328a9> DW_AT_name : (indirect string, offset: 0x48b793): _M_insert<const std::basic_string<char> &, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::basic_string<char>, true> > > >
<2><328ad>: Abbrev Number: 7 (DW_TAG_inlined_subroutine)
<328ae> DW_AT_abstract_origin: <0x32a64>
<328b2> DW_AT_ranges : 0x11500
<328b6> DW_AT_call_file : 21
<328b7> DW_AT_call_line : 1845
<328b9> DW_AT_call_column : 29
<3><328ba>: Abbrev Number: 7 (DW_TAG_inlined_subroutine)
<328bb> DW_AT_abstract_origin: <0x2d85f>
<328bf> DW_AT_ranges : 0x11530
Thank you very much. Deleting those lines does help a lot:
➜ addr2line git:(master) cargo-1.76 build --bin addr2line --no-default-features --features bin && </wo/pprof-debug-rs-demangling.sym ~/github/addr2line/target/debug/addr2line -f -e /path/to/main -a -i -C
Finished dev [unoptimized + debuginfo] target(s) in 0.04s
0x0000000004e31530
demangle: "_ZN6google10LogMessage9LogStreamC1EPcil"
google::LogMessage::LogStream::LogStream(char*, int, long)
/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h:1263
0x0000000004e31559
demangle: "basic_ios"
basic_ios
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/basic_ios.h:461
demangle: "_ZN6google10LogMessage9LogStreamC1EPcil"
google::LogMessage::LogStream::LogStream(char*, int, long)
/proc/self/cwd/bazel-out/k8-opt/bin/external/glog/_virtual_includes/glog/glog/logging.h:1262
0x0000000004e48cd0
demangle: "_ZNSt10_HashtableINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_SaIS5_ENSt8__detail9_IdentityESt8equal_toIS5_ESt4hashIS5_ENS7_18_Mod_range_hashingENS7_20_Default_ranged_hashENS7_20_Prime_rehash_policyENS7_17_Hashtable_traitsILb1ELb1ELb1EEEE9_M_insertIRKS5_NS7_10_AllocNodeISaINS7_10_Hash_nodeIS5_Lb1EEEEEEEESt4pairINS7_14_Node_iteratorIS5_Lb1ELb1EEEbEOT_RKT0_St17integral_constantIbLb1EEm"
std::pair<std::__detail::_Node_iterator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true, true>, bool> std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > const&, std::integral_constant<bool, true>, unsigned long)
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/hashtable.h:1843
There's a few more instances of `DW_AT_name` in that file that could be deleted as well then.
So, deleting those lines will make `frames.function.name` be `None`, and then in `if opts.do_functions` it enters the branch of `let name = ctx.find_symbol(probe);`.
I'm not sure when to accept the `DW_AT_name` value and when not. It's all up to you.
Thanks again!
Probably what we need to do is change gimli to treat 0 as a tombstone address, so that it skips rows until the next valid address.
FYI: running `nm` on the buggy address in the 1st comment shows `000000000bdbb6d0 t arena_decay_deadline_init`, and it:
- comes from `jemalloc/src/arena.c`
- was compiled by `gcc` and linked by `lld` of `llvm-12`

But other functions of `jemalloc` don't cause this `addr2line` crash.
There's a few more instances of `DW_AT_name` in that file that could be deleted as well then.
- `DW_AT_name` items in `src/function.rs`, and then the output file gets much more different from the one `llvm-addr2line-18` outputs (a lot of long symbol names become bare function names).
- `fn name_entry<R>`, then the output is most similar, although there are still a few function names lacking namespace and argument names.

And the speed difference is:
code | speed (sec) |
---|---|
master + 0-as-tomb | 0.77 |
(above) + remove in Function::parse | 0.94 |
(above) + remove in InlinedFunction::parse | 0.99 |
(above) + remove in name_entry::parse | 1.37 |
Ignoring `DW_AT_name` completely is wrong, since it's normal to not have a linkage name for C functions.
To help figure out a solution, can you give more information on reproducing this? Which compiler and flags are you using? The DWARF should have a copy of this, such as:
0x0000000b: DW_TAG_compile_unit
DW_AT_producer ("GNU C++17 11.4.0 -mtune=generic -march=x86-64 -gdwarf-4 -fno-strict-aliasing -fPIC -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection -fcf-protection")
DW_AT_language (DW_LANG_C_plus_plus)
`clang-8` for the aarch64 platform, Nvidia Orin Drive system.
I'll try to make a minimal case in 1-2 days.
Sorry the info above is incomplete. In fact I met this problem on both aarch64 and x86-64, and on the x86-64 (ubuntu 20.04 docker), it is:
0x0010dbb4: DW_TAG_compile_unit
DW_AT_producer ("Ubuntu clang version 12.0.0-3ubuntu1~20.04.5")
DW_AT_language (DW_LANG_C_plus_plus_14)
A minimal case would be useful. I tried clang 12.0.0 on godbolt but it includes the linkage name: https://godbolt.org/z/M1oWrhn7b
Hello, here's a minimal case:
// a.cc
struct Foo {
Foo();
int a;
};
Foo::Foo() { a = 1; }
struct Bar : public Foo {
Bar();
};
Bar::Bar() {}
➜ t clang-18 -g1 -O2 -shared a.cc -o a.out
➜ t ~/github/addr2line/target/release/addr2line -afiC -e a.out <<< `nm a.out | grep -m1 Bar | awk '{print $1}'`
0x0000000000001110
Foo
/wo/t/a.cc:6
Bar
/wo/t/a.cc:12
➜ t clang-18 -g1 -O1 -shared a.cc -o a.out
➜ t ~/github/addr2line/target/release/addr2line -afiC -e a.out <<< `nm a.out | grep -m1 Bar | awk '{print $1}'`
0x0000000000001110
Foo
/wo/t/a.cc:6
Bar
/wo/t/a.cc:12
➜ t clang-18 -g1 -O0 -shared a.cc -o a.out
➜ t ~/github/addr2line/target/release/addr2line -afiC -e a.out <<< `nm a.out | grep -m1 Bar | awk '{print $1}'`
0x0000000000001130
Bar::Bar()
/wo/t/a.cc:12
`addr2line` is built by `cargo-1.76 build --release --bin addr2line --no-default-features --features bin`.
2024-08-06 15:23 48f4734 origin/master, origin/HEAD Grouped bar chart graph (#325)
➜ t clang-18 --version
Ubuntu clang version 18.1.3 (1ubuntu1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
➜ t cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
...
Thanks, I can reproduce that now. The significant factor is the `-g1` option to clang. It works fine with `-g`. In my opinion this is a clang bug (gcc outputs the linkage name for this). For example, this is the `llvm-addr2line` output with `-g1`:
0x1110
Foo
/tmp/a.cc:6
Bar::Bar()
/tmp/a.cc:12
but if you compile with `-g` then the `llvm-addr2line` output is:
0x1110
Foo::Foo()
/tmp/a.cc:6
Bar::Bar()
/tmp/a.cc:12
Notice how `llvm-addr2line` can't get the right name for `Foo::Foo()` with `-g1`. It can't because clang isn't including it in the DWARF.
I'll think about whether there's a good way to fix this, but I think you should switch to compiling with the correct DWARF information.
You said wrong ... it's normal to not have a linkage name for C functions.
Then I wonder why it's "wrong" - C function names are simple enough and those in symbol tables should have been enough - am I right?
The symbol table does not have inlined functions, so we need either `DW_AT_name` or `DW_AT_linkage_name` for those cases. C functions don't need `DW_AT_linkage_name`, so we must use `DW_AT_name`.
Additionally, I still think that if the DWARF is correct, then the names it gives are better than the symbol table (mentioned in #324). The problem is that clang is generating DWARF that is not correct.
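To make the point about inlined functions concrete, here is a minimal sketch of a symbol-table lookup by address. The `Symbol` type and `find_symbol` function are hypothetical (this is not the crate's `find_symbol`). The first entry uses the address, size, and mangled name shown earlier in the thread; the second entry's name is made up.

```rust
/// One function symbol from the symbol table (hypothetical, simplified type).
struct Symbol {
    address: u64,
    size: u64,
    name: &'static str,
}

/// Finds the symbol whose [address, address + size) range contains `probe`.
/// `symbols` must be sorted by address.
fn find_symbol(symbols: &[Symbol], probe: u64) -> Option<&Symbol> {
    // Index of the first symbol that starts after `probe`.
    let idx = symbols.partition_point(|s| s.address <= probe);
    // The only candidate is the symbol just before that index, if any.
    let sym = symbols.get(idx.checked_sub(1)?)?;
    (probe - sym.address < sym.size).then_some(sym)
}

fn main() {
    let symbols = [
        Symbol {
            address: 0x4e31530,
            size: 0x12e,
            name: "_ZN6google10LogMessage9LogStreamC1EPcil",
        },
        // Made-up name for the next range in the thread's dump.
        Symbol { address: 0x4e31660, size: 0x98, name: "hypothetical_next_function" },
    ];
    for probe in [0x4e31530u64, 0x4e31559, 0x4e3165e] {
        match find_symbol(&symbols, probe) {
            Some(sym) => println!("{probe:#x} -> {}", sym.name),
            None => println!("{probe:#x} -> no symbol"),
        }
    }
}
```

The address `0x4e31559` lies inside the inlined `basic_ios` code, yet the only answer the symbol table can give is the enclosing `LogStream` constructor. The inlined frame's name has to come from DWARF.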
What about querying all of `AT_name`, `linkage_name` and also the symbol table, and then picking the longest name?
- `clang` tends to generate "somehow shorter" DWARF info, which may be obviously different from `GCC`.
- From `clang-12` to `clang-18` this behavior has stayed unchanged; the conclusion is that many clang-compiler developers don't think this is buggy.

It's fine if `clang` generates shorter DWARF, but the DWARF it generates is missing information. That is a bug. Specifically, it is missing the name of the inlined function `Foo::Foo()`.
I don't see any opinions in https://github.com/libunwind/libunwind/pull/794 about shorter DWARF. Is that the correct link?
Um, the link is about:
- `add x29, sp, int_offset` and before `ldp x29, x30, [sp, int_offset]`
- `gperftools -> libunwind` may fail to get correct next frames; instead it just interprets dirty data as function addresses
- `libunwind` may crash because of segment errors.

This occurs on the clang + aarch64 platform, while on x86-64 the clang-generated DWARF FDE is complete.
Ah sorry, I misread and was expecting someone from the LLVM project to be giving an opinion that this was intentional. It could easily be an oversight instead. I think `-g1` is less commonly used.
What about querying all of AT_name, linkage_name and also symbol table, and then picking up a longest name?
That's not reliable behaviour, and doesn't address my concern.
The simplest fix is to do what `llvm-addr2line` does, which is to give preference to the symbol table. I don't see any way to do better than that for the DWARF generated by `clang -g1`. The problem is that in other rare cases (discontiguous function ranges), that gives a worse result for the DWARF generated by `clang -g` and `gcc`.
I will look into this more when I have time, but I don't consider it a high priority (since the result with `clang -g1` will still be wrong for inlined functions), and it will take considerable time to test this properly to see how much it affects cases other than `clang -g1`.
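A rough sketch of the "give preference to the symbol table" policy, under stated assumptions: the `Frame` type and `names_symbol_first` are hypothetical, and `_ZN3BarC1Ev` is assumed to be the mangled name that `nm` reports for `Bar::Bar()` in the minimal case. The symbol table can only name the outermost (non-inlined) frame, so inlined frames still rely on DWARF.

```rust
/// One frame reported for an address (hypothetical, simplified type).
#[derive(Debug, Clone, Copy)]
struct Frame<'a> {
    /// Name found in DWARF (linkage name if present, else DW_AT_name).
    dwarf_name: Option<&'a str>,
    /// True for inlined frames, false for the outermost function.
    inlined: bool,
}

/// Symbol-first policy: the outermost frame takes the symbol-table name when
/// there is one; inlined frames can only use what DWARF provides.
fn names_symbol_first<'a>(
    frames: &[Frame<'a>],
    symbol: Option<&'a str>,
) -> Vec<Option<&'a str>> {
    frames
        .iter()
        .map(|f| {
            if f.inlined {
                f.dwarf_name
            } else {
                symbol.or(f.dwarf_name)
            }
        })
        .collect()
}

fn main() {
    // Frames as in the `clang -g1` minimal case: Foo's constructor is inlined
    // into Bar's constructor, and DWARF only has the short names.
    let frames = [
        Frame { dwarf_name: Some("Foo"), inlined: true },
        Frame { dwarf_name: Some("Bar"), inlined: false },
    ];
    // Assumed mangled name for Bar::Bar().
    let names = names_symbol_first(&frames, Some("_ZN3BarC1Ev"));
    println!("{names:?}");
}
```

Under this policy the outer frame gets a demanglable name while the inlined `Foo` frame stays short, which mirrors the `llvm-addr2line` output with `-g1` quoted above.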
The problem is that in other rare cases (discontiguous function ranges)
Oh I see.
Anyway, the current version has been enough to work with `gperftools` and `libunwind` to draw a function-call flame graph.
Thanks a lot for this project and your patience ^_^.
However, when I ran it with a non-stripped x86-64 ELF file (from clang-12), it crashed with InvalidAddressRange, and the below code always reported Failed to get the next row: InvalidAddressRange The end of an address range must not be before the beginning..
I still haven't managed to reproduce this. I'm sure that treating this as a tombstone is the right thing to do, but it would still be nice to test it in practice. The conditions to be able to reproduce this are:
- Multiple `DW_LNE_set_address` instructions in a single sequence. I can't get clang or gcc to do this. Using `-ffunction-sections` gives multiple `DW_LNE_set_address` instructions, but they are in different sequences.
- `-ffunction-sections -Wl,--gc-sections`, but that's not enough due to the previous condition.

// a.c
struct foo_st {
int member;
};
void handle_bar(void *bar);
void foo_method1(struct foo_st *foo) {
handle_bar(&foo->member);
}
void foo_method2() {
}
And tests:
gcc-9 -g -O3 -c -o a.o a.c
# or: gcc-9 -g -O2 -c -o a.o a.c
/home/gdh1995/github/addr2line/target/debug/addr2line -e a.o <<< 0x10
# or: /home/gdh1995/github/addr2line/target/debug/addr2line -e a.o <<< 0x0
And `gcc-9 (Ubuntu 9.5.0-6ubuntu2) 9.5.0` on Ubuntu 24.04 LTS.
Thanks for your continued assistance with this issue.
I'm not sure that is the same problem that was first reported in this issue. That problem is because this crate doesn't handle relocations in object files. If I link that object file into an executable then it works fine:
// main.c
struct foo_st;
void foo_method1(struct foo_st *foo);
void handle_bar(void *bar) {}
int main() {
foo_method1(0);
return 0;
}
gcc-9 a.o main.c -o main
target/debug/addr2line -e main <<< `nm main | grep foo_method2 | cut -d ' ' -f1`
Was your original report for an executable file or a relocatable object file? I can fix it to work for relocatable object files by processing relocations, but I doubt that is what you need, and I doubt it will fix the original issue.
Um, the original code is from `jemalloc/src/arena.o`, and it's in `libjemalloc.a` and then statically linked into my target executable file.
The original address is captured by `gperftools -> libunwind` and it's in `arena_bin_lower_slab.isra`, and when I tried ``addr2line -e my-exe.out <<<`nm my-exe.out | grep arena_bin_lower_slab | awk '{print $1}'` `` it also failed (the input address is the first instruction of the function).
If I replaced the check with `|| == 0`, then it showed:
0x000000000bfe6208
extent_sn_get
/xxx/.cache/b815fe69052cafbe84b6b4f1f30da8c2/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/extent_inlines.h:84
extent_sn_comp
/xxx/.cache/b815fe69052cafbe84b6b4f1f30da8c2/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/extent_inlines.h:446
extent_snad_comp
/xxx/.cache/b815fe69052cafbe84b6b4f1f30da8c2/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/extent_inlines.h:479
arena_bin_lower_slab
/xxx/.cache/b815fe69052cafbe84b6b4f1f30da8c2/execroot/xxx/external/jemalloc-local/src/arena.c:1669
arena_bin_lower_slab.isra
That `.isra` suffix might be a clue for reproducing this. It sounds like `-fipa-sra`, which is the sort of thing that might leave a tombstone behind. That gives me something to work on.
It seems it's not only `.isra` functions.
# llvm-nm-8 is much faster than binutils `nm` and even `llvm-nm-12`
IFS=$'\n' a=(`llvm-nm-8 bazel-bin/xxx_main | grep -P '\barena_' | sort`); unset IFS
for i in $a; do
echo '>>> '$i
target/release/addr2line -e bazel-bin/xxx_main <<<${i%% *} 2>&1 | grep -P 'InvalidAddressRange|/xxx'
done
>>> 000000000b64ded0 T arena_initslow
>>> 000000000bde9c90 t arena_dalloc_large_no_tcache.isra.0
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/arena_inlines_b.h:232
>>> 000000000bde9e80 t arena_dalloc_no_tcache
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/arena_inlines_b.h:242
>>> 000000000bdee740 t arena_choose_impl.constprop.0
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/jemalloc_internal_inlines_b.h:8
>>> 000000000bdf7660 t arena_decay_deadline_init
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf7780 t arena_dalloc_junk_small_impl
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf77b0 t arena_bin_lower_slab.isra.0
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf78a0 t arena_decay_to_limit
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf7c10 t arena_maybe_decay
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf80a0 t arena_decay_ms_set.part.0
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf81d0 t arena_decay_impl
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf8370 t arena_slab_dalloc
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf8460 t arena_bin_malloc_hard
called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdff170 t arena_decay_compute_purge_interval_impl.part.0
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/background_thread.c:116
>>> 000000000be0a110 t arena_i_dirty_decay_ms_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2358
>>> 000000000be0a840 t arena_i_muzzy_decay_ms_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2365
>>> 000000000be0acd0 t arena_i_index
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2464
>>> 000000000be0ae30 t arena_i_retain_grow_limit_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2428
>>> 000000000be10860 t arena_i_decay
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2056
>>> 000000000be10ce0 t arena_i_purge_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2120
>>> 000000000be10d30 t arena_i_decay_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2104
>>> 000000000be11290 t arena_i_dss_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2249
>>> 000000000be11bb0 t arena_i_extent_hooks_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2372
>>> 000000000be24fa0 t arena_i_initialized_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2035
>>> 000000000be28ef0 t arena_i_reset_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2188
>>> 000000000be2a590 t arena_i_destroy_ctl
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/src/ctl.c:2208
>>> 000000000be3ad40 t arena_dalloc_no_tcache
/xxx/.cache/68cce18e346283f1d454605f04e495f3/execroot/xxx/external/jemalloc-local/include/jemalloc/internal/arena_inlines_b.h:242
>>> 000000000be85d60 d arena_node
>>> 000000000be85dc0 d arena_i_node
>>> 000000000c141c40 b arena_binind_div_info
It seems to be about `arena`: all the `InvalidAddressRange` lines below can be parsed by the version with `|| == 0`, and they show `external/jemalloc-local/src/arena.c:***`
>>> 000000000bdec260 t je_malloc_initialized => external/jemalloc-local/src/jemalloc.c:209
>>> 000000000bdec270 t je_a0dalloc => external/jemalloc-local/src/jemalloc.c:255
>>> 000000000bdec280 t je_bootstrap_free => external/jemalloc-local/src/jemalloc.c:288
>>> 000000000bdec2a0 t je_arena_set => external/jemalloc-local/src/jemalloc.c:297
>>> 000000000bdec2c0 t je_narenas_total_get => external/jemalloc-local/src/jemalloc.c:312
>>> 000000000bdec2d0 t je_arena_init => external/jemalloc-local/src/jemalloc.c:364
>>> 000000000bdecc70 t je_jemalloc_prefork => external/jemalloc-local/src/jemalloc.c:3786
>>> 000000000bdecdf0 t je_jemalloc_postfork_parent => external/jemalloc-local/src/jemalloc.c:3860
>>> 000000000bded020 t je_jemalloc_postfork_child => external/jemalloc-local/src/jemalloc.c:3894
>>> 000000000bded920 t je_a0malloc => external/jemalloc-local/src/jemalloc.c:250
>>> 000000000bded930 t je_bootstrap_malloc => external/jemalloc-local/src/jemalloc.c:266
>>> 000000000bded950 t je_bootstrap_calloc => external/jemalloc-local/src/jemalloc.c:275
>>> 000000000bded970 t je_arena_migrate => external/jemalloc-local/src/jemalloc.c:397
>>> 000000000bded9b0 t je_arena_tdata_get_hard => external/jemalloc-local/src/jemalloc.c:422
>>> 000000000bdee0f0 t je_arena_choose_hard => external/jemalloc-local/src/jemalloc.c:499
>>> 000000000bdee910 t je_iarena_cleanup => external/jemalloc-local/src/jemalloc.c:613
>>> 000000000bdee960 t je_arena_cleanup => external/jemalloc-local/src/jemalloc.c:623
>>> 000000000bdee9b0 t je_arenas_tdata_cleanup => external/jemalloc-local/src/jemalloc.c:633
>>> 000000000bdee9f0 t je_malloc_default => external/jemalloc-local/src/jemalloc.c:2271
>>> 000000000bdef870 T __je_malloc => external/jemalloc-local/src/jemalloc.c:2323
>>> 000000000bdef940 T __je_posix_memalign => external/jemalloc-local/src/jemalloc.c:2393
>>> 000000000bdefdd0 T __je_aligned_alloc => external/jemalloc-local/src/jemalloc.c:2433
>>> 000000000bdf0250 T __je_calloc => external/jemalloc-local/src/jemalloc.c:2474
>>> 000000000bdf1050 T __je_realloc => external/jemalloc-local/src/jemalloc.c:2653
>>> 000000000bdf22b0 t je_free_default => external/jemalloc-local/src/jemalloc.c:2771
>>> 000000000bdf2b40 T __je_free => external/jemalloc-local/src/jemalloc.c:2863
>>> 000000000bdf2c10 T __je_memalign => external/jemalloc-local/src/jemalloc.c:2885
>>> 000000000bdf3050 T __je_valloc => external/jemalloc-local/src/jemalloc.c:2924
>>> 000000000bdf32b0 T __je_mallocx => external/jemalloc-local/src/jemalloc.c:3098
>>> 000000000bdf4910 T __je_rallocx => external/jemalloc-local/src/jemalloc.c:3216
>>> 000000000bdf5530 T __je_xallocx => external/jemalloc-local/src/jemalloc.c:3390
>>> 000000000bdf5850 T __je_sallocx => external/jemalloc-local/src/jemalloc.c:3461
>>> 000000000bdf5a80 T __je_dallocx => external/jemalloc-local/src/jemalloc.c:3487
>>> 000000000bdf6330 t je_sdallocx_default => external/jemalloc-local/src/jemalloc.c:3548
>>> 000000000bdf6de0 T __je_sdallocx => external/jemalloc-local/src/jemalloc.c:3594
>>> 000000000bdf6eb0 t je_je_sdallocx_noflags => external/jemalloc-local/src/jemalloc.c:3606
>>> 000000000bdf6f80 T __je_nallocx => external/jemalloc-local/src/jemalloc.c:3619
>>> 000000000bdf71b0 T __je_mallctl => external/jemalloc-local/src/jemalloc.c:3646
>>> 000000000bdf7260 T __je_mallctlnametomib => external/jemalloc-local/src/jemalloc.c:3667
>>> 000000000bdf72f0 T __je_mallctlbymib => external/jemalloc-local/src/jemalloc.c:3688
>>> 000000000bdf7390 T __je_malloc_stats_print => external/jemalloc-local/src/jemalloc.c:3709
>>> 000000000bdf73f0 T __je_malloc_usable_size => external/jemalloc-local/src/jemalloc.c:3722
>>> 000000000bdf8aa0 t je_arena_basic_stats_merge => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf8b20 t je_arena_stats_merge => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf9aa0 t je_arena_extents_dirty_dalloc => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf9b60 t je_arena_extent_alloc_large => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf9ea0 t je_arena_extent_dalloc_large_prep => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdf9f70 t je_arena_extent_ralloc_large_shrink => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa0d0 t je_arena_extent_ralloc_large_expand => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa230 t je_arena_dirty_decay_ms_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa240 t je_arena_muzzy_decay_ms_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa250 t je_arena_dirty_decay_ms_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa2a0 t je_arena_muzzy_decay_ms_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa2f0 t je_arena_decay => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfa4d0 t je_arena_reset => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfab30 t je_arena_destroy => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfabe0 t je_arena_bin_choose_lock => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfac70 t je_arena_tcache_fill_small => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfb260 t je_arena_alloc_junk_small => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfb280 t je_arena_dalloc_bin_junked_locked => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfb4c0 t je_arena_dalloc_small => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfbac0 t je_arena_ralloc_no_move => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc050 t je_arena_dss_prec_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc060 t je_arena_dss_prec_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc070 t je_arena_dirty_decay_ms_default_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc080 t je_arena_dirty_decay_ms_default_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc0c0 t je_arena_muzzy_decay_ms_default_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc0d0 t je_arena_muzzy_decay_ms_default_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc110 t je_arena_retain_grow_limit_get_set => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc260 t je_arena_nthreads_get => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc270 t je_arena_nthreads_inc => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc280 t je_arena_nthreads_dec => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc290 t je_arena_extent_sn_next => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfc2b0 t je_arena_new => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfcab0 t je_arena_choose_huge => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfcb80 t je_arena_malloc_hard => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfd180 t je_arena_palloc => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfd610 t je_arena_ralloc => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfeb40 t je_arena_init_huge => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfebb0 t je_arena_is_huge => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfebd0 t je_arena_boot => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfed40 t je_arena_prefork0 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfed80 t je_arena_prefork1 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfed90 t je_arena_prefork2 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfeda0 t je_arena_prefork3 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfede0 t je_arena_prefork4 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfedf0 t je_arena_prefork5 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfee00 t je_arena_prefork6 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfee10 t je_arena_prefork7 => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfee80 t je_arena_postfork_parent => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000bdfef90 t je_arena_postfork_child => called `Result::unwrap()` on an `Err` value: InvalidAddressRange
>>> 000000000be014c0 t je_pthread_create_wrapper => external/jemalloc-local/src/background_thread.c:45
>>> 000000000be014d0 t je_background_thread_create => external/jemalloc-local/src/background_thread.c:593
>>> 000000000be01560 t je_background_threads_enable => external/jemalloc-local/src/background_thread.c:605
>>> 000000000be01c10 t je_background_threads_disable => external/jemalloc-local/src/background_thread.c:641
>>> 000000000be01d80 t je_background_thread_interval_check => external/jemalloc-local/src/background_thread.c:658
Having a bunch of failures grouped like that is expected because a single tombstone mid sequence will cause the line program for the entire compilation unit to be ignored (in this case arena.c). It may be possible to figure out which function the tombstone is for, but I think you'd have to do it by comparing the line program in the executable with the line program in arena.o, and I'm not sure if knowing that will help me reproduce it anyway.
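To illustrate why one tombstone can take out a whole compilation unit, here is a minimal, self-contained sketch. It is not gimli's code: the Op enum, LineError, run_line_program, and sample_ops are hypothetical names made up for this example, and the model only tracks addresses and line numbers. It assumes the linker rewrote the address of a removed function to 0, as discussed earlier in the thread.

```rust
/// Hypothetical, heavily simplified line-program operations.
#[derive(Debug, Clone, Copy)]
enum Op {
    /// Start a run of rows at an absolute address (similar in spirit to DW_LNE_set_address).
    SetAddress(u64),
    /// Advance the current address and emit a row for the given source line.
    Emit { advance: u64, line: u32 },
}

#[derive(Debug, PartialEq)]
enum LineError {
    InvalidAddressRange,
}

/// Runs the simplified program and returns the emitted (address, line) rows.
/// With `zero_is_tombstone == false`, a row whose address goes backwards is an error.
/// With `zero_is_tombstone == true`, rows after `SetAddress(0)` are skipped
/// until the next `SetAddress` with a non-zero address.
fn run_line_program(ops: &[Op], zero_is_tombstone: bool) -> Result<Vec<(u64, u32)>, LineError> {
    let mut rows = Vec::new();
    let mut address = 0u64;
    let mut tombstone = false;
    let mut last_emitted: Option<u64> = None;

    for op in ops {
        match *op {
            Op::SetAddress(a) => {
                address = a;
                tombstone = zero_is_tombstone && a == 0;
            }
            Op::Emit { advance, line } => {
                address += advance;
                if tombstone {
                    // Row belongs to a removed function: drop it silently.
                    continue;
                }
                if let Some(prev) = last_emitted {
                    if address < prev {
                        // The end of a range must not be before its beginning.
                        return Err(LineError::InvalidAddressRange);
                    }
                }
                last_emitted = Some(address);
                rows.push((address, line));
            }
        }
    }
    Ok(rows)
}

/// Three functions; the middle one was removed and its address rewritten to 0.
fn sample_ops() -> Vec<Op> {
    vec![
        Op::SetAddress(0x1000),
        Op::Emit { advance: 0, line: 10 },
        Op::Emit { advance: 4, line: 11 },
        Op::SetAddress(0),
        Op::Emit { advance: 0, line: 20 },
        Op::Emit { advance: 4, line: 21 },
        Op::SetAddress(0x2000),
        Op::Emit { advance: 0, line: 30 },
    ]
}
```

Calling run_line_program(&sample_ops(), false) fails for the whole program, so the two healthy functions lose their line info too, while passing true keeps their rows and drops only the tombstoned ones.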
I've looked into the isra a bit more and I don't think it's relevant. I get the same symbol if I compile jemalloc but I don't see any tombstones in programs linked with it.
Can you tell me which linker is being used?
clang-12 and ld.lld-12
Hello, I'm searching for a faster replacement for the Binutils addr2line, and this project is incredibly fast (about 1 sec vs. 12 min) for my program. Many thanks!
Crash details
However, when I ran it on a non-stripped x86-64 ELF file (from clang-12), it crashed with InvalidAddressRange, and the code below always reported "Failed to get the next row: InvalidAddressRange The end of an address range must not be before the beginning."
Then I cloned gimli, compiled it with debugging code and got "test: addr 2: 0 vs. 4".
BTW, this addr2line worked well when given the same input address (0x000000000bdbb6d0) and an x86_64-linux-gnu-strip -s -ed ELF file.
Expected
I want to use this tool to decode tons of addresses which are recorded by gperftools CPU profiling and passed to addr2line by the pprof Perl script.
So, might this project add an option to ignore such DWARF errors and continue to parse and output the other input addresses?
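One way the requested behaviour could look, as a rough sketch only: handle each lookup's Result individually instead of calling unwrap(), and print a placeholder for addresses that fail. The names symbolize_all and demo_lookup are hypothetical and are not part of addr2line or gimli; demo_lookup just stands in for a real DWARF lookup, using two addresses from the log above. The "??:0" placeholder is an assumption that mirrors what Binutils addr2line prints for unknown locations.

```rust
use std::fmt::Display;

/// Resolves every address, turning lookup errors into a placeholder line
/// instead of aborting the whole run.
fn symbolize_all<F, E>(addresses: &[u64], lookup: F) -> Vec<String>
where
    F: Fn(u64) -> Result<Option<String>, E>,
    E: Display,
{
    addresses
        .iter()
        .map(|&addr| match lookup(addr) {
            Ok(Some(location)) => location,
            // No debug info covers this address.
            Ok(None) => "??:0".to_string(),
            // Debug info is damaged here: warn on stderr and keep going.
            Err(err) => {
                eprintln!("warning: {:#x}: {}", addr, err);
                "??:0".to_string()
            }
        })
        .collect()
}

/// Hypothetical stand-in for a real lookup, for demonstration only.
fn demo_lookup(addr: u64) -> Result<Option<String>, String> {
    match addr {
        0xbdef870 => Ok(Some("external/jemalloc-local/src/jemalloc.c:2323".to_string())),
        0xbdf8aa0 => Err("InvalidAddressRange".to_string()),
        _ => Ok(None),
    }
}
```

With this shape, symbolize_all(&[0xbdef870, 0xbdf8aa0], demo_lookup) returns one line per input address, so a script such as pprof still gets output aligned with its input even when some lookups fail.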