Open oltolm opened 2 days ago
@llvm/issue-subscribers-tools-llvm-dwarfdump
Author: oltolm (oltolm)
A similar issue was reported in #56342.
Not sure I follow - the raw verbose output is just as wrong as the non-verbose output. The result after the => is the computed result, which lacks the address because the address pool isn't accessible.
It's the nature of DWARF - we can't print .debug_rnglists correctly when examined in isolation, because we don't know which address pool (.debug_addr) to use, without parsing all the .debug_info contributions (at least their first DIEs).
The way llvm-dwarfdump handles this is that if you dump .debug_info as well, it'll record that and use it for dumping other sections. If you don't, we don't, and we print things out as though every address in the address pool was 0 (this looks roughly like what happens when you dump an intermediate object file, since the addresses aren't resolved at that point).
eg:
$ clang++-tot test.cpp -g -ffunction-sections && llvm-dwarfdump-tot a.out --debug-rnglists -v -debug-info
a.out: file format elf64-x86-64
.debug_info contents:
0x00000000: Compile Unit: length = 0x00000053, format = DWARF32, version = 0x0005, unit_type = DW_UT_compile, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000057)
0x0000000c: DW_TAG_compile_unit [1] *
DW_AT_producer [DW_FORM_strx1] (indexed (00000000) string = "clang version 20.0.0git (git@github.com:llvm/llvm-project.git 8dd9f206b518a97132f3e2489ccc93704e638353)")
DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus_14)
DW_AT_name [DW_FORM_strx1] (indexed (00000001) string = "test.cpp")
DW_AT_str_offsets_base [DW_FORM_sec_offset] (0x00000008)
DW_AT_stmt_list [DW_FORM_sec_offset] (0x00000000)
DW_AT_comp_dir [DW_FORM_strx1] (indexed (00000002) string = "/usr/local/google/home/blaikie/dev/scratch")
DW_AT_low_pc [DW_FORM_addr] (0x0000000000000000)
DW_AT_ranges [DW_FORM_rnglistx] (indexed (0x0) rangelist = 0x00000010
[0x0000000000001130, 0x0000000000001136)
[0x0000000000001140, 0x0000000000001146)
[0x0000000000001150, 0x0000000000001158))
DW_AT_addr_base [DW_FORM_sec_offset] (0x00000008)
DW_AT_rnglists_base [DW_FORM_sec_offset] (0x0000000c)
...
.debug_rnglists contents:
0x00000000: range list header: length = 0x00000016, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000001
offsets: [
0x00000004 => 0x00000010
]
ranges:
0x00000010: [DW_RLE_startx_length]: 0x0000000000000000, 0x0000000000000006 => [0x0000000000001130, 0x0000000000001136)
0x00000013: [DW_RLE_startx_length]: 0x0000000000000001, 0x0000000000000006 => [0x0000000000001140, 0x0000000000001146)
0x00000016: [DW_RLE_startx_length]: 0x0000000000000002, 0x0000000000000008 => [0x0000000000001150, 0x0000000000001158)
0x00000019: [DW_RLE_end_of_list ]
$ clang++-tot test.cpp -g -ffunction-sections && llvm-dwarfdump-tot a.out --debug-rnglists -v
a.out: file format elf64-x86-64
.debug_rnglists contents:
0x00000000: range list header: length = 0x00000016, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000001
offsets: [
0x00000004 => 0x00000010
]
ranges:
0x00000010: [DW_RLE_startx_length]: 0x0000000000000000, 0x0000000000000006 => [0x0000000000000000, 0x0000000000000006)
0x00000013: [DW_RLE_startx_length]: 0x0000000000000001, 0x0000000000000006 => [0x0000000000000000, 0x0000000000000006)
0x00000016: [DW_RLE_startx_length]: 0x0000000000000002, 0x0000000000000008 => [0x0000000000000000, 0x0000000000000008)
0x00000019: [DW_RLE_end_of_list ]
I'd be open to some documentation patch; perhaps some part of the output could clarify that it's not accurate/the address index is unresolved. (It could print out the address ranges as addrx[0]+0x0, addrx[0]+0x6, etc., but that might be a bit tedious/repetitive.)
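The suggestion above (printing unresolved ranges relative to their address index) could look roughly like the following. This is only a minimal sketch, not llvm-dwarfdump code: `StartxLength`, `formatRange`, and the helper names are hypothetical, and the address pool is modelled as an optional vector that is empty when .debug_info was not parsed.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one DW_RLE_startx_length entry.
struct StartxLength {
  uint64_t Index;  // index into the address pool (.debug_addr)
  uint64_t Length; // length of the range in bytes
};

// Zero-padded 64-bit hex, matching the style of the dumps in this thread.
static std::string paddedHex(uint64_t V) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%016llx",
                static_cast<unsigned long long>(V));
  return Buf;
}

// Unpadded hex, used for the symbolic (unresolved) form.
static std::string shortHex(uint64_t V) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%llx",
                static_cast<unsigned long long>(V));
  return Buf;
}

// Pool is std::nullopt when the address pool is not accessible
// (assumption: this models "the CU was not parsed").
std::string formatRange(const StartxLength &E,
                        const std::optional<std::vector<uint64_t>> &Pool) {
  if (Pool && E.Index < Pool->size()) {
    const uint64_t Start = (*Pool)[E.Index];
    return "[" + paddedHex(Start) + ", " + paddedHex(Start + E.Length) + ")";
  }
  // Unresolved: keep the index visible instead of pretending it is 0.
  const std::string Sym = "addrx[" + std::to_string(E.Index) + "]";
  return "[" + Sym + "+0x0, " + Sym + "+" + shortHex(E.Length) + ")";
}
```

With the pool from the first dump (0x1130, 0x1140, 0x1150), entry 0 with length 6 formats as the resolved range; without a pool it formats as addrx[0]+0x0, addrx[0]+0x6.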
I don't understand why you call the raw output wrong. At least I can use it to manually calculate the result.
I think your example only works by accident. There are multiple problems here:
case dwarf::DW_RLE_base_addressx: {
  if (auto SA = LookupPooledAddress(Value0))
    CurrentBase = SA->Address;
  else
    CurrentBase = Value0;
  if (!DumpOpts.Verbose)
    return;
  DWARFFormValue::dumpAddress(OS << ' ', AddrSize, Value0);
  break;
}
If it cannot look up the address, it uses the index into .debug_addr as the address base. I think this is wrong, but more importantly LookupPooledAddress looks up the address in the first CU:
auto LookupPooledAddress =
    [&](uint32_t Index) -> std::optional<SectionedAddress> {
  const auto &CUs = compile_units();
  auto I = CUs.begin();
  if (I == CUs.end())
    return std::nullopt;
  return (*I)->getAddrOffsetSectionItem(Index);
};
but there is no guarantee that it belongs to the first CU. In my example the first CU does not have an address base, but even if it did, it could be the wrong CU.
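One way to pick the owning unit instead of the first one is to match each range list table against the units' DW_AT_rnglists_base. The sketch below is a simplified model and not LLVM's API: `UnitInfo` and `findUnitForTable` are hypothetical names, and it assumes DWARF32, where the table header is 12 bytes and DW_AT_rnglists_base points just past it (as in the dump above, where the header is at 0x0 and the base is 0xc).

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified view of the attributes a dumper would need
// from each unit's first DIE.
struct UnitInfo {
  std::optional<uint64_t> AddrBase;     // DW_AT_addr_base, if present
  std::optional<uint64_t> RnglistsBase; // DW_AT_rnglists_base, if present
};

// DWARF32 .debug_rnglists header: unit_length (4) + version (2) +
// address_size (1) + segment_selector_size (1) + offset_entry_count (4).
constexpr uint64_t Dwarf32RnglistsHeaderSize = 12;

// Returns the unit whose DW_AT_rnglists_base refers to the table whose
// header starts at TableOffset, or nullptr if no unit claims that table.
const UnitInfo *findUnitForTable(const std::vector<UnitInfo> &Units,
                                 uint64_t TableOffset) {
  const uint64_t ExpectedBase = TableOffset + Dwarf32RnglistsHeaderSize;
  for (const UnitInfo &U : Units)
    if (U.RnglistsBase && *U.RnglistsBase == ExpectedBase)
      return &U;
  return nullptr;
}
```

A caller would then use the returned unit's address base for pooled-address lookups, and fall back to printing unresolved indexes when the result is nullptr or the unit has no DW_AT_addr_base.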
Another problem:
case dwarf::DW_RLE_offset_pair:
  PrintRawEntry(OS, *this, AddrSize, DumpOpts);
  if (CurrentBase != Tombstone)
    DWARFAddressRange(Value0 + CurrentBase, Value1 + CurrentBase)
        .dump(OS, AddrSize, DumpOpts);
  else
    OS << "dead code";
  break;
If DW_RLE_offset_pair is the first entry then there is no base address and it cannot calculate the result even if you pass --debug-info, like in this example:
.debug_rnglists contents:
0x00000000: range list header: length = 0x0000002c, format = DWARF32, version = 0x0005, addr_size = 0x08, seg_size = 0x00, offset_entry_count = 0x00000000
ranges:
0x0000000c: [DW_RLE_offset_pair]: 0x0000000000000014, 0x000000000000005a => [0x0000000000000014, 0x000000000000005a)
0x0000000f: [DW_RLE_offset_pair]: 0x00000000000000c0, 0x00000000000000f8 => [0x00000000000000c0, 0x00000000000000f8)
0x00000014: [DW_RLE_offset_pair]: 0x0000000000000110, 0x000000000000012e => [0x0000000000000110, 0x000000000000012e)
0x00000019: [DW_RLE_end_of_list]
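To make the two problems concrete, here is a small self-contained model of how a dumper could track the base address explicitly instead of falling back to an index or to 0. It is a sketch under stated assumptions, not the llvm-dwarfdump implementation: `Entry`, `Resolved`, and `resolve` are hypothetical, only two entry kinds are modelled, and the caller supplies the initial base (for instance a CU base address, if one is known).

```cpp
#include <cstdint>
#include <optional>
#include <vector>

enum class Kind { BaseAddressx, OffsetPair };

// Hypothetical raw entry: V0/V1 are the operands as encoded in the list.
struct Entry {
  Kind K;
  uint64_t V0;
  uint64_t V1;
};

// Known == false means Low/High are still raw offsets, not addresses.
struct Resolved {
  bool Known;
  uint64_t Low;
  uint64_t High;
};

std::vector<Resolved> resolve(const std::vector<Entry> &Entries,
                              const std::vector<uint64_t> &Pool,
                              std::optional<uint64_t> InitialBase) {
  std::optional<uint64_t> Base = InitialBase;
  std::vector<Resolved> Out;
  for (const Entry &E : Entries) {
    switch (E.K) {
    case Kind::BaseAddressx:
      // A failed lookup leaves the base unknown; the index itself is
      // never reused as an address.
      if (E.V0 < Pool.size())
        Base = Pool[E.V0];
      else
        Base = std::nullopt;
      break;
    case Kind::OffsetPair:
      if (Base)
        Out.push_back({true, *Base + E.V0, *Base + E.V1});
      else
        Out.push_back({false, E.V0, E.V1});
      break;
    }
  }
  return Out;
}
```

For the list above, which starts with DW_RLE_offset_pair, calling `resolve` with no initial base marks every range as unresolved rather than printing the raw offsets as if they were addresses.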
I created an exe with DWARF5 with Clang and dumped it with llvm-dwarfdump.
This is the incorrect result:
The correct output is produced inline in the .debug_info output:
The correct raw output is produced when passing --verbose.
The wrong output is neither the raw output nor the one from .debug_info. I could contribute a fix that always outputs the raw contents of .debug_rnglists. What do you think?