gimli-rs / gimli

A library for reading and writing the DWARF debugging format
https://docs.rs/gimli/
Apache License 2.0

[DWARF-5] Ambiguity when reading DW_FORM_sec_offset #475

Closed ggreif closed 4 years ago

ggreif commented 4 years ago

The DWARF-5 spec mentions (in 7.5.5 Classes and Forms, p. 212, l. 6) that "DW_FORM_sec_offset is a member of more than one class". It can encode the attributes DW_AT_addr_base, DW_AT_stmt_list and DW_AT_ranges, for example.

However, when reading with gimli, it always seems to get resolved as AttributeValue::DebugAddrBase, regardless of the DW_AT_*.

philipc commented 4 years ago

It isn't always AttributeValue::DebugAddrBase. We have tests for some of these cases. From reading the code, DW_AT_stmt_list and DW_AT_ranges should be handled correctly, and since these are such common attributes I would be surprised if they were wrong. Can you be more specific about which DW_AT value it is getting wrong, or provide a test case?
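
To illustrate the point that the attribute, not the form alone, selects the class of a DW_FORM_sec_offset value, here is a minimal standalone sketch. It is not gimli's code: `AttrName`, `Resolved` and `resolve_sec_offset` are hypothetical names made up for this example, and only the variant names echo those of gimli's AttributeValue.

```rust
/// Simplified stand-in for the attribute names involved (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttrName {
    StmtList,
    Ranges,
    AddrBase,
    StrOffsetsBase,
    Other,
}

/// Simplified stand-in for the resolved value (hypothetical type; the
/// variant names merely echo gimli's `AttributeValue`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Resolved {
    DebugLineRef(u64),
    RangeListsRef(u64),
    DebugAddrBase(u64),
    DebugStrOffsetsBase(u64),
    SecOffset(u64),
}

/// Choose the class of a `DW_FORM_sec_offset` value from the attribute
/// that carries it. Unknown attributes keep the raw section offset.
fn resolve_sec_offset(name: AttrName, offset: u64) -> Resolved {
    match name {
        AttrName::StmtList => Resolved::DebugLineRef(offset),
        AttrName::Ranges => Resolved::RangeListsRef(offset),
        AttrName::AddrBase => Resolved::DebugAddrBase(offset),
        AttrName::StrOffsetsBase => Resolved::DebugStrOffsetsBase(offset),
        AttrName::Other => Resolved::SecOffset(offset),
    }
}

fn main() {
    // The same form yields a different class for each attribute.
    let samples = [
        (AttrName::StmtList, 0),
        (AttrName::Ranges, 0x40),
        (AttrName::AddrBase, 8),
        (AttrName::StrOffsetsBase, 8),
        (AttrName::Other, 0x10),
    ];
    for (name, offset) in samples {
        println!("{:?} -> {:?}", name, resolve_sec_offset(name, offset));
    }
}
```

In this model, DebugAddrBase can only come out of the DW_AT_addr_base arm, which is the behaviour described above for DW_AT_stmt_list and DW_AT_ranges.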

ggreif commented 4 years ago

@philipc I am seeing AttributeValue::DebugAddrBase arriving in wasmtime. It originates from DW_AT_stmt_list with DW_FORM_sec_offset. So it might be an issue with wasmtime or me (generating bad DWARF?). Thanks for your comment, I'll look into it more closely and come back.

This is a piece of code I came across: https://github.com/gimli-rs/gimli/blob/master/src/read/unit.rs#L1231

philipc commented 4 years ago

> This is a piece of code I came across: https://github.com/gimli-rs/gimli/blob/master/src/read/unit.rs#L1231

That code is in a macro. You need to look at where the macro is used, and it is only used for DW_AT_addr_base.

philipc commented 4 years ago

I had a look at the wasmtime PR, and I think your check for zero address base is wrong.

From 7.27 of the standard:

> The DW_AT_addr_base attribute points to the first entry following the header. The entries are indexed sequentially from this base entry, starting from 0.

Since this is pointing past the header, it won't be zero. Note that originally some compilers emitted .debug_addr without the header, so it may be zero in that case.
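
The indexing rule quoted from the standard can be sketched as a small offset computation. This is an illustration only: `addr_entry_offset` is a hypothetical helper, not a gimli function, and the base of 8 assumes the 8-byte header of the 32-bit DWARF format.

```rust
/// Byte offset within `.debug_addr` of the entry at `index`, given the
/// unit's `DW_AT_addr_base` and address size. Returns `None` on overflow.
/// (Hypothetical helper for illustration.)
fn addr_entry_offset(addr_base: u64, index: u64, address_size: u8) -> Option<u64> {
    index
        .checked_mul(u64::from(address_size))
        .and_then(|relative| addr_base.checked_add(relative))
}

fn main() {
    // Standard layout (assumed 8-byte header): the base points just past it.
    println!("{:?}", addr_entry_offset(8, 0, 8));
    println!("{:?}", addr_entry_offset(8, 3, 8));
    // Headerless section from older compilers: a base of 0 is legitimate.
    println!("{:?}", addr_entry_offset(0, 3, 8));
}
```

Because the computation never subtracts a header size, a zero base works unchanged for the headerless case, so rejecting zero outright would break those inputs.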

ggreif commented 4 years ago

> I had a look at the wasmtime PR, and I think your check for zero address base is wrong.

Yep, the PR is still pretty much a draft, and I don't understand yet how the relative offset is communicated to the CU in order to correctly resolve DW_FORM_addrx and friends. I'd expect to set a member somewhere, but right now I have no way to dive into that. I have dumped the DWARF from a clang-10-generated object file and have seen 0 and 8 occurring with sec_offset, and I went with the first. Let me find it:

[1] DW_TAG_compile_unit DW_CHILDREN_yes
    DW_AT_producer  DW_FORM_strx1
    DW_AT_language  DW_FORM_data2
    DW_AT_name  DW_FORM_strx1
    DW_AT_str_offsets_base  DW_FORM_sec_offset  <<<<<<
    DW_AT_stmt_list DW_FORM_sec_offset  <<<<<<
    DW_AT_comp_dir  DW_FORM_strx1
    DW_AT_APPLE_optimized   DW_FORM_flag_present
    DW_AT_low_pc    DW_FORM_addrx
    DW_AT_high_pc   DW_FORM_data4
    DW_AT_addr_base DW_FORM_sec_offset  <<<<<<

The unit is:

0x0000000c: DW_TAG_compile_unit
              DW_AT_producer    ("clang version 10.0.0 ")
              DW_AT_language    (DW_LANG_C99)
              DW_AT_name    ("/nix/store/4nanbdm73z391lqpfhhi3d18wyzqx4sg-source/bn_mp_exch.c")
              DW_AT_str_offsets_base    (0x00000008) <<<<<
              DW_AT_stmt_list   (0x00000000)  <<<<<<######!!!!!!!
              DW_AT_comp_dir    ("/Users/ggreif/motoko/rts")
              DW_AT_APPLE_optimized (true)
              DW_AT_low_pc  (0x0000000000000000)
              DW_AT_high_pc (0x000000000000004a)
              DW_AT_addr_base   (0x00000008)  <<<<<<

Indeed for the .debug_addr I see an offset 8:

.debug_addr contents:
0x00000000: Addr Section: length = 0x0000000c, version = 0x0005, addr_size = 0x08, seg_size = 0x00
Addrs: [
0x0000000000000000
]

philipc commented 4 years ago

That dump all looks good to me. .debug_str_offsets and .debug_addr use an 8-byte header (in the 32-bit DWARF format), so 0x00000008 is after that header. DW_AT_stmt_list is different because it points to the header in .debug_line, not after it, so it will be 0 for the first unit.
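
As a rough check of the 8-byte figure, the sketch below decodes a .debug_addr header and reports where entry 0 starts. The byte values are reconstructed from the dump above under the assumption of little-endian encoding; `AddrHeader`, `parse_addr_header` and the `read_*` helpers are hypothetical names for this example, not gimli APIs.

```rust
/// Parsed `.debug_addr` contribution header (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
struct AddrHeader {
    unit_length: u64,
    version: u16,
    address_size: u8,
    segment_selector_size: u8,
    /// Size of the header itself, which is also the offset of entry 0.
    header_size: usize,
}

// Little-endian readers that return `None` when the slice is too short.
fn read_u16(data: &[u8], pos: usize) -> Option<u16> {
    let b = data.get(pos..pos + 2)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(data: &[u8], pos: usize) -> Option<u32> {
    let b = data.get(pos..pos + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn read_u64(data: &[u8], pos: usize) -> Option<u64> {
    let b = data.get(pos..pos + 8)?;
    Some(u64::from_le_bytes([b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7]]))
}

/// Decode the header at the start of `data` (little-endian assumed).
fn parse_addr_header(data: &[u8]) -> Option<AddrHeader> {
    let first = read_u32(data, 0)?;
    let (unit_length, mut pos) = match first {
        // 64-bit DWARF: escape value followed by an 8-byte length.
        0xffff_ffff => (read_u64(data, 4)?, 12),
        // Other values in this range are reserved.
        0xffff_fff0..=0xffff_fffe => return None,
        // 32-bit DWARF: the value is the length itself.
        len => (u64::from(len), 4),
    };
    let version = read_u16(data, pos)?;
    pos += 2;
    let address_size = *data.get(pos)?;
    let segment_selector_size = *data.get(pos + 1)?;
    pos += 2;
    Some(AddrHeader {
        unit_length,
        version,
        address_size,
        segment_selector_size,
        header_size: pos,
    })
}

fn main() {
    // length = 0x0c, version = 5, addr_size = 8, seg_size = 0, one zero address.
    let section: [u8; 16] = [0x0c, 0, 0, 0, 0x05, 0, 0x08, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    let header = parse_addr_header(&section).expect("truncated header");
    let first_entry = read_u64(&section, header.header_size);
    println!("{:?}", header);
    println!("entry 0 at offset {}: {:?}", header.header_size, first_entry);
}
```

For the dumped section the header size comes out as 8, matching DW_AT_addr_base (0x00000008). With the 64-bit escape value the same code yields a 16-byte header, so the base should be read from the attribute rather than assumed.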

philipc commented 4 years ago

Closing since I'm pretty sure there isn't a bug in gimli, but feel free to ask more questions or reopen if you have steps to reproduce a problem.

ggreif commented 4 years ago

I checked. Indeed, gimli creates an AttributeValue::DebugLineRef when encountering DW_AT_stmt_list, which is processed by wasmtime at https://github.com/bytecodealliance/wasmtime/blob/master/crates/debug/src/transform/attr.rs#L91

So all is fine.