eliben / pyelftools

Parsing ELF and DWARF in Python

I use pyelftools to get DW_AT_type of DW_TAG_variable, but sometimes the value is incorrect. #381

Open silverfoxhu opened 2 years ago

silverfoxhu commented 2 years ago

Attachment: test.zip. I use pyelftools version 0.27 (Oct 27, 2020) to process the DWARF info of my ELF file. With pyelftools I get this:

DIE DW_TAG_variable, size=25, has_children=False
    |DW_AT_name        :  AttributeValue(name='DW_AT_name', form='DW_FORM_string', value=b'__progname', raw_value=b'__progname', offset=699)
    |DW_AT_decl_file   :  AttributeValue(name='DW_AT_decl_file', form='DW_FORM_data1', value=1, raw_value=1, offset=710)
    |DW_AT_decl_line   :  AttributeValue(name='DW_AT_decl_line', form='DW_FORM_data1', value=31, raw_value=31, offset=711)
    |DW_AT_type        :  AttributeValue(name='DW_AT_type', form='DW_FORM_ref4', value=341, raw_value=341, offset=712)
    |DW_AT_external    :  AttributeValue(name='DW_AT_external', form='DW_FORM_flag', value=True, raw_value=1, offset=716)
    |DW_AT_location    :  AttributeValue(name='DW_AT_location', form='DW_FORM_block1', value=[3, 48, 0, 0, 208], raw_value=[3, 48, 0, 0, 208], offset=717)

But when I run the command tricore-objdump --dwarf=info test, I get this:

<1><2ba>: Abbrev Number: 6 (DW_TAG_variable)
    <2bb>   DW_AT_name        : __progname
    <2c6>   DW_AT_decl_file   : 1
    <2c7>   DW_AT_decl_line   : 31
    <2c8>   DW_AT_type        : <0x289>
    <2cc>   DW_AT_external    : 1
    <2cd>   DW_AT_location    : 5 byte block: 3 30 0 0 d0       (DW_OP_addr: d0000030)

All of the attributes match except DW_AT_type: pyelftools reports 341 (0x155), while objdump reports 0x289.

Then I use get_DIE_from_refaddr to get the DIE of the data type. get_DIE_from_refaddr(0x289) works, but get_DIE_from_refaddr(0x155) raises an exception (KeyError: 50).

romkell commented 2 years ago

I encounter the same issue with software compiled with armclang V6. I also have a second program built with armcc V5 and a third built with gcc V10, which I will test as well.

I iterated over the CUs and DIEs and dumped all DIEs into a dict that maps a DW_AT_type.value to its DIE. For at least the one mismatch I have encountered so far, the value still fell between the offset of the DIE it very likely should have referenced and the offset of the next DIE. As a workaround, the correct value could be guessed by taking the next type-defining tag. But it really is only a workaround, and it might go badly wrong given the difference in the tricore example above (0x289 vs 0x155).

I am not familiar with the code in detail. Where would I most likely have to look to find the cause of this issue and fix it? Can you point to a location or code area? Any hints?
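The dump described above can be sketched roughly as follows. This is only an illustration: the helper names `index_dies_by_offset` and `dangling_type_refs` are made up for this example, and the `SimpleNamespace` objects are stand-ins so the snippet runs without an ELF file. With pyelftools, the real `dwarfinfo` would come from `ELFFile(f).get_dwarf_info()`.

```python
from types import SimpleNamespace


def index_dies_by_offset(dwarfinfo):
    """Map every DIE's offset in .debug_info to the DIE object."""
    index = {}
    for cu in dwarfinfo.iter_CUs():
        for die in cu.iter_DIEs():
            index[die.offset] = die
    return index


def dangling_type_refs(index):
    """List DIEs whose raw DW_AT_type value is not the offset of any DIE."""
    return [
        die for die in index.values()
        if 'DW_AT_type' in die.attributes
        and die.attributes['DW_AT_type'].value not in index
    ]


# Stand-in objects (hypothetical) that mimic the attributes used above.
type_die = SimpleNamespace(offset=0x289, attributes={})
var_die = SimpleNamespace(
    offset=0x2ba,
    attributes={'DW_AT_type': SimpleNamespace(value=0x155)},
)
fake_cu = SimpleNamespace(iter_DIEs=lambda: iter([type_die, var_die]))
fake_dwarfinfo = SimpleNamespace(iter_CUs=lambda: iter([fake_cu]))

index = index_dies_by_offset(fake_dwarfinfo)
suspects = dangling_type_refs(index)
print([hex(d.offset) for d in suspects])  # ['0x2ba']
```

A raw value can also coincide with a DIE offset in another CU by accident, so an empty result from a check like this does not prove that every reference is being read correctly.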

romkell commented 2 years ago

The solution can be found in

descriptions.py

  def _describe_attr_ref(attr, die, section_offset):
      return '<0x%x>' % (attr.value + die.cu.cu_offset)

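where die.cu.cu_offset is added to attr.value. A DW_FORM_ref4 value is an offset relative to the start of the enclosing compilation unit, not to the start of the .debug_info section, so the CU offset has to be added before the value can be used as a section-relative reference.

With that offset applied, the following code no longer throws any exceptions for me.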
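As a small worked example of that addition, here is a sketch with a hypothetical helper, `absolute_die_offset`, which is not part of pyelftools. The list of CU-relative forms follows the DWARF reference classes. The CU offset 0x134 is not printed anywhere in this thread; it is inferred from the two dumps above as 0x289 - 0x155.

```python
# Reference forms whose value counts from the start of the enclosing CU.
CU_RELATIVE_FORMS = {
    'DW_FORM_ref1', 'DW_FORM_ref2', 'DW_FORM_ref4',
    'DW_FORM_ref8', 'DW_FORM_ref_udata',
}


def absolute_die_offset(form, value, cu_offset):
    """Turn a reference attribute value into a .debug_info offset."""
    if form in CU_RELATIVE_FORMS:
        return cu_offset + value
    if form == 'DW_FORM_ref_addr':
        # Already relative to the start of .debug_info.
        return value
    raise ValueError('not a supported reference form: %s' % form)


# Assumption: CU offset inferred from the dumps above (0x289 - 0x155).
cu_offset = 0x289 - 0x155
print(hex(cu_offset))                                            # 0x134
print(hex(absolute_die_offset('DW_FORM_ref4', 0x155, cu_offset)))  # 0x289
```

Depending on the installed version, pyelftools may also offer `DIE.get_DIE_from_attribute('DW_AT_type')`, which performs this adjustment internally; it is worth checking whether your release has it before computing the offset by hand.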

    # Try to get the size from the referenced type. The DW_AT_type value is
    # CU-relative here (DW_FORM_ref4), so add the CU offset before the lookup.
    if 'DW_AT_type' in die.attributes:
        refaddr = die.attributes['DW_AT_type'].value + die.cu.cu_offset
        try:
            type_die = dwarfinfo.get_DIE_from_refaddr(refaddr, die.cu)
        except Exception as err:
            # refaddr is an int, so format it instead of concatenating strings
            logging.error("Looking up the type DIE of '%s' at ref 0x%x failed: %s",
                          sym['name'], refaddr, err)
        else:
            # Only read the size when the lookup succeeded
            if 'DW_AT_byte_size' in type_die.attributes:
                sym['size'].append(type_die.attributes['DW_AT_byte_size'].value)

Hence, it would be really nice if this were documented more obviously or shown in an example. I hope this comment helps others.

eliben commented 2 years ago

"It would really be nice if that was documented somewhat more obvious or with an example."

Feel free to send a PR adding an example

taylorh140 commented 8 months ago

That was a key piece of information for me. Maybe it would be a good thing to add to the KeyError message.
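One way such a hint could look, sketched as a user-side wrapper rather than a change to pyelftools itself. `get_die_with_hint` is a hypothetical name, the stand-in lookup exists only so the snippet runs on its own, and catching KeyError is based on the exception reported at the top of this issue.

```python
from types import SimpleNamespace


def get_die_with_hint(dwarfinfo, refaddr, cu=None):
    """Look up a DIE and explain the likely cause if the offset is unknown."""
    try:
        return dwarfinfo.get_DIE_from_refaddr(refaddr, cu)
    except KeyError as err:
        raise KeyError(
            'no DIE at offset 0x%x; if this value came from a CU-relative '
            'form such as DW_FORM_ref4, add die.cu.cu_offset first' % refaddr
        ) from err


# Stand-in for DWARFInfo (hypothetical): it only knows the DIE at 0x289.
def _fake_lookup(refaddr, cu=None):
    known = {0x289: 'type DIE'}
    return known[refaddr]


fake_dwarfinfo = SimpleNamespace(get_DIE_from_refaddr=_fake_lookup)

print(get_die_with_hint(fake_dwarfinfo, 0x289))
try:
    get_die_with_hint(fake_dwarfinfo, 0x155)
except KeyError as err:
    print(err)
```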

sevaa commented 4 months ago

@silverfoxhu is this still an issue?

taylorh140 commented 4 months ago

I think it could be closed as long as the information is given enough emphasis in the documentation, or is made part of the error that is generated. It was a decent time sink for me.