Open altendky opened 10 years ago
I don't see your input file. Can you provide it and say explicitly where the mismatch is? Or can this be reproduced on one of my samples? If yes, can you specify exactly the steps?
Thanks for the reply and my apologies for the confusion. While I originally observed this on my own .ELF, the results I posted were from sample_exe64.elf.
cd pyelftools/test/testfiles_for_unittests/
curl <URL no longer valid > altendky.py
python3 altendky.py sample_exe64.elf | grep -B 1 -A 6 'Name: size_t'
objdump -W sample_exe64.elf | grep -B 1 -A 3 ': size_t'
(sorry, but that pasted code is lost and I no longer have the original...)
My results are:
$ python3 altendky.py sample_exe64.elf | grep -B 1 -A 6 'Name: size_t'
DIE tag=DW_TAG_typedef
Name: size_t
Offset: 470
File: 2
Line: 214
Type: 63
Attributes: OrderedDict([('DW_AT_name', AttributeValue(name='DW_AT_name', form='DW_FORM_strp', value=b'size_t', raw_value=205, offset=471)), ('DW_AT_decl_file', AttributeValue(name='DW_AT_decl_file', form='DW_FORM_data1', value=2, raw_value=2, offset=475)), ('DW_AT_decl_line', AttributeValue(name='DW_AT_decl_line', form='DW_FORM_data1', value=214, raw_value=214, offset=476)), ('DW_AT_type', AttributeValue(name='DW_AT_type', form='DW_FORM_ref4', value=63, raw_value=63, offset=477))])
DIE: ['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_children', '_parent', '_parse_DIE', '_translate_attr_value', 'abbrev_code', 'add_child', 'attributes', 'cu', 'dwarfinfo', 'get_full_path', 'get_parent', 'has_children', 'is_null', 'iter_children', 'iter_siblings', 'offset', 'set_parent', 'size', 'stream', 'tag']
$ objdump -W sample_exe64.elf | grep -B 1 -A 3 ': size_t'
<1><1d6>: Abbrev Number: 3 (DW_TAG_typedef)
<1d7> DW_AT_name : (indirect string, offset: 0xcd): size_t
<1db> DW_AT_decl_file : 2
<1dc> DW_AT_decl_line : 214
<1dd> DW_AT_type : <0x1e1>
I expect my script's 'Type:' value (and the 'DW_AT_type' value entry in the attributes) to match that of objdump's 'DW_AT_type' value (of course, accounting for the hex vs. decimal formatting difference).
Thanks again for your interest and hopefully this makes it reasonably straightforward for you to observe the difference I do.
Cheers, -kyle
@altendky thanks for the details. It may take some time for me to get to look at this issue, but when I do the extra details certainly help.
Unfortunately I don't have much time to fix these issues right now; if you could create a pull request with a fix, that would certainly make things easier.
I certainly understand that you would have other things to do but appreciate your interest. Honestly, I haven't even gotten back to the task where I was applying this at work. That said, I may be able to get into debugging today. We'll see if I can make it anywhere.
Wrong button :[ sorry.
0x1e1 (which objdump reports) is the stream offset (I think), while 0x3f (reported by pyelftools) is the offset from the beginning of the compilation unit, which starts at 0x1a2. The form is being reported within pyelftools as DW_FORM_ref4. See the DWARF 2.0 standard, page 69. Note that my commit covers DW_FORM_ref[1-4] but not DW_FORM_ref_udata or DW_FORM_ref_addr.
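The relationship between the two numbers can be checked with a few lines of arithmetic. This is only a sketch, not pyelftools code: the helper name absolute_ref and the form table are made up for illustration, and the treatment of DW_FORM_ref_udata (CU-relative) and DW_FORM_ref_addr (already relative to the start of .debug_info) follows the DWARF specification rather than anything stated in this thread.

```python
# Reference forms whose value is an offset from the start of the enclosing
# compilation unit, according to the DWARF specification.
CU_RELATIVE_FORMS = {
    "DW_FORM_ref1",
    "DW_FORM_ref2",
    "DW_FORM_ref4",
    "DW_FORM_ref8",
    "DW_FORM_ref_udata",
}


def absolute_ref(cu_offset, form, value):
    """Return the .debug_info offset that a reference attribute points at.

    Hypothetical helper for illustration; not part of pyelftools.
    """
    if form in CU_RELATIVE_FORMS:
        # CU-relative: add the offset of the compilation unit itself.
        return cu_offset + value
    if form == "DW_FORM_ref_addr":
        # Already an offset from the beginning of .debug_info.
        return value
    raise ValueError("not a reference form handled here: %s" % form)


# Numbers from this thread: the CU starts at 0x1a2 and pyelftools reports 63.
print(hex(absolute_ref(0x1A2, "DW_FORM_ref4", 63)))  # 0x1e1
```

With the values above, 0x1a2 + 0x3f gives 0x1e1, which is what objdump prints for DW_AT_type.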
I will test this out in my application before submitting a pull request.
@altendky Does my issue #113 shed any light on your situation? I'm successfully getting the underlying type of a DW_TAG_typedef using the code shown there.
I have a new job since then, so I don't have the original file, and the site where I pasted the script has lost it, so I can't quickly test this myself. But, at some point I will have a similar task in my new position. I'm not sure if we can get ELF files for our embedded code or not, but if we can I may come back to this.
Just looking over the code snippets and trying to refresh my memory, my commit 2a195dccd7c9e0458d82843780e8e71157763658 still seems relevant: without it, an incorrect value would apparently be returned for DW_FORM_ref[1-4]. Well, unless my patch was straight-up incorrect to begin with, though per the commit title it seemed to work.
Your code certainly may be good as well :] but it takes a bigger picture understanding than I presently have in my head to judge.
Regardless, thanks for the followup.
@JonathonReinhart Also, note that I was specifically having trouble with unnamed structures.
typedef struct {
    int myMember;
} MyStruct;
As opposed to named structures with a typedef.
struct MyStruct {
    int myMember;
};
typedef struct MyStruct MyStruct;
I didn't see any reference in your issue so I'm not sure which case you are working with.
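For the unnamed-structure case, one way to connect a typedef to the structure it references is to index DIEs by offset and follow DW_AT_type. The following is a rough sketch under stated assumptions, not a tested recipe: index_dies and resolve_typedef are hypothetical helpers, the DW_AT_type value is assumed to be CU-relative (as observed for DW_FORM_ref4 in this thread), and the stand-in objects and offsets below only mimic the tag, offset, attributes and cu.cu_offset fields of pyelftools DIEs so the example runs without an ELF file.

```python
from types import SimpleNamespace


def index_dies(dies):
    """Map each DIE's offset to the DIE itself."""
    return {die.offset: die for die in dies}


def resolve_typedef(die, by_offset):
    """Follow DW_AT_type from a typedef until a non-typedef DIE is reached.

    Assumes the attribute value is relative to the start of the DIE's CU.
    Returns None when a typedef carries no DW_AT_type attribute.
    """
    while die.tag == "DW_TAG_typedef":
        attr = die.attributes.get("DW_AT_type")
        if attr is None:
            return None
        die = by_offset[die.cu.cu_offset + attr.value]
    return die


# Stand-in objects with made-up offsets, shaped like the
# 'typedef struct { int myMember; } MyStruct;' case above.
cu = SimpleNamespace(cu_offset=0x1A2)
anon_struct = SimpleNamespace(
    tag="DW_TAG_structure_type", offset=0x1E1, attributes={}, cu=cu
)
typedef = SimpleNamespace(
    tag="DW_TAG_typedef",
    offset=0x1D6,
    attributes={
        "DW_AT_name": SimpleNamespace(form="DW_FORM_strp", value=b"MyStruct"),
        "DW_AT_type": SimpleNamespace(form="DW_FORM_ref4", value=0x3F),
    },
    cu=cu,
)

by_offset = index_dies([typedef, anon_struct])
target = resolve_typedef(typedef, by_offset)
# The anonymous struct has no DW_AT_name; the name lives only on the typedef.
print(target.tag, "DW_AT_name" in target.attributes)
```

Against a real file, the index would presumably be built from the DIEs yielded by iter_DIEs() on each compilation unit returned by dwarfinfo.iter_CUs().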
Well, here I am back on this task in my new job. First I will note that this is reproducible with the still active 'pastebin' link (http://tny.cz/c7174417). Also 'backed up' at https://gist.github.com/d92fb39a86bd278442f5933f04b540dd.
But!
It would seem that I was misusing the library. The value does need the offset applied, as in 2a195dccd7c9e0458d82843780e8e71157763658, to be useful, but this seems to be expected to be handled elsewhere in pyelftools. For textual output, describe_attr_value() is provided and does do the translation.
DIE tag=DW_TAG_typedef
Name: size_t
Offset: 470
Line: 214
Type: 63
describe_attr_value(Type): <0x1e1>
I hope my misunderstanding didn't waste too much of anyone's time over the past couple years...
I'm going to leave that judgement to someone else, because it looks like .value vs. .raw_value may be relevant and it probably should be interpreted immediately.
# value:
# The value parsed from the section and translated accordingly to the form
# (e.g. for a DW_FORM_strp it's the actual string taken from the string table)
#
# raw_value:
# Raw value as parsed from the section - used for debugging and presentation
# (e.g. for a DW_FORM_strp it's the raw string offset into the table)
@eliben, if you feel that this should be changed I can take a look at making the various other adjustments to 'fix the tests' (use .raw_value instead of .value or don't further translate it).
I am trying to use pyelftools to parse both TriCore and C166 ELF files (just TriCore for now) to get a list of variables and addresses including being aware of structure members. The application will be for a remote watch window for the embedded target. I have fiddled with the dwarf_die_tree.py example but ran into an issue where I am unable to connect between typedefs and the unnamed structure definitions they reference. Or so it seems to my DWARF-ignorant brain (in case my previous comments did not in some way make that obvious).
To avoid any issues associated with my particular architecture ELF I also tried my script against test/testfiles_for_unittests/sample_exe64.elf and observed the same thing. When I run objdump -W as a reference I get, amongst other things:
My pyelftools script results in (again, just a snippet):
What seems to me to be an issue is that objdump shows DW_AT_type as 0x1e1 (481) as opposed to pyelftools which returns 63 (0x3f). The other values I have compared seem to correspond. Is this a simple lack of understanding of DWARF or a misuse of pyelftools, or is there an issue here? I started to dig in the code a bit but with my limited knowledge I didn’t find anything that looked glaringly wrong.
Here’s my system info (Python3 within Cygwin64 within Win7 64):
Thank you for any time you choose to spend helping me. -kyle