eliben / pyelftools

Parsing ELF and DWARF in Python

Parsing DWARF in .o files #564

Open sevaa opened 1 month ago

sevaa commented 1 month ago

.o files are superficially ELF (with e_type set to ET_REL), but unlike executables and shared libraries, they are allowed to contain multiple sections with the same name. With that in mind, pyelftools' usual logic of pulling DWARF by finding the section with a given name breaks down.
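To make the duplicate-name situation concrete, here is a minimal sketch that lists every section sharing a name instead of asking for a single match. It assumes only that each section object exposes a `name` attribute, as the objects yielded by pyelftools' `ELFFile.iter_sections()` do. `sections_named` is a hypothetical helper, not part of pyelftools.

```python
from types import SimpleNamespace


def sections_named(sections, name):
    """Return (index, section) pairs for every section called `name`.

    `sections` is any iterable of objects with a `.name` attribute,
    given in section header table order.
    """
    return [(i, sec) for i, sec in enumerate(sections) if sec.name == name]


if __name__ == "__main__":
    # Stand-in section objects for illustration. With pyelftools one
    # would pass ELFFile(stream).iter_sections() instead (assumption:
    # it yields sections in header table order, starting at index 0).
    fake = [SimpleNamespace(name=n) for n in
            ("", ".text", ".debug_info", ".debug_abbrev", ".debug_info")]
    for index, sec in sections_named(fake, ".debug_info"):
        print(index, sec.name)
```

Keeping the index next to each match matters here, because the name alone no longer identifies a section.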

Also, trying to load them with relocations enabled gives errors aplenty, again because there can be multiple relocation sections with a given name, in arbitrary order; the first .rel.debug_info in a file doesn't have to correspond to the first .debug_info. In object files, sh_info of a rel/rela section is expected to contain the index of the section it applies to. In linked binaries, this doesn't hold.
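The sh_info rule suggests pairing relocation sections with their targets by index rather than by name. A rough sketch, assuming each section supports `sec['sh_type']` and `sec['sh_info']` lookups, with sh_type given as a string such as 'SHT_REL' or 'SHT_RELA' (the way pyelftools sections expose header fields). `relocations_by_target` is a made-up name for illustration.

```python
def relocations_by_target(sections):
    """Map target section index -> list of relocation section indices.

    `sections` is a sequence in section header table order; every item
    must support item['sh_type'] and item['sh_info'].
    """
    pairing = {}
    for index, sec in enumerate(sections):
        if sec['sh_type'] in ('SHT_REL', 'SHT_RELA'):
            # In a relocatable file, sh_info holds the index of the
            # section these relocations apply to.
            pairing.setdefault(sec['sh_info'], []).append(index)
    return pairing


if __name__ == "__main__":
    # Plain dicts standing in for section headers; the relocation
    # sections deliberately appear in the "wrong" order.
    fake = [
        {'sh_type': 'SHT_NULL', 'sh_info': 0},
        {'sh_type': 'SHT_PROGBITS', 'sh_info': 0},  # first .debug_info
        {'sh_type': 'SHT_PROGBITS', 'sh_info': 0},  # second .debug_info
        {'sh_type': 'SHT_REL', 'sh_info': 2},       # applies to section 2
        {'sh_type': 'SHT_REL', 'sh_info': 1},       # applies to section 1
    ]
    print(relocations_by_target(fake))
```

This only makes sense for ET_REL files, since, as noted above, sh_info can't be relied on this way in linked binaries.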


I'm looking at pletoh.o within stm32wb_zigbee_wb_lib.a as downloaded from here. That file contains:

Regarding the way they are interlinked:


When it comes to dumping, readelf observes the section boundaries. So, for example, when dumping info, it goes:

Contents of the .debug_info section:
  Compilation Unit @ offset 0:
  ...the rest of the section dump, DIEs, attributes, etc.
Contents of the .debug_info section:
  Compilation Unit @ offset 0:
  ...and then the stuff from the second section

for as many sections as there are. The section header in the dump indicates neither the offset nor the index of the section; its order in the binary is the only thing to go by.

In theory, this can be accommodated in the API by concatenating the .debug_info sections for parsing, while keeping the section index somewhere in the CU data structure.
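A sketch of what that bookkeeping could look like, independent of pyelftools internals. The class and method names are hypothetical. It takes (section index, raw bytes) pairs in file order, joins the bytes, and can translate an offset in the joined buffer back to the originating section and the offset within it. That translation would be needed because each section's own offsets start at 0, as the readelf output above shows.

```python
from bisect import bisect_right


class ConcatenatedSections:
    """Join same-named sections while remembering where each came from."""

    def __init__(self, parts):
        # parts: iterable of (section_index, data_bytes) in file order
        self.starts = []   # start of each part in the joined buffer
        self.indices = []  # section index of each part
        chunks = []
        pos = 0
        for section_index, data in parts:
            self.starts.append(pos)
            self.indices.append(section_index)
            chunks.append(data)
            pos += len(data)
        self.data = b"".join(chunks)

    def locate(self, offset):
        """Map a joined-buffer offset to (section_index, offset_in_section)."""
        if not 0 <= offset < len(self.data):
            raise ValueError("offset outside the concatenated data")
        # Last part whose start is <= offset.
        slot = bisect_right(self.starts, offset) - 1
        return self.indices[slot], offset - self.starts[slot]


if __name__ == "__main__":
    # Two made-up .debug_info payloads living at section indices 5 and 9.
    joined = ConcatenatedSections([(5, b"abcd"), (9, b"xyz")])
    print(joined.locate(4))
```

A CU object could store the pair returned by `locate` for its start offset, which would also identify the matching relocation and abbreviation sections by index.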


All that said, I don't know how common this file structure is in the grand scheme of things. Does GCC or LLVM emit objects like that? This is all the output of IAR's compiler for ARM.