Open learn-more opened 3 years ago
Thanks for your interest in adding PE support! This is something I've wished for for a while.
Generally with Bloaty I have found that custom parsers are necessary. Bloaty cares about not only the data in the file, but the precise location of each bit of data in the file. For example, for the file headers and symbol table entries, we need to not only read them, but report their byte range within the file.
Existing libraries usually do not offer this information, because almost no program besides Bloaty needs it. For this reason, all of the existing parsers in Bloaty take the final approach you mentioned (grab the headers and write a complete custom parser). I expect PE will probably require the same.
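To illustrate the point about byte ranges, here is a minimal sketch (in Python for brevity, not Bloaty's actual C++ code) of a hand-written parser that reports where each fixed PE header lives in the file, rather than only decoding its fields. The function names `pe_header_ranges` and `make_minimal_image` are hypothetical, and the test image is synthetic (a zero-filled optional header), so treat this as an assumption-laden illustration only.

```python
import struct

# COFF file header layout: Machine, NumberOfSections, TimeDateStamp,
# PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
COFF_HEADER = struct.Struct("<HHIIIHH")  # 20 bytes
DOS_HEADER_SIZE = 64                     # size of the fixed DOS header
SECTION_HEADER_SIZE = 40                 # size of one section table entry


def pe_header_ranges(data: bytes):
    """Return (label, start, end) file byte ranges for the fixed PE headers."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (offset 0x3C) holds the file offset of the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff_off = pe_off + 4
    fields = COFF_HEADER.unpack_from(data, coff_off)
    num_sections, opt_size = fields[1], fields[5]
    opt_off = coff_off + COFF_HEADER.size
    table_off = opt_off + opt_size
    return [
        ("DOS header", 0, DOS_HEADER_SIZE),
        ("PE signature", pe_off, pe_off + 4),
        ("COFF file header", coff_off, opt_off),
        ("optional header", opt_off, table_off),
        ("section table", table_off, table_off + SECTION_HEADER_SIZE * num_sections),
    ]


def make_minimal_image() -> bytes:
    """Build a synthetic header-only image (hypothetical test data)."""
    dos = bytearray(DOS_HEADER_SIZE)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, DOS_HEADER_SIZE)  # PE signature follows directly
    coff = COFF_HEADER.pack(0x8664, 1, 0, 0, 0, 240, 0)  # one section, 240-byte optional header
    return bytes(dos) + b"PE\0\0" + coff + bytes(240) + bytes(SECTION_HEADER_SIZE)


if __name__ == "__main__":
    for label, start, end in pe_header_ranges(make_minimal_image()):
        print(f"{label}: [{start:#x}, {end:#x})")
```

Each range is half-open (`start` inclusive, `end` exclusive), which makes it easy to check that adjacent headers tile the file without gaps or overlap.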
@haberman now that the initial PR is merged, how do you want to proceed with PE support?
Now that we have the `lit` testing in place, I'm a lot more comfortable moving forward with expanding PE support.

I'd love to see support for:

- `segments`: this would be the regions of the file that the loader will load. The `segments` name is somewhat ELF-specific, but I think PE has something similar, like in the optional header?
- `symbols`: using the symbol table hopefully we could get some good symbol support here.
- `compileunits`: I assume PE files have this information available for debugging?

What do you think?
`segments` seems to be very doable; the PE header can be split into its parts, down to the individual sections (`.text`, `.rdata`, etc.).

As for `symbols`: these are usually present in a PDB file, which at least yaml2obj does not support, and which would require another (extra) parser.
PE files with DWARF support should be doable, but these are only gcc-built binaries, and those are not common beyond a few hobby projects.

`compileunits`: I have no clue, to be honest, but if this is present somewhere, it would probably also be in the PDB file.
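For the `segments` part, one possible starting point is the section table, where each 40-byte entry gives both a file range and a virtual address range. The sketch below (Python for brevity; `SectionRange`, `parse_section_table`, and `make_section_table` are hypothetical names, and the table is synthetic) shows how those two ranges could be pulled out per section. Whether sections are the right thing to report as `segments` is an open question here, not something settled in this thread.

```python
import struct
from typing import List, NamedTuple

# Section header layout: Name[8], VirtualSize, VirtualAddress, SizeOfRawData,
# PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes


class SectionRange(NamedTuple):
    name: str
    file_start: int  # PointerToRawData
    file_end: int    # PointerToRawData + SizeOfRawData
    rva_start: int   # VirtualAddress (relative to the image base)
    rva_end: int     # VirtualAddress + VirtualSize


def parse_section_table(data: bytes, table_offset: int, count: int) -> List[SectionRange]:
    """Decode `count` section headers starting at `table_offset`."""
    sections = []
    for i in range(count):
        fields = SECTION_HEADER.unpack_from(data, table_offset + i * SECTION_HEADER.size)
        raw_name, vsize, vaddr, raw_size, raw_ptr = fields[:5]
        # Short names are NUL-padded to 8 bytes.
        name = raw_name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(
            SectionRange(name, raw_ptr, raw_ptr + raw_size, vaddr, vaddr + vsize)
        )
    return sections


def make_section_table() -> bytes:
    """Two synthetic section headers (hypothetical test data)."""
    text = SECTION_HEADER.pack(b".text", 0x1234, 0x1000, 0x1400, 0x400, 0, 0, 0, 0, 0x60000020)
    rdata = SECTION_HEADER.pack(b".rdata", 0x0800, 0x3000, 0x0A00, 0x1800, 0, 0, 0, 0, 0x40000040)
    return text + rdata


if __name__ == "__main__":
    for s in parse_section_table(make_section_table(), 0, 2):
        print(f"{s.name}: file [{s.file_start:#x}, {s.file_end:#x}) "
              f"rva [{s.rva_start:#x}, {s.rva_end:#x})")
```

Note that the file size and the virtual size of a section can differ, which is why the two ranges are tracked separately.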
To add support for PE files there are a few different approaches that can be used:
What would be the preferred way of moving forward?