getsentry / pdb

A parser for Microsoft PDB (Program Database) debugging information
https://docs.rs/pdb/
Apache License 2.0

Implement FPO and FrameData handling #43

Closed jan-auer closed 5 years ago

jan-auer commented 5 years ago

This PR implements handling for the legacy FPO debug stream and the newer FrameData version. NOTE: Based on #42 and will be rebased once that is merged.

Generally, this information is only included in PDBs for 32-bit executables. On x64, unwind information is only placed in the PE file. I have yet to perform a 32-bit build of foo.exe to check in a test case.

There have been two revisions of this data: the legacy FPO data and the newer FrameData.

After doing some tests, it seems that IDiaEnumFrameData, which exposes this information in the DIA SDK, first traverses the new FrameData stream and then the legacy FPO data. I have encountered at least one PDB where both of these streams were present. In contrast to the DIA implementation, I chose to deduplicate these entries and provide one iterator that guarantees RVA order. This comes in particularly handy when doing lookups or when collecting this information into an ordered table.
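A minimal sketch of the deduplicating, RVA-ordered merge described above. The `Frame` type and `merge_frames` function are hypothetical and are not the crate's actual API. The sketch assumes both inputs are already sorted by start RVA, that each input has unique start RVAs, and that the newer FrameData record wins when both streams describe the same start RVA; the PR's real deduplication rule may differ.

```rust
use std::cmp::Ordering;

/// Hypothetical, simplified frame record (not the crate's actual type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Frame {
    /// Start of the covered code as a relative virtual address.
    pub rva: u32,
    /// Size of the covered code in bytes.
    pub size: u32,
    /// True if the record came from the newer FrameData stream.
    pub is_new: bool,
}

/// Merges two slices, each sorted by `rva`, into one RVA-ordered list.
/// If both slices contain a record with the same start RVA, only the
/// record from `new` is kept (an assumption made for this sketch).
pub fn merge_frames(new: &[Frame], old: &[Frame]) -> Vec<Frame> {
    let mut out = Vec::with_capacity(new.len() + old.len());
    let (mut i, mut j) = (0, 0);

    while i < new.len() && j < old.len() {
        match new[i].rva.cmp(&old[j].rva) {
            Ordering::Less => {
                out.push(new[i]);
                i += 1;
            }
            Ordering::Greater => {
                out.push(old[j]);
                j += 1;
            }
            Ordering::Equal => {
                // Duplicate start address: keep the new record, drop the old one.
                out.push(new[i]);
                i += 1;
                j += 1;
            }
        }
    }

    // At most one of these tails is non-empty.
    out.extend_from_slice(&new[i..]);
    out.extend_from_slice(&old[j..]);
    out
}
```

Because the result is sorted by `rva`, a lookup for a given address can use a binary search instead of scanning both streams.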

In addition to the optional FRAMEDATA stream, this information could also occur as a C13 module substream. However, I've not seen this data in practice yet. Until there's a clear story for how to expose these substreams (currently they are hidden in the private modi::c13 module), there's probably no need to separate that out.

jan-auer commented 5 years ago

In addition to a test, I also need to verify what happens with rearranged PDBs that went through Project Vulcan. I suspect that the RVAs we're seeing in the frame data are actually PdbInternalRva (thus requiring an AddressMap for translation).

I found that 32-bit kernel PDBs (e.g. this one) do at least contain FrameData and FPO data streams, but I haven't found one that also has an OMAP.

jan-auer commented 5 years ago

The RVAs are indeed from the original address space (i.e. PDB internal). The PR is now updated to reflect that.
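To illustrate why a PDB-internal RVA needs translation, here is a simplified sketch of an OMAP-style lookup. `OmapRecord` and `translate` are hypothetical names, not the crate's `AddressMap` API. The sketch assumes records are sorted by source address, that each record covers all addresses up to the next record, and that a target of 0 marks an unmapped region.

```rust
/// Hypothetical OMAP-style record: addresses starting at `source` in one
/// address space map to addresses starting at `target` in the other.
#[derive(Clone, Copy, Debug)]
pub struct OmapRecord {
    pub source: u32,
    pub target: u32,
}

/// Translates `internal_rva` using a table sorted by `source`.
/// Returns `None` if no record covers the address, if the covering
/// record is unmapped (target 0), or if the result would overflow.
pub fn translate(records: &[OmapRecord], internal_rva: u32) -> Option<u32> {
    // Index of the first record that starts after the address.
    let idx = records.partition_point(|r| r.source <= internal_rva);
    // The record just before it is the one covering the address.
    let record = records[..idx].last()?;
    if record.target == 0 {
        return None;
    }
    record.target.checked_add(internal_rva - record.source)
}
```

With no OMAP present, the two address spaces coincide and no such lookup is needed; this sketch leaves that case to the caller.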

Since it was not possible to efficiently traverse the OMAP, I've also added AddressMap::rva_ranges and AddressMap::internal_rva_ranges. For a while I considered using Range or even R: RangeBounds in those signatures, but in the end I came back to the simpler tuple approach.
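As a rough illustration of why a range-based traversal is useful: a contiguous range in one address space can map to several disjoint ranges in the other. The function below is a hypothetical sketch, not the signature of `AddressMap::rva_ranges` or `AddressMap::internal_rva_ranges`. It assumes `(source, target)` records sorted by source, a target of 0 for unmapped regions, and well-formed data that does not overflow `u32`. Ranges are returned as half-open `(start, end)` tuples.

```rust
/// Maps the half-open source range `start..end` to a list of half-open
/// target ranges, in source order. `records` holds `(source, target)`
/// pairs sorted by source; a target of 0 marks an unmapped region.
pub fn target_ranges(records: &[(u32, u32)], start: u32, end: u32) -> Vec<(u32, u32)> {
    let mut out = Vec::new();

    // Index of the record covering `start`, or 0 if `start` precedes all records.
    let first = records
        .partition_point(|&(source, _)| source <= start)
        .saturating_sub(1);

    for (i, &(source, target)) in records.iter().enumerate().skip(first) {
        if source >= end {
            break;
        }
        // Clip the record's coverage to the requested range.
        let seg_start = source.max(start);
        let seg_end = records
            .get(i + 1)
            .map_or(end, |&(next, _)| next.min(end));

        if target != 0 && seg_start < seg_end {
            out.push((target + (seg_start - source), target + (seg_end - source)));
        }
    }

    out
}
```

A tuple converts to a standard range with `start..end` at the call site, so the simpler return type costs callers little.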

jan-auer commented 5 years ago

@willglynn Friendly ping :) It would be great to get a review of this and #42 going. I'm planning to push out a new release of symbolic soon which will need this functionality released as well. Thanks!

willglynn commented 5 years ago

I've been traveling but will take a look tomorrow.

jan-auer commented 5 years ago

Rebased; the last two commits contain updates after the initial review.

I also figured that it would make sense to introduce an Offset(pub u32) newtype. Since this PR is already quite large, that should maybe go in a separate PR, however.
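A sketch of what such a newtype could look like. Only the shape `Offset(pub u32)` comes from the comment above; the derives, the `checked_add` helper, the conversions, and the hexadecimal `Display` output are assumptions made for illustration.

```rust
use std::fmt;

/// Hypothetical offset newtype wrapping a raw `u32`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct Offset(pub u32);

impl Offset {
    /// Advances the offset by `bytes`, returning `None` on overflow.
    pub fn checked_add(self, bytes: u32) -> Option<Offset> {
        self.0.checked_add(bytes).map(Offset)
    }
}

impl From<u32> for Offset {
    fn from(value: u32) -> Self {
        Offset(value)
    }
}

impl From<Offset> for u32 {
    fn from(offset: Offset) -> Self {
        offset.0
    }
}

impl fmt::Display for Offset {
    /// Formats the offset as hexadecimal with a `0x` prefix.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:#x}", self.0)
    }
}
```

The benefit of the wrapper is that the compiler rejects accidental mixing of offsets with RVAs or other plain `u32` values.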

jan-auer commented 5 years ago

I think I'm done with changes for now :) Could you do a release with this?

willglynn commented 5 years ago

Done:

pdb = "0.4.0"

Thanks!