getsentry / pdb

A parser for Microsoft PDB (Program Database) debugging information
https://docs.rs/pdb/
Apache License 2.0

Implement Cross Module Imports #54

Closed jan-auer closed 4 years ago

jan-auer commented 4 years ago

This PR implements cross module imports (and exports).

Type indexes or id indexes can have their highest bit set (e.g. 0x8000000A). This indicates that they refer to an import in the DEBUG_S_CROSSSCOPEIMPORTS subsection in the C13 lines segment. One has to resolve that import, and then look up the actual type or id index in the exports section of the referenced module. The full lookup procedure works as follows:

  1. The cross scope imports subsection is a table whose rows correspond to modules, each with a list of imported local ids. An import of 0x8000000A means: look up module row 0x000, then column 0x0000A of that row, i.e. the 11th import of the first module. This might yield something like: Module: 0x2FED, Local: 0x1132
  2. The module identifier is actually a reference into the string table where the name of the module is stored. One has to load this name, and then compare it to all module headers in the DBI stream to find the matching one. Important: This comparison must be case-insensitive, since the case in the DBI stream and the name table often differs.
  3. The local index is the type index assigned by the compiler before linking the PDB together. The DEBUG_S_CROSSSCOPEEXPORTS subsection contains a mapping (tuples) of those local ids to the global ones used in the TPI and the IPI. Resolving it requires loading the respective module stream and locating the cross module exports subsection.
  4. When the compiler assigns the local ids, it uses a special encoding to differentiate TPI and IPI indices. Local ids with the highest bit set (e.g. 0x80001132) point into the IPI. However, this is not done for global ids, which have overlapping ranges. Therefore, during lookup in the exports subsection, one needs to check the high bit of the local id to infer what the global one points to.
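
Steps 1 and 2 can be sketched as follows. This is a minimal illustration and not the API added in this PR: `ImportSlot`, `ImportRow`, `decode_import` and `resolve_import` are hypothetical names, the string table is mocked as a `HashMap`, and the DBI modules are mocked as a list of names. The bit layout (bit 31 as import flag, the next 11 bits as module row, the low 20 bits as column) is an assumption derived from the 0x000 / 0x0000A split above, and the name comparison assumes ASCII case folding is sufficient.

```rust
use std::collections::HashMap;

/// Row and column addressed by an imported index (illustrative names).
#[derive(Debug, PartialEq, Eq)]
struct ImportSlot {
    module_row: usize,
    import_column: usize,
}

/// Splits an index such as 0x8000000A. Returns None if the high bit is
/// clear, i.e. the index is a regular, non-imported one.
/// Assumed layout: bit 31 = import flag, bits 20..=30 = row, bits 0..=19 = column.
fn decode_import(index: u32) -> Option<ImportSlot> {
    if index & 0x8000_0000 == 0 {
        return None;
    }
    Some(ImportSlot {
        module_row: ((index >> 20) & 0x7ff) as usize,
        import_column: (index & 0x000f_ffff) as usize,
    })
}

/// One row of the imports table: a string table reference plus local ids.
struct ImportRow {
    module_name_ref: u32,
    locals: Vec<u32>,
}

/// Steps 1 and 2: returns (position of the module in the DBI list, local id).
fn resolve_import(
    index: u32,
    rows: &[ImportRow],
    string_table: &HashMap<u32, String>,
    dbi_modules: &[&str],
) -> Option<(usize, u32)> {
    let slot = decode_import(index)?;
    let row = rows.get(slot.module_row)?;
    let local = *row.locals.get(slot.import_column)?;
    let name = string_table.get(&row.module_name_ref)?;
    // The names often differ in case, so compare case-insensitively.
    let module = dbi_modules
        .iter()
        .position(|m| m.eq_ignore_ascii_case(name))?;
    Some((module, local))
}
```

With a single import row whose 11th local id is 0x1132, resolving 0x8000000A in this sketch yields that local id together with the position of the matching module in the DBI list.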

When looking up a type index or an item index via cross module imports, one can assume that an imported item index will again resolve to an item index in the corresponding cross module section. That means that for imported type indexes the local index will never have the high bit set, and for imported item indexes the local index will always have the high bit set.
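
The export side (steps 3 and 4) could look roughly like the sketch below. `GlobalIndex` and `resolve_export` are hypothetical names, and the exports subsection is mocked as a slice of `(local, global)` tuples that is searched linearly for simplicity. The only rule taken from the description above is that the high bit of the local id decides whether the global index points into the TPI or the IPI.

```rust
/// A resolved global index, tagged with the stream it points into.
#[derive(Debug, PartialEq, Eq)]
enum GlobalIndex {
    /// Index into the TPI (type stream).
    Type(u32),
    /// Index into the IPI (id stream).
    Id(u32),
}

/// Looks up `local` in a mocked exports table of (local, global) pairs.
/// Global ids of both streams share one numeric range, so the stream is
/// inferred from the high bit of the local id instead.
fn resolve_export(local: u32, exports: &[(u32, u32)]) -> Option<GlobalIndex> {
    let global = exports
        .iter()
        .find(|(l, _)| *l == local)
        .map(|(_, g)| *g)?;

    if local & 0x8000_0000 != 0 {
        Some(GlobalIndex::Id(global))
    } else {
        Some(GlobalIndex::Type(global))
    }
}
```

In this sketch, two exports can map to the same numeric global index and still resolve to different streams, which mirrors the overlapping ranges mentioned in step 4.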

This PR exposes all interfaces needed to perform such a cross module lookup. However, there is no higher-level abstraction, since we cannot make assumptions about how the DBI stream is iterated or how the user handles the string table.

jan-auer commented 4 years ago

@willglynn updated. CrossModuleExports in the public interface is now owned. I noticed that most of the time, one would want to hold on to the buffer of cross module exports independently of the entire module stream.

Also, all repr(packed) are now gone and replaced with a function that explicitly checks alignment at runtime. The PDB format is designed in a way where all data is aligned anyway, so we can start to verify that.
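
For illustration, a runtime alignment check of this kind might look like the sketch below. It is not the function from this PR: `Pod`, `RawExport`, `AlignedBytes` and `cast_aligned` are hypothetical names, and the record layout is made up for the example. The idea is to reject a byte slice that is too short or misaligned before reinterpreting it, instead of relying on `repr(packed)`.

```rust
use std::mem::{align_of, size_of};

/// Marker for plain-old-data types where every bit pattern is valid.
/// Hypothetical; unsafe to implement because the compiler cannot check this.
unsafe trait Pod: Copy {}

/// Illustrative record: one (local, global) pair with natural alignment.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RawExport {
    local: u32,
    global: u32,
}

unsafe impl Pod for RawExport {}

/// Demo helper: a byte buffer guaranteed to start on a 4-byte boundary.
#[repr(align(4))]
struct AlignedBytes([u8; 12]);

/// Reinterprets the start of `data` as `&T`, or returns None if the slice
/// is too short or its address is not a multiple of T's alignment.
fn cast_aligned<T: Pod>(data: &[u8]) -> Option<&T> {
    if data.len() < size_of::<T>() {
        return None;
    }
    let ptr = data.as_ptr();
    if (ptr as usize) % align_of::<T>() != 0 {
        return None;
    }
    // SAFETY: length and alignment were checked above, and `Pod` promises
    // that any bit pattern is a valid `T`.
    Some(unsafe { &*(ptr as *const T) })
}
```

A slice starting at offset 0 of an `AlignedBytes` buffer passes the check, while the same data viewed from offset 1 is rejected rather than read through a misaligned reference.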

jan-auer commented 4 years ago

I've used it like this for a while now and it seems to work out quite well in terms of the API. Can definitely be improved, but rather than working off a branch we can improve this in a follow-up. Will merge this now once CI is green.