Closed jan-auer closed 5 years ago
This PR now includes a breaking change: `PDB::module_info` returns an `Option`, which is `None` if the module stream points to `0xffffffff`. This is generally treated as a marker for missing streams in PDBs.
I was thinking about introducing a wrapper type `StreamNumber` that encapsulates this, but that would require more refactoring, which exceeds the scope of this PR.
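For illustration only, such a wrapper could look roughly like the sketch below. Nothing here is part of this PR: the name `StreamNumber` comes from the comment above, while the `u32` width, the `new` and `index` methods, and the `MISSING` constant are assumptions made for the example.

```rust
/// Hypothetical wrapper around a raw stream index (not part of this PR).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StreamNumber(u32);

impl StreamNumber {
    /// Sentinel described above as the marker for a missing stream.
    /// The integer width is an assumption of this sketch.
    const MISSING: u32 = 0xffff_ffff;

    /// Wraps a raw value as read from the PDB.
    pub fn new(raw: u32) -> Self {
        StreamNumber(raw)
    }

    /// Returns the usable stream index, or `None` for the sentinel.
    pub fn index(self) -> Option<u32> {
        if self.0 == Self::MISSING {
            None
        } else {
            Some(self.0)
        }
    }
}
```

With something like this, callers would never see the sentinel directly; they would only get `None` from `index()`, which matches what `PDB::module_info` now does by returning an `Option`.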
Thanks!
@willglynn Just don't release this right away please :) I'm planning on adding more features quite soon, and they might bring more breaking changes. It might make sense to batch them up for the next release.
I figured about as much :-)
This PR implements proper handling for C13 line and file information. Disclaimer: This will need some more testing with PDBs from different sources.
To give due credit, this PR is loosely based on initial work by @jrmuizel.
Background
Alright, so we've known for a while that module info streams ("modi") can store either C11 or C13 line information. LLVM claims they have only ever encountered C13 -- although that doesn't mean much since they are also not handling OMAPs so far. I did some digging and found that:
- `CV_SIGNATURE_C13 = 4`, which is the signature used for instance in the module information header.

The implementation in this PR tries to cover C13 only, but prepares for the introduction of C11 code, if necessary -- or even a potential successor if that is ever the case.
C13 data in the module info stream consists of a variable number of sub streams of various kinds. Some of those occur only once, while others exist multiple times. To expose line information, only two of those need to be read, but here's a slightly more complete picture:

- `DEBUG_S_FILECHKSMS`: Contains a list of files referenced by this module along with content checksums. Each entry also contains a reference to the file name in the string table.
- `DEBUG_S_LINES`: Contains mappings of code offsets to line numbers, grouped by file (referring to the entries in the file checksums substream).
- `DEBUG_S_SYMBOLS`: This would contain symbols. I have never seen this substream, however, as symbols were always declared separately before the line substreams. Also, the `u32` that the symbol iterator skips at the beginning turns out to have the C13 signature value of `4`.
- `DEBUG_S_INLINEELINES`: Contains inlining information. I want to implement this soon in a follow-up PR, but that also requires implementing the IPI stream.
- `DEBUG_S_STRINGTABLE`: A module-local version of the string table. This seems to be unused, since all recent PDBs only use the global string table stored in the `/names` stream (often referred to as "Name Table" in the original code).

Implementation
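As a rough illustration of the sub stream layout described in the background section, the sketch below walks a byte buffer of back-to-back sub streams. It is not the code in this PR. It assumes each sub stream starts with a little-endian `u32` kind followed by a little-endian `u32` payload length, and that the kind values match Microsoft's published `cvinfo.h`; the `SubstreamKind`, `Substream` and `substreams` names are made up for the example, and alignment padding is ignored.

```rust
/// Sub stream kinds named above. The numeric values are taken from
/// Microsoft's `cvinfo.h` and are an assumption of this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SubstreamKind {
    Symbols,       // DEBUG_S_SYMBOLS
    Lines,         // DEBUG_S_LINES
    StringTable,   // DEBUG_S_STRINGTABLE
    FileChecksums, // DEBUG_S_FILECHKSMS
    InlineeLines,  // DEBUG_S_INLINEELINES
    Other(u32),    // anything this sketch does not know about
}

impl SubstreamKind {
    fn from_raw(raw: u32) -> Self {
        match raw {
            0xf1 => SubstreamKind::Symbols,
            0xf2 => SubstreamKind::Lines,
            0xf3 => SubstreamKind::StringTable,
            0xf4 => SubstreamKind::FileChecksums,
            0xf6 => SubstreamKind::InlineeLines,
            other => SubstreamKind::Other(other),
        }
    }
}

/// One sub stream: its kind plus the raw, still unparsed payload.
#[derive(Debug)]
pub struct Substream<'a> {
    pub kind: SubstreamKind,
    pub data: &'a [u8],
}

/// Splits `data` into sub streams. Stops at the first header whose
/// declared length runs past the end of the buffer.
pub fn substreams(mut data: &[u8]) -> Vec<Substream<'_>> {
    let mut out = Vec::new();
    while data.len() >= 8 {
        let kind = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
        let len = u32::from_le_bytes([data[4], data[5], data[6], data[7]]) as usize;
        let rest = &data[8..];
        if len > rest.len() {
            break;
        }
        out.push(Substream {
            kind: SubstreamKind::from_raw(kind),
            data: &rest[..len],
        });
        data = &rest[len..];
    }
    out
}
```

To build line information, a caller would filter this list for the `FileChecksums` and `Lines` entries and leave the other kinds untouched.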
- `StringRef` type wraps offsets into the string table, and `StringTable` allows resolving them into string values. This follows the exact same pattern as the `AddressMap` in #17. There are also "convenience" methods `StringRef::to_string_lossy` and `StringRef::to_raw_string`, which are not much more convenient but far more obvious than the getter on `StringTable`.
- `LineProgram` type returned by `ModuleInfo::line_program`. This should immediately be familiar to anyone who has worked with DWARF. It exposes an iterator over line records, and has another function to resolve file names from file indices. It's up to the caller whether and when to look up names.

Caveats
Line and column numbers sometimes seem to contain special values. I have yet to find out what they mean. Specifically, I have observed:

- `start_line` sometimes contains the marker values `0xfeefee` (read: "do not step onto") or `0xf00f00` (read: "do not step into"). I am not sure what to do with these lines: either throw them away or mark them as special in the `LineInfo` struct. Edit: They are special-cased now, and intentionally hidden.
- `end_line` is actually not a real value in the PDB. Instead, the PDB should contain a delta offset relative to the start line. However, I've observed that this is not the case at all; instead, it just contains the end line value truncated to 7 bits. This is now reflected in the computation of that value, but there needs to be some more testing around this.
- `Some(0)`, which I believe can be treated as `None`. Again, this needs some more verification.

Fixes #19
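To make the first two caveats more concrete, here is a minimal sketch of how the marker values and the truncated end line could be handled. It is not the code in this PR: the `LineRange` type and `decode_line_range` function are made up for the example, and the reconstruction assumes the real end line lies fewer than 128 lines after the start line.

```rust
/// Marker values observed in `start_line`, as listed in the caveats.
const MARKER_FEEFEE: u32 = 0x00fe_efee;
const MARKER_F00F00: u32 = 0x00f0_0f00;

/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum LineRange {
    /// `start_line` held a marker value, not a real line number.
    Marker,
    /// A regular range of source lines.
    Lines { start: u32, end: u32 },
}

/// `start_line` is the full start line. `end_low` is the 7-bit field
/// that, per the observation above, holds the end line truncated to
/// 7 bits instead of a delta.
pub fn decode_line_range(start_line: u32, end_low: u32) -> LineRange {
    if start_line == MARKER_FEEFEE || start_line == MARKER_F00F00 {
        return LineRange::Marker;
    }

    // Take the upper bits from the start line and the low 7 bits
    // from the truncated end line.
    let mut end = (start_line & !0x7f) | (end_low & 0x7f);

    // If that lands before the start line, the end line must have
    // crossed a 128-line boundary, so move up by one block.
    if end < start_line {
        end += 0x80;
    }

    LineRange::Lines { start: start_line, end }
}
```

For example, a start line of 126 with a truncated end value of 2 decodes to the range 126 to 130 under these assumptions.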