Open jan-auer opened 2 years ago
The most straight-forward refactor here would be to make the section
field in SectionOffset
and PdbInternalSectionOffset
an Option<usize>
. However, a slightly nicer API could be wrapping the entire struct into an option at every usage site. This is a far more invasive refactor and slightly harder to use internally, though it could be more semantic since it clearly shows that the reference -- to whatever -- is empty.
Would someone have preferences on the API here? To sum it up:
// Option 1:
pub struct PdbInternalSectionOffset {
pub offset: u32,
pub section: Option<usize>,
}
pub struct LabelSymbol<'t> {
pub offset: PdbInternalSectionOffset,
pub flags: ProcedureFlags,
pub name: RawString<'t>,
}
// Option 2:
pub struct PdbInternalSectionOffset {
pub offset: u32,
pub section: usize,
}
pub struct LabelSymbol<'t> {
pub offset: Option<PdbInternalSectionOffset>,
pub flags: ProcedureFlags,
pub name: RawString<'t>,
}
A third option would be… Something that I explained today walking over our SymCache/SourceMapCache binary formats:
How about there wouldn’t be any kind of validation/transformation when reading these (internal)offsets from the file, but rather at the time you try to dereference them?
What does it mean if a PdbInternalSectionOffset
has section: None
? Does this mean "current section" or is it just impossible to dereference that offset?
How about there wouldn’t be any kind of validation/transformation when reading these (internal)offsets from the file, but rather at the time you try to dereference them?
That's the status quo. It can be error-prone at times, because from a u16
it's not clear whether it's zero or one indexed. This was also the primary motivation for https://github.com/willglynn/pdb/pull/93.
Point granted, our structs would be diverging from the physical representation with such a change -- this makes me lean towards Option 1, still. It's a good compromise between a safe interface and staying truthful to the PDB format.
What does it mean if a
PdbInternalSectionOffset
hassection: None
? Does this mean "current section" or is it just impossible to dereference that offset?
The latter, it means the reference points into the void and should not be dereferenced, from what I can see. The PDB code always subtracts 1 to go (in their language) from an Ximod to an imod. Then, they have #define imodNil ((USHORT)(-1))
and bail if they see this value. This strongly suggests that an Ximod of zero means "nil reference".
Neither option is particularly compelling, but I don't have a better suggestion either. I like/dislike both options about equally. I checked the pdb-addr2line code to see if either option would lead to simpler code, but I think it wouldn't make much of a difference. For example, first I thought that Option<PdbInternalSectionOffset>
would make it easier to filter out symbols I don't care about without reaching into their offset.section
property, but I need to access that property anyway because I want to check whether the section is executable (and filter out symbols for non-executable sections). And since I store PdbInternalSectionOffset
in a bunch of places, I thought it might be nice to have the PdbInternalSectionOffset
type guarantee that the section is non-nil, but this guarantee wouldn't give me much; for example, to_rva
conversions can still fail for other reasons.
https://github.com/willglynn/pdb/pull/93 improved module references by using optional 0-based
usize
indexes instead of requiring to subtract 1 to resolve modules. In https://github.com/willglynn/pdb/pull/93#issuecomment-1133669065, @mstange points out that same is true forPdbInternalSectionOffset::section
:The most common usage for section offsets so far was to convert them to RVAs via an OMAP. Internally, this ran the checked sub (going back to https://github.com/willglynn/pdb/issues/87):
https://github.com/willglynn/pdb/blob/e1b86a2c5c8e2c20dc0be9909baecaf1fdf0aa8a/src/omap.rs#L440-L441
However, since there are other direct uses of the section offset, it would be safer to make
section: Option<usize>
as well.