Open vvuk opened 6 months ago
DBIExtraStreams from pdb.extra_streams() is just full of None here.
I have no idea where dia2dump is getting the RVA from above
https://github.com/getsentry/pdb/issues/17#issuecomment-2055784958
DIA uses the section map/OMF segment map (same thing, different names in different sources) to aid the translation. Section headers are not necessary to do the translation and this library simply doesn't implement the address translation this way.
Ah ha! I just made my way there, but was trying to figure out how to use that data. Sounds like I'm on the right track, at least for a limited use case.
Hmm, could maybe use another hint here @JustasMasiulis :) In this PDB, there isn't any omap data. So all I've got is the section_map.
DebugInformation { stream: Stream { source_view: ReadView(421 bytes) }, header:
DBIHeader { signature: 4294967295, version: V70,
age: 1, gs_symbols_stream: StreamIndex(8), internal_version: 36390,
ps_symbols_stream: StreamIndex(9), pdb_dll_build_version: 33135,
symbol_records_stream: StreamIndex(10), pdb_dll_rbld_version: 0,
module_list_size: 140, section_contribution_size: 88,
section_map_size: 84,
file_info_size: 20, type_server_map_size: 0, mfc_type_server_index: 0, debug_header_size: 0, ec_substream_size: 25, flags: 0, machine_type: 0, reserved: 0 }, header_len: 64 }
// debug_header_size is 0, but just in case:
DBIExtraStreams { fpo: StreamIndex(None), exception: StreamIndex(None), fixup: StreamIndex(None), omap_to_src: StreamIndex(None), omap_from_src: StreamIndex(None), section_headers: StreamIndex(None), token_rid_map: StreamIndex(None), xdata: StreamIndex(None), pdata: StreamIndex(None), framedata: StreamIndex(None), original_section_headers: StreamIndex(None) }
If I parse the section_map as an OMFSegMapDesc (roughly from microsoft-pdb), I get this:
sec_count: 4, sec_count_log: 4
OMFSegMapDesc { flags: 269, ovl: 0, group: 0, frame: 1, seg_name_index: 65535, class_name_index: 65535, offset: 0, size: 2576384 }
OMFSegMapDesc { flags: 269, ovl: 0, group: 0, frame: 2, seg_name_index: 65535, class_name_index: 65535, offset: 0, size: 212992 }
OMFSegMapDesc { flags: 269, ovl: 0, group: 0, frame: 3, seg_name_index: 65535, class_name_index: 65535, offset: 0, size: 16384 }
OMFSegMapDesc { flags: 520, ovl: 0, group: 0, frame: 0, seg_name_index: 65535, class_name_index: 65535, offset: 0, size: 4294967295 }
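For anyone following along, here is a minimal sketch of how a dump like the one above could be produced from the raw section map bytes. It assumes the layout implied by the field order in the dump: a 4-byte header (sec_count, sec_count_log) followed by 20-byte little-endian descriptors, which matches section_map_size: 84 for four entries (4 + 4 × 20). The names `SegMapDesc` and `parse_section_map` are made up for illustration and are not part of the pdb crate.

```rust
/// One 20-byte section map entry, fields in the order shown in the dump.
/// Hypothetical type for illustration; not an API of the pdb crate.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SegMapDesc {
    pub flags: u16,
    pub ovl: u16,
    pub group: u16,
    pub frame: u16,
    pub seg_name_index: u16,
    pub class_name_index: u16,
    pub offset: u32,
    pub size: u32,
}

// Little-endian readers; callers must have bounds-checked `buf` already.
fn u16_at(buf: &[u8], pos: usize) -> u16 {
    u16::from_le_bytes([buf[pos], buf[pos + 1]])
}

fn u32_at(buf: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]])
}

/// Parses the section map substream (assumed layout: u16 sec_count,
/// u16 sec_count_log, then sec_count descriptors).
/// Returns (sec_count_log, entries), or None if the buffer is too short.
pub fn parse_section_map(buf: &[u8]) -> Option<(u16, Vec<SegMapDesc>)> {
    const HEADER_LEN: usize = 4;
    const ENTRY_LEN: usize = 20;

    if buf.len() < HEADER_LEN {
        return None;
    }
    let sec_count = u16_at(buf, 0) as usize;
    let sec_count_log = u16_at(buf, 2);
    if buf.len() < HEADER_LEN + sec_count * ENTRY_LEN {
        return None;
    }

    let entries = (0..sec_count)
        .map(|i| {
            let p = HEADER_LEN + i * ENTRY_LEN;
            SegMapDesc {
                flags: u16_at(buf, p),
                ovl: u16_at(buf, p + 2),
                group: u16_at(buf, p + 4),
                frame: u16_at(buf, p + 6),
                seg_name_index: u16_at(buf, p + 8),
                class_name_index: u16_at(buf, p + 10),
                offset: u32_at(buf, p + 12),
                size: u32_at(buf, p + 16),
            }
        })
        .collect();
    Some((sec_count_log, entries))
}
```

Feeding it the 84 bytes of the section map substream should give back the four entries listed above.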
If I parse it as a DbiSectionMap from syzygy I get:
DBISectionMapItem { flags: 13, section_type: 1, unknown_data_1: 0, section_number: 1, unknown_data_2: 4294967295, rva_offset: 0, section_length: 2576384 }
DBISectionMapItem { flags: 13, section_type: 1, unknown_data_1: 0, section_number: 2, unknown_data_2: 4294967295, rva_offset: 0, section_length: 212992 }
DBISectionMapItem { flags: 13, section_type: 1, unknown_data_1: 0, section_number: 3, unknown_data_2: 4294967295, rva_offset: 0, section_length: 16384 }
DBISectionMapItem { flags: 8, section_type: 2, unknown_data_1: 0, section_number: 0, unknown_data_2: 4294967295, rva_offset: 0, section_length: 4294967295 }
DbiSectionMap packs flags/section_type into the 16-bit OMFSegMapDesc flags field, OK. But rva_offset is still 0 here. What am I missing?
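To make the relationship between the two dumps concrete, here is a small sketch of the packing. It assumes the syzygy-style flags byte is the low byte of the 16-bit OMFSegMapDesc flags and section_type is the high byte, which is what the numbers above suggest (269 = 0x010D, 520 = 0x0208). `split_flags` is a hypothetical helper name.

```rust
/// Splits the 16-bit OMFSegMapDesc `flags` value into the two bytes that
/// the syzygy-style view reports separately: (flags, section_type).
/// Assumption: low byte = flags, high byte = section_type.
pub fn split_flags(raw: u16) -> (u8, u8) {
    let flags = (raw & 0x00ff) as u8;
    let section_type = (raw >> 8) as u8;
    (flags, section_type)
}
```

With the values from this PDB, `split_flags(269)` gives `(13, 1)` and `split_flags(520)` gives `(8, 2)`, matching both dumps. In the same way, the two 65535 name indices read as one 32-bit value give the 4294967295 shown as unknown_data_2.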
> But rva_offset is still 0 here.
That is correct and this value is used as it is.
> Hmm, could maybe use another hint here

For your specific PDB the segment frame is always 1, so there will be no section RVA "synthesis" (which is needed when there are no section headers) beyond adding 0x1000 (since there is no OMAP from) and your rva_offset (which is 0) to the symbol.offset.
> For your specific PDB the segment frame is always 1,

Hm, how do I know this? (And apologies, I'm still figuring out all the PDB details, so I'm not 100% familiar with what the "segment frame" is. Is it equivalent to the section here? And thank you for your help!)

> beyond adding 0x1000 (since there is no OMAP from) and your rva_offset (which is 0) to the symbol.offset

OK, so 0x1000 is assumed if there is no other information (plus the rva_offset from the section map)? What about the other two section map entries?

All the public symbols do fit within the first section's range, so it's a moot point here, but where is e.g. 002AA000 coming from for the third entry in the contributions map?
I hacked a version of this into the crate that it turns out I'm actually using in samply (so many pdbs); thanks for your help.
(Also to be clear, happy to do a PR for this upstream version of the crate as well if there's interest)
> > For your specific PDB the segment frame is always 1,
>
> Hm how do I know this? (and apologies, I'm still figuring out all the PDB details, so I'm not 100% familiar what the "segment frame" is -- equivalent to the section here? And thank you for your help!)

OMFSegMapDesc.frame from one of your previous samples. I wasn't clear about this, but I was looking only at symbols and their address translation.
> > beyond adding 0x1000 (since there is no OMAP from) and your rva_offset (which is 0) to the symbol.offset
>
> Ok, so 0x1000 is assumed if there is no other information (+ the rva_offset from the section map)? What about the other two section map entries?
>
> All the public symbols do fit within the first section's range, so moot point here, but where is e.g. 002AA000 coming from for the third entry in the contributions map?

Both the second and third entries refer to frame > 1 and need extra work beyond just adding 0x1000 to synthesize. You need to add the sum of the sizes of the preceding OMFSegMapDesc entries.
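As a rough sketch of that rule (not how DIA or the pdb crate actually implements it): start at 0x1000, add the sizes of all entries that come before the target frame, then add the entry's own rva_offset. This assumes the entries are stored in frame order, as in the dumps above, and does no alignment, because the sizes in this PDB are already multiples of 0x1000; whether rounding is needed in general is not covered in this thread. `SegEntry`, `section_rva` and `symbol_rva` are hypothetical names.

```rust
/// Only the section map fields needed for RVA synthesis (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct SegEntry {
    pub frame: u16,  // 1-based segment number; 0 on the trailing entry
    pub offset: u32, // "rva_offset" in the syzygy naming
    pub size: u32,
}

/// Base RVA assumed for the first section when there are no section headers.
const FIRST_SECTION_RVA: u32 = 0x1000;

/// Synthesizes the start RVA of segment `frame`.
/// Returns None for frame 0, an unknown frame, or on arithmetic overflow.
pub fn section_rva(entries: &[SegEntry], frame: u16) -> Option<u32> {
    if frame == 0 {
        return None;
    }
    let idx = entries.iter().position(|e| e.frame == frame)?;
    let mut rva = FIRST_SECTION_RVA;
    // Add the sizes of every entry that precedes the target one.
    for e in &entries[..idx] {
        rva = rva.checked_add(e.size)?;
    }
    rva.checked_add(entries[idx].offset)
}

/// RVA of a symbol given its segment (frame) and offset within that segment.
pub fn symbol_rva(entries: &[SegEntry], frame: u16, symbol_offset: u32) -> Option<u32> {
    section_rva(entries, frame)?.checked_add(symbol_offset)
}
```

With the four entries from this PDB, frame 1 starts at 0x1000, frame 2 at 0x1000 + 2576384 = 0x276000, and frame 3 at 0x276000 + 212992 = 0x2AA000, which is the 002AA000 asked about above.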
I'm working to improve some profiling tools (samply specifically) that use the pdb crate under the hood. Part of what I need is being able to handle symbols for .NET, specifically symbols from Crossgen2-built ReadyToRun assemblies. I think that's where these come from, anyway: files that end in .ni.pdb, which I think are written here, with some comments about DiaSymReader and others: https://github.com/dotnet/runtime/blob/fc76b1cac3f02cc9729f6682d6850fd7982e9fe5/src/coreclr/tools/aot/ILCompiler.Diagnostics/PdbWriter.cs#L199

Here's an example of this type of PDB, from Microsoft's symbol server: dotnet.ni.dll

Also, just in case: this isn't a Portable PDB, it's a normal PDB, but I think written in a very limited way. It's just the symbol information.
When read by the pdb crate, these PDBs show up as having no section information. That means it can't build an address map, so I end up with no way of translating RVAs to symbols. Section contribution information is there though, e.g. here's dia2dump -x:

These section contributions map directly to the 3 sections in the actual code, dotnet.dll. I have no idea where dia2dump is getting the RVA from above, as it's not in the section contrib information. I do see in PdbWriter.cs some places where sections are written, but I have no idea where that info is going!

In case it's useful, there's one module in this PDB (again from dia2dump):