getsentry / pdb

A parser for Microsoft PDB (Program Database) debugging information
https://docs.rs/pdb/
Apache License 2.0
368 stars 68 forks source link

Parse module references in symbol data #93

Closed jan-auer closed 2 years ago

jan-auer commented 3 years ago

This is an attempt to provide a better API for module references in symbol data. My initial findings were described in https://github.com/willglynn/pdb/issues/89#issuecomment-719528757:

  • Code for resolving a module reference can be seen here.
  • This subtracts 1 via imodForXimod to get a 0-based index. I'm assuming a literal 0 means no reference.
  • openModByImod then runs pmodiForImod.
  • pmodiForImod indexes into rgpmodi.
  • rgpmodi is built sequentially here.

So I think it should be safe to assume that you can take dbi.modules().nth(proc_ref.module - 1).

In this implementation, I'm taking the safe route and parsing this into an Option<usize> instead of erroring, but I'm not sure if that's even necessary. Most likely, a value of 0 is invalid an a reference symbol, so we could also error out instead of making this an Option.

Can someone validate my assumptions?

jan-auer commented 3 years ago

@willglynn since there are breaking changes lined up the next release on master, what do you think about this PR?

jan-auer commented 2 years ago

I'll start looking into this again ahead of another "major" release.

@Swatinem @gabrielesvelto @mstange -- do you have thoughts on this change, since this is being used in some of your projects.

mstange commented 2 years ago

I checked pdb-addr2line to see what it's using module indexes for, and found the following instances:

(I'm actually a bit surprised by these findings - I distinctly recall that I had a bug because I forgot to subtract 1 in some place, but maybe it was in code which I ended up removing.) Edit: I found the thing I remembered, it was a section index, not a module index. sections.get((section_index - 1) as usize)

So it seems that pdb-addr2line is not making use of any of the module_index properties that this PR modifies. Maybe it should be using some of those to get better results? For example, ProcedureReferenceSymbol seems like something that pdb-addr2line might be interested in.

I think this PR looks good. The only suggestion I have is that maybe DBISectionContribution::module should also be changed to a usize for consistency - it's currently a u16 which doesn't make it clear whether you need to subtract 1 or not.

gabrielesvelto commented 2 years ago

I don't think we have other dependencies on this API. dump_syms uses the pdb crate but never uses this particular field.