getsentry / pdb

A parser for Microsoft PDB (Program Database) debugging information
https://docs.rs/pdb/
Apache License 2.0
385 stars 69 forks source link

Use scroll_derive to generate read implementations for simple C structs #10

Closed luser closed 4 years ago

luser commented 7 years ago

While implementing module reading I noticed that there's some very repetitive code in pdb for reading simple C structs, like this example of reading the DBI header: https://github.com/willglynn/pdb/blob/b4cdee36f37e817b3a45001019f46924ccd3d7fb/src/dbi.rs#L168

It feels like we could use serde derive and write our own reader implementation to use with it to avoid listing all the struct fields out there.

willglynn commented 7 years ago

Sounds good to me. I wrote most of the existing code against Rust 1.14, which didn't support custom derives. No reason to avoid them today, though 😄

jrmuizel commented 6 years ago

https://github.com/m4b/scroll_derive does this kind of thing

jasonwhite commented 6 years ago

Wouldn't bincode be a perfect fit for this? It should serialize and deserialize the structs the same way it's currently being done. Anything that requires a complicated deserialization can be customized.

I believe it is also possible to have zero-copy deserialization with bincode now, but memory mapping the PDB seems unlikely at this point.

willglynn commented 6 years ago

Hmm… I'm slightly more comfortable with scroll + scroll_derive.

scroll lets us directly express "read at offset", while bincode reads from zero and expects you to make a sub-slice. That means within a specific context – say pdb::tpi – scroll's offsets could be always be absolute locations within a stream, whereas bincode would require us to remember an offset-offset.

There are enough offsets plumbed around (e.g. TypeFinder and supporting machinery) that I think "read at offset" will be a measurable win.

jrmuizel commented 6 years ago

Yeah, I would recommend against bincode. bincode's really only designed to deserialize things that have been serialized with bincode and not as a general purpose binary reader.

m4b commented 6 years ago

with scroll 0.8 you can now just add scroll to your Cargo.toml deps with feature = "derive", and simply do:

#[macro_use]
extern crate scroll;

instead of

extern crate scroll;
#[macro_use]
extern crate scroll_derive;

to get the derive impls (similar to failure). Let me know if you run into any issues!

It looks like i totally borked version 0.8 by having the derive feature depend on std and not be default, so you need to pass both feature flags, since as soon as you pass one feature flag, they're all disabled, which is annoying as hell. nevermind, ignore me right now ;)

jan-auer commented 4 years ago

We decided not to use scroll_derive in #70 since this significantly improves compilation times. However, scroll is now used in many cases for parsing structures and the remaining cases can be refactored subsequently.

I'm closing this issue to triage, but please feel free to reopen.

luser commented 4 years ago

I totally understand the compilation time argument but that's a bummer! I think custom derives make writing this kind of code so much less error-prone. I think TryFromCtx is probably simple enough that you could write a standard macro that wrapped struct definitions and generated the implementation alongside it if someone wanted to pursue that route.

jan-auer commented 4 years ago

That's right. Although, even with the derive, there were so many cases where we had to write custom TryFromCtx implementations since the fields are interleaved, have padding, etc.

It might be a nice experiment, but if the macro turns out too obscure, probably custom derives might be easier to understand and maintain. WDYT?