jam1garner / binrw

A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
https://binrw.rs
MIT License
620 stars 36 forks source link

Feature Request: Magic Numbers at end of File/Struct #274

Open apps4uco opened 4 months ago

apps4uco commented 4 months ago

Hi, some formats, eg Parquet have a Magic Number at the end of the file, also others have them at the end of headers eg Zstd Skippable Frames.

Although it is possible to define a struct like this

...
  #[brw(magic = 0x8F92EAB1_u32)]
  pub dummy: (),
}

however, it feels like a bit clunky.

Would it be possible to either support the following

...
#[brw(magic = 0x8F92EAB1_u32)]
}

or (maybe better for documentation purposes)

#[brw(end_magic = 0x184D2A5E_u32, little)]
pub struct MyStruct {

This would then allow a format to have start and end magic numbers.

#[brw(magic = 0x184D2A5E_u32,end_magic = 0x184D2A5E_u32, little)]
pub struct MyStruct {

Thanks for the excellent crate.

csnover commented 4 months ago

Hi, thanks for your report!

Putting an annotation right before the closing } of the struct is not possible since that is not valid Rust struct syntax. Adding another directive like end_magic (I would call that magic_after to match existing directives) would be possible but I don’t know how I feel about it—the benefit feels pretty marginal given the rarity of such types and the fact that a terminator field solves the problem whilst keeping things in order in the source code. The amount of manpower required to implement and maintain a new directive like this is not very big, but it is non-zero, and it feels like a thing where that time and energy could be put to use in better ways.

The only time I have encountered magic numbers like this is in cases where they are the end of a file and it sounds like it is the case here too. I am not sure if there is a confusion about zstd format or just a typo in this report, but just to be clear based on my reading of the spec, the only end-of-struct magic is the seek table footer at the end of the file, which is used to identify a zstd file with a seek table; magic in skippable frames is the first field of the frame header so is already fully supported.

If it feels clunky mainly because then you have a dummy field you have to fill out at runtime, you can mark it as a temporary field so it doesn’t actually make it to the final emitted struct. Otherwise, this asymmetry for end-of-struct stuff exists in other areas too (e.g. skipping over the end of a variably-sized struct) and my experience has been that using a terminator field feels bad until you just end up deciding on some convention like always using the field _end: () or __: () and then it is, like, fine.