jam1garner / binrw

A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
https://binrw.rs
MIT License
539 stars 35 forks source link

Writing enums: pre_assert? #272

Open bilts opened 2 weeks ago

bilts commented 2 weeks ago

I have code like this:

#[derive(BinRead, BinWrite, Debug)]
#[brw(import(message_type: u8))]
enum MessageData {
  #[br(pre_assert(message_type ==  0x0u8))] Nil {
  },
  ...
}

bw / brw don't allow pre_assert so they can't serialize this. Is there a way to reverse parsing an enum like this? Is it feasible to implement pre_assert for brw?... I don't think using assert is correct.

csnover commented 1 week ago

Hi, thanks for your report!

There are a couple of things you could do now:

  1. Encapsulate the MessageData inside a struct that has the message_type as a field so they are both written out at the same time.
  2. Add #[bw(magic = 0u8)] to each variant to write the correct magic value for the variant, optionally with an Unknown(#[bw(calc(message_type))] u8) final variant.

Another option might be to map to a tuple at the top level but that would require #39 to be implemented.

I am not sure how to make this API less confusing. Are you passing in message_type instead of using magic because a single variant might be matched by multiple different magic values, or for performance, or for some other reason?

bilts commented 1 week ago

I'll try refactoring a bit based on the suggestions. I hadn't been using magic because there is tricky common header stuff that goes between message_type and the message (MessageData):

  1. message_type
  2. A u16 indicating how many bytes to read / pad from the message
  3. A u8 of flags. If bit 1 is set, then what gets read is a pointer to a message rather than a message itself, a completely different format
  4. Optionally, based on flags kept elsewhere, a sort index
  5. message

Items 2 and 3 make it hard to see how to parse with binrw just using magic.

It's this mess, for reference: https://docs.hdfgroup.org/hdf5/v1_14/_f_m_t3.html#V2ObjectHeaderPrefix

kitlith commented 1 week ago

Sounds like you want approach 1 that csnover laid out, then, and you're using pre_assert + import as you should be for that approach. You might want a helper method on MessageData to return the message_type for a given value. At this point you should not be thinking of serializing message_type as the responsibility of MessageData, but of whatever structure/set of functions that contains MessageData and is passing the message_type argument to MessageData. (in that vein, you should probably use #[br(import(...))] instead of #[brw(import(...))], because MessageData does not need external knowledge to know what it's message type is when writing.)

Good luck with the messy format! It's worth noting here that binrw has unfortunately not really gained good support for writing offsets yet, so that might be a bit of a struggle. (see #4)