jam1garner / binrw

A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
https://binrw.rs
MIT License
620 stars 36 forks source link

Dealing with pre-mapped values in #[bw] #279

Closed scarletcafe closed 3 months ago

scarletcafe commented 3 months ago

I've run into a problem a few times while writing a protocol codec, and while I've been able to handle it via various workarounds, it's come up enough times that it seems like it would be worth bringing up to see if there could be a better solution for this.

Suppose I have code something like this:

pub enum Mode {
    Regular = 1,
    Extended = 2,
    Special = 3
}

impl Mode {
    // The actual implementations of this conversion are not important, but assume they are non-trivial, i.e., they cannot be replaced by a `#[brw(repr(u32))]` or so on, and that it would be burdensome to run them more than necessary.
    // In many cases, rather than u32, this would be some other struct or say a Vec, but it's u32 here for the sake of making the example easier to understand.
    pub fn from_u32(value: u32) -> Result<Self, _> { ... }
    pub fn to_u32(&self) -> Result<u32, _> { ... }
}

#[binrw] 
pub enum Packet {
    SomethingPacket {
        value: u8,
        #[br(try_map=|v: u32| Mode::from_u32(v))]
        #[bw(try_map=|v: &Mode| v.to_u32())]
        mode: Mode,
        #[brw(if(matches!(mode, Mode::Special)))]  // Here is where the issue is
        special_data: u8,
    },
    ... // Other packet types not included for brevity
}

This code, as written, doesn't work because while mode is Mode in the br half of the if, it's u32 in the bw half because of the mapping applied. Workaround ideas include:

Move the map implementations to Mode

If Mode were a struct, we might be able to do the following:

#[binrw]
#[br(try_map=|v: u32| Mode::from_u32(v))]
#[bw(try_map=|v: &Mode| v.to_u32())]
pub struct Mode {
    ...
}

However, when Mode is a unit enum it does not work because:

BinRead on unit-like enums requires either `#[br(repr = ...)]` on the enum or `#[br(magic = ...)]` on at least one variant
BinWrite on unit-like enums requires either `#[bw(repr = ...)]` on the enum or `#[bw(magic = ...)]` on at least one variant

Just convert it again

This approach, while it again does work, is sort of awkward and undesirable for a few reasons.

        #[br(if(matches!(mode, Mode::Special)))]
        #[bw(if(matches!(mode.from_u32().unwrap(), Mode::Special)))]
        special_data: u8,

Because, in our example, our from_u32 is non-trivial, we would prefer to not have to run it when serializing. In addition, while in theory the conversion should never fail (providing that from_u32 and to_u32 do a valid roundtrip), there is no try_if, so either we have to risk a panic as above, or do the following, which would instead suppress errors and potentially produce an invalid serialization.

        #[br(if(matches!(mode, Mode::Special)))]
        #[bw(if(matches!(mode.from_u32(), Ok(Mode::Special))))]
        special_data: u8,

Use self instead of the field

If packet were a struct, using self to get the field would make things easy. But because Packet is an enum, trying to match the mode field of SomethingPacket instead requires matching self in its entirety. It works, but it's kind of awkward.

        #[br(if(matches!(mode, Mode::Special)))]
        #[bw(if(matches!(self, Packet::SomethingPacket { mode: Mode::Special, .. })))]
        special_data: u8,

This approach is currently my preferred since it's the one that seems to have the least compromises and is most likely to get rustc to optimize it away.

Ideally there would be some easy way to access the pre-mapped fields here. There are a number of situations where you would want the post-mapped values also, so I'm not sure how the implemented version would look, but at the moment a lot of the workarounds seem to be less than ideal.

ckrenslehner commented 3 months ago

You can always decide to write your own parsers and writers. There you can both have the pre-mapped version and post-mapped version in separate variables.

I think this is the easiest way to get really customized parsing/writing

See https://docs.rs/binrw/latest/binrw/docs/attribute/index.html#custom-parserswriters

If you need some special argument from a parent structure you can simply add the argument to your function signature and pass it down via args

https://docs.rs/binrw/latest/binrw/docs/attribute/index.html#arguments