jam1garner / binrw

A Rust crate for helping parse and rebuild binary data using ✨macro magic✨.
https://binrw.rs
MIT License
545 stars 35 forks source link

modular_bitfield attribute ignores endianness for types like u16 #255

Closed marxin closed 2 months ago

marxin commented 2 months ago

I noticed that while parsing a DNS message header:

#[derive(BitfieldSpecifier, Debug)]
#[bits = 1]
enum QueryResponseIndicator {
    Query,
    Response,
}

#[derive(BitfieldSpecifier, Debug)]
#[bits = 1]
enum AuthoritativeAnswer {
    Other,
    Owns,
}

#[derive(BitfieldSpecifier, Debug)]
#[bits = 1]
enum Truncation {
    No,
    LargerThan512,
}

#[derive(BitfieldSpecifier, Debug)]
#[bits = 1]
enum RecursionDesired {
    NonRecursive,
    Recursive,
}

#[derive(BitfieldSpecifier, Debug)]
#[bits = 1]
enum RecursionAvailable {
    NotAvailable,
    Available,
}

#[bitfield]
#[derive(BinRead, Debug)]
#[br(map = Self::from_bytes)]
pub struct DnsHeader {
    /// Packet Identifier (ID)A random ID assigned to query packets. Response packets must reply with the same ID.
    pub id: u16,

    /// Recursion Desired (RD) Sender sets this to 1 if the server should recursively resolve this query, 0 otherwise.
    pub recursion_desired: RecursionDesired,
    /// Truncation (TC) 1 if the message is larger than 512 bytes. Always 0 in UDP responses.
    pub truncation: Truncation,
    /// Authoritative Answer (AA) 1 if the responding server "owns" the domain queried, i.e., it's authoritative.
    pub authoritative: AuthoritativeAnswer,
    /// Operation Code (OPCODE) Specifies the kind of query in a message.
    pub opcode: B4,
    /// Query/Response Indicator (QR). 1 for a reply packet, 0 for a question packet.
    pub qr: QueryResponseIndicator,

    /// Response Code (RCODE) Response code indicating the status of the response.
    pub response: B4,
    /// Reserved (Z) Used by DNSSEC queries. At inception, it was reserved for future use.
    #[skip]
    __: B3,
    /// Recursion Available (RA) Server sets this to 1 to indicate that recursion is available.
    pub recursion_available: RecursionAvailable,

    /// Question Count (QDCOUNT) Number of questions in the Question section.
    pub question_count: u16,

    /// Answer Record Count (ANCOUNT) Number of records in the Answer section.
    pub answer_count: u16,

    /// Authority Record Count (NSCOUNT)    16 bits Number of records in the Authority section.
    pub nscount: u16,

    /// Additional Record Count (ARCOUNT)   16 bits Number of records in the Additional section.
    pub arcount: u16,
}

It seems the endianess of the natural multi-byte integral types is ignored:

#[bitfield]
#[derive(BinRead, Debug)]
#[br(map = Self::from_bytes)]
pub struct StructWithBitfield {
    pub n: u16,
}

#[derive(BinRead, Debug)]
pub struct Struct {
    pub n: u16,
}

#[test]
    fn bitfield_integer_types_endianess() {
        let blob = b"\x00\x01";
        let be1 = Cursor::new(blob).read_be::<Struct>().unwrap();
        let be2 = Cursor::new(blob).read_be::<StructWithBitfield>().unwrap();

        let le1 = Cursor::new(blob).read_le::<Struct>().unwrap();
        let le2 = Cursor::new(blob).read_le::<StructWithBitfield>().unwrap();
        println!("big endian: {be1:?} {be2:?}");
        println!("little endian: {le1:?} {le2:?}");
        assert_eq!(be1.n, 1);
        assert_eq!(le1.n, 256);
        assert_eq!(le2.n(), 256);

        assert_eq!(be2.n(), 1);
    }

Fails with:

---- tests::bitfield_integer_types_endianess stdout ----
big endian: Struct { n: 1 } StructWithBitfield { n: 256 }
little endian: Struct { n: 256 } StructWithBitfield { n: 256 }
thread 'tests::bitfield_integer_types_endianess' panicked at src/lib.rs:71:9:
assertion `left == right` failed
  left: 256
 right: 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Apparently modular_bitfield does not have any handling of endianness and that's why I face the issue. Can you please do something about it? Thanks for help!

kitlith commented 2 months ago

This is not a bug in binrw. modular_bitfield, which is processing the struct first, transforms the struct to using an array of bytes internally. Once it's an array of bytes, any endianess information is opaque to binrw, and cannot be acted upon.

Your available options:

  1. Write a custom binread/binwrite implementation that does the right thing.
  2. Use a different bitfield crate that produces bits/bytes in the right order for your usecase.
    • There is a reason why there are multiple forks of modular-bitfield in particular available on crates.io
  3. Something in-between.
    • For instance, someone could write a binread/binwrite implementation for the types in the arbitrary_int crate, allowing you to use bilge or bitbybit.
marxin commented 2 months ago

Thanks for the clarification. Yes, I understand it's related more to modular_bitfield. Anyway, my solution is that I encapsulated the bitfields into structs that don't contain any basic integral types (u16, ...). Would you appreciate if I made a PR where I would document the current behavior when it comes to multi-byte integral types within #[bitfield] struct?