sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0
1.14k stars 55 forks source link

Right way to imply a checksum #302

Open shirok1 opened 1 year ago

shirok1 commented 1 year ago

Hello, I've been porting a network protocol to Rust using deku recently, replacing handwritten macros. In this protocol, there are two checksum fields, one is a CRC16 for header and one is a CRC32 in the tail. I tried using custom writer with temp tag, result in the field being completely ignored during writing. I also tried using bits = "16" or pad_bits_after = "16" with a () type, but the corresponding bytes are not jumped. Is there a best practice to imply such thing?

sharksforarms commented 1 year ago

Are you able to provide an example of what you have and what you expect? I don't want to make assumptions on your format :)

shirok1 commented 1 year ago

Actually the protocol definition is available on GitHub: https://github.com/Livox-SDK/Livox-SDK/wiki/Livox-SDK-Communication-Protocol#21-frame-format

My current code is like this:

#[deku_derive(DekuRead, DekuWrite)]
#[derive(Debug)]
#[deku(magic = b"\xAA")]
pub struct DekuControlFrame {
    // ///    Starting Byte, Fixed to be 0xAA
    // #[deku(temp)]
    // #[deku(assert_eq = "0xAA")]
    // magic: u8,

    /// Protocol Version, 1 for The Current Version
    pub version: u8,

    /// Length of The Frame, Max Value: 1400
    // #[deku(temp)]
    length: u16,

    /// Command Type:
    /// 0x00: CMD
    /// 0x01: ACK
    /// 0x02: MSG
    // #[deku(temp)]
    cmd_type: u8,

    /// Frame Sequence Number
    pub seq_num: u16,

    /// Frame Header Checksum
    // #[deku(temp)]
    #[deku(
    // bits = "16",
    writer = "checksum_write_16(deku::output)"
    )]
    crc_16: u16,

    /// Payload Data
    #[deku(ctx = "*cmd_type")]
    // #[deku(count = "cmd_type")]
    pub data: FrameData,

    /// Whole Frame Checksum
    // #[deku(temp)]
    #[deku(
    // bits = "32",
    writer = "checksum_write_32(deku::output)"
    )]
    crc_32: u32,
}

fn checksum_write_16(
    output: &mut BitVec<u8, Msb0>,
) -> Result<(), DekuError> {
    // writeln!()
    // dbg!(&output);
    CRC16.checksum(
        output.as_raw_slice()
    ).write(output, ())
}

fn checksum_write_32(
    output: &mut BitVec<u8, Msb0>,
) -> Result<(), DekuError> {
    CRC32.checksum(
        output.as_raw_slice()
    ).write(output, ())
}
sharksforarms commented 1 year ago

This looks okay to me...

but the corresponding bytes are not jumped

what do you mean by this?

Here's an example, it may help you. If you can provide a minimal reproduction that I can run I may be able to help debug.

use deku::bitvec::{BitVec, Msb0};
use deku::prelude::*;
use std::convert::TryInto;

#[derive(Debug, PartialEq, DekuRead, DekuWrite)]
struct DekuTest {
    a: u8,
    b: u8,

    #[deku(writer = "checksum1(deku::output)")]
    sum1: u16,

    #[deku(count = "2")]
    data: Vec<u8>,

    #[deku(writer = "checksum2(deku::output)")]
    sum2: u32,
}

fn checksum1(output: &mut BitVec<u8, Msb0>) -> Result<(), DekuError> {
    let sum: u16 = output.as_raw_slice().iter().map(|v| *v as u16).sum();
    sum.write(output, ())
}

fn checksum2(output: &mut BitVec<u8, Msb0>) -> Result<(), DekuError> {
    let sum: u32 = output.as_raw_slice().iter().map(|v| *v as u32).sum();
    sum.write(output, ())
}

fn main() {
    let test_data: &[u8] = [
        1, 2, // a and b as u8
        3, 0, // sum1 as u16 little endian (1+2 == 3)
        1, 2, // data as u8
        9, 0, 0, 0 // sum2 as u32 little endian (1+2+3+1+2 == 9)
    ].as_ref();

    let (_rest, ret_read) = DekuTest::from_bytes((test_data, 0)).unwrap();

    assert_eq!(
        ret_read,
        DekuTest {
            a: 1,
            b: 2,
            sum1: 3,
            data: vec![1, 2],
            sum2: 9
        }
    );

    let ret_write: Vec<u8> = ret_read.try_into().unwrap();
    assert_eq!(test_data.to_vec(), ret_write);
}
shirok1 commented 1 year ago

Well, I want to hide the checksum or make it unchangeable (by using () unit type as field type) to point out that checksum should be automatically generated, while making every field in this struct public (mainly for struct initializer)

sharksforarms commented 1 year ago

Ah interesting use case! Thanks for explaining furthur.

I created a branch with some changes to be able to use temp attribute with a custom writer

You can try it out by changing the dependency:

deku = { git = "https://github.com/sharksforarms/deku.git", branch = "sharksforarms/temp-write" }

Here is the new example I came up with. Let me know if this matches your use-case.

use deku::bitvec::{BitVec, Msb0};
use deku::prelude::*;
use std::convert::TryInto;

#[deku_derive(DekuRead, DekuWrite)]
#[derive(Debug, PartialEq)]
struct DekuTest {
    a: u8,
    b: u8,

    #[deku(writer = "checksum1(deku::output)")]
    #[deku(temp)]
    sum1: (),

    #[deku(count = "2")]
    data: Vec<u8>,

    #[deku(writer = "checksum2(deku::output)")]
    #[deku(temp)]
    sum2: (),
}

fn checksum1(output: &mut BitVec<u8, Msb0>) -> Result<(), DekuError> {
    let sum: u16 = output.as_raw_slice().iter().map(|v| *v as u16).sum();
    sum.write(output, ())
}

fn checksum2(output: &mut BitVec<u8, Msb0>) -> Result<(), DekuError> {
    let sum: u32 = output.as_raw_slice().iter().map(|v| *v as u32).sum();
    sum.write(output, ())
}

fn main() {
    let test_data_read: &[u8] = [
        1, 2, // a and b as u8
        1, 2, // data as u8
    ]
    .as_ref();

    let test_data_write: &[u8] = [
        1, 2, // a and b as u8
        3, 0, // sum1 as u16 little endian (1+2 == 3)
        1, 2, // data as u8
        9, 0, 0, 0, // sum2 as u32 little endian (1+2+3+1+2 == 9)
    ]
    .as_ref();

    let (_rest, ret_read) = DekuTest::from_bytes((test_data_read, 0)).unwrap();

    assert_eq!(
        ret_read,
        DekuTest {
            a: 1,
            b: 2,
            data: vec![1, 2],
        }
    );

    let ret_write: Vec<u8> = ret_read.try_into().unwrap();
    assert_eq!(test_data_write.to_vec(), ret_write);
}
shirok1 commented 1 year ago

Wow, thanks for the amazing job! It works just like how I imagined.

By the way, is it possible to make context parameter tempable too? If so, I can make my cmd_type field #[deku(temp)] too, which is used as an ID in FrameData enum of [CMD, ACK, MSG].

PPS. What about the length field, which presents the whole length of the encoded bytes, can it be implied using deku too? I currently directly write it into the buffer after the encode process.

sharksforarms commented 1 year ago

Wow, thanks for the amazing job! It works just like how I imagined.

By the way, is it possible to make context parameter tempable too? If so, I can make my cmd_type field #[deku(temp)] too, which is used as an ID in FrameData enum of [CMD, ACK, MSG].

PPS. What about the length field, which presents the whole length of the encoded bytes, can it be implied using deku too? I currently directly write it into the buffer after the encode process.

The thing with temp is that we remove the field entirely from the AST at compilation. That means we cannot store the result, such as cmd_type and length. Maybe the update attribute better suites your usecase for these?

shirok1 commented 1 year ago

Hello, I wonder if sharksforarms/temp-write is going to be merged to main?

wcampbell0x2a commented 1 year ago

Hello, I wonder if sharksforarms/temp-write is going to be merged to main?

Did this MR fix your issues? https://github.com/sharksforarms/deku/pull/322

shirok1 commented 1 year ago

Sorry, I haven't been around this project until now :cry:

And I found a bug when using it with context(#[deku(ctx}]). In the generated DekuContainerWrite::to_bits, the rest properties are references and passed to DekuWrite::write as references, temp properties are values, and use & inline in DekuWrite::write's arg list. However, ctx won't distinguish them, causing either a "type XXX cannot be dereferenced" (if writing * in ctx) or "mismatched types, expected XXX, found &XXX" (if not).

My code:

#[deku(temp, temp_value = "self.data.deku_id()?")]
pub cmd_id: u16,

#[deku(ctx = "*cmd_id, *cmd_type")]
pub data: command::Data,

Expands to:

image

While I could simply change type in ctx to reference, I don't think a reference to such a small number is something elegant... :sleeping:

sharksforarms commented 1 year ago

Would this be a minimal reproduction?

use deku::prelude::*;
use std::convert::TryInto;

#[deku_derive(DekuRead, DekuWrite)]
#[derive(Debug, PartialEq)]
#[deku(ctx = "field1: u8, field2: u8")]
struct Child {
}

#[deku_derive(DekuRead, DekuWrite)]
#[derive(Debug, PartialEq)]
struct Parent {
    pub field1: u8,

    #[deku(temp, temp_value = "self.field1")]
    pub field2: u8,

    #[deku(ctx = "*field1, *field2")]
    pub field3: Child
}

fn main() {
    let value = Parent {
        field1: 0x01,
        field3: Child {}
    };
    let value: Vec<u8> = value.try_into().unwrap();
    assert_eq!(vec![0x01], value);
}

I don't have time to work on this at the moment, cc: @SneakyBerry maybe they have ideas

SneakyBerry commented 1 year ago

@sharksforarms thanks for mentioning and minimal reproduction. I'll do some research and provide MR.

shirok1 commented 1 year ago

@SneakyBerry Hello, have you get any progress there?

SneakyBerry commented 1 year ago

@shirok1 Hi! Check this. Will it work for you?

zhang-wenchao commented 4 days ago

I faced the same issue - it's a very common problem. The packet has a start bit, followed by data, CRC, and an end bit. Having to strip out start/end bits and CRC, extract the data, and then verify it is quite cumbersome.