bincode-org / bincode

A binary encoder / decoder implementation in Rust.
MIT License
2.63k stars 265 forks

Support retrieving the bytes written from encode #703

Closed Sammo98 closed 5 months ago

Sammo98 commented 5 months ago

I am writing an encoder for RabbitMQ; one of the implementations (trivialised) looks like this:

impl Encode for Table {
    fn encode<E: bincode::enc::Encoder>(
        &self,
        encoder: &mut E,
    ) -> Result<(), bincode::error::EncodeError> {
        for (key, value) in self.iter() {
            String::encode(key, encoder)?;
            match value {
                Field::SS(s) => {
                    char::encode(&'s', encoder)?;
                    String::encode(s, encoder)?;
                }
                Field::T(t) => {
                    char::encode(&'F', encoder)?; // 'F' tags a nested table, as in to_bytes further down
                    Table::encode(t, encoder)?;
                }
            }
        }
        Ok(())
    }
}

However before encoding the keys and values, I need to encode the total number of bytes that the Table will be.

Apologies if there is already a way to do this!

VictorKoenders commented 5 months ago

You can try using SizeWriter: https://docs.rs/bincode/2.0.0-rc.3/bincode/enc/write/struct.SizeWriter.html

I don't see a better way of doing this in bincode. However, that means most of your data structures will be iterated over multiple times.
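
To make the trade-off concrete, here is a minimal std-only sketch of the two-pass idea: count the bytes first, then write the length and the body. The `Sink` trait, `CountingSink`, and the `Table`/`Field` definitions are hypothetical stand-ins invented for this illustration (they are not bincode's API), and key and string lengths are assumed to fit in a `u8`.

```rust
/// Hypothetical byte sink, standing in for a writer abstraction.
trait Sink {
    fn put(&mut self, bytes: &[u8]);
}

/// Counts bytes instead of storing them (the idea behind a size writer).
#[derive(Default)]
struct CountingSink {
    bytes_written: usize,
}

impl Sink for CountingSink {
    fn put(&mut self, bytes: &[u8]) {
        self.bytes_written += bytes.len();
    }
}

impl Sink for Vec<u8> {
    fn put(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
}

// Assumed shapes for the types in the thread.
enum Field {
    SS(String),
    T(Table),
}

struct Table(Vec<(String, Field)>);

/// Writes the key/value pairs without the leading length.
/// Assumption: every key and string is at most 255 bytes long.
fn write_body<S: Sink>(table: &Table, sink: &mut S) {
    for (key, value) in &table.0 {
        sink.put(&[key.len() as u8]);
        sink.put(key.as_bytes());
        match value {
            Field::SS(s) => {
                sink.put(b"s");
                sink.put(&[s.len() as u8]);
                sink.put(s.as_bytes());
            }
            Field::T(t) => {
                sink.put(b"F");
                write_table(t, sink);
            }
        }
    }
}

/// Pass 1 counts the body; pass 2 writes the length, then the body.
fn write_table<S: Sink>(table: &Table, sink: &mut S) {
    let mut counter = CountingSink::default();
    write_body(table, &mut counter);
    sink.put(&(counter.bytes_written as u32).to_be_bytes());
    write_body(table, sink);
}
```

Every nesting level repeats the counting pass over everything below it, which is the repeated iteration mentioned above.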

Sammo98 commented 5 months ago

@VictorKoenders Thanks for the pointer, I gave that a go but felt it didn't quite fit my use case, although ultimately that's on me for trying to force bincode to suit my needs.

I ended up going with a more manual approach (a bit hacky and WIP for now):

impl Table {
    /// Serialises the table as a big-endian u32 body length followed by the body.
    fn to_bytes(&self) -> Vec<u8> {
        let mut bytes: Vec<u8> = Vec::new();
        for (key, value) in self.iter() {
            // Note: `as u8` silently truncates lengths above 255.
            bytes.push(key.len() as u8);
            bytes.extend_from_slice(key.as_bytes());

            match value {
                Field::SS(s) => {
                    bytes.push(b's');
                    bytes.push(s.len() as u8);
                    bytes.extend_from_slice(s.as_bytes());
                }
                Field::T(t) => {
                    bytes.push(b'F');
                    // A nested table carries its own length prefix.
                    bytes.extend_from_slice(&t.to_bytes());
                }
            }
        }
        // Prepend the body length to the body.
        let mut length_bytes = (bytes.len() as u32).to_be_bytes().to_vec();
        length_bytes.extend_from_slice(&bytes);
        length_bytes
    }
}

impl Encode for Table {
    fn encode<E: bincode::enc::Encoder>(
        &self,
        encoder: &mut E,
    ) -> Result<(), bincode::error::EncodeError> {
        let bytes = self.to_bytes();
        for item in bytes.iter() {
            item.encode(encoder)?;
        }
        Ok(())
    }
}
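
A possible variation on the manual approach, sketched here under assumptions: reserve four bytes for the length, write the body straight into the same buffer, then patch the length in afterwards. This avoids a temporary `Vec` per nested table. The `Table` and `Field` shapes are guesses at the real types, and the function is hypothetical; oversized lengths are reported as errors instead of being truncated.

```rust
use std::num::TryFromIntError;

// Assumed shapes for the types in the thread.
enum Field {
    SS(String),
    T(Table),
}

struct Table(Vec<(String, Field)>);

/// Appends `table` to `out`, prefixed by its body length as a big-endian u32.
fn write_table(table: &Table, out: &mut Vec<u8>) -> Result<(), TryFromIntError> {
    let len_pos = out.len();
    out.extend_from_slice(&[0u8; 4]); // placeholder, patched below
    let body_start = out.len();

    for (key, value) in &table.0 {
        out.push(u8::try_from(key.len())?);
        out.extend_from_slice(key.as_bytes());
        match value {
            Field::SS(s) => {
                out.push(b's');
                out.push(u8::try_from(s.len())?);
                out.extend_from_slice(s.as_bytes());
            }
            Field::T(t) => {
                out.push(b'F');
                write_table(t, out)?; // nested table patches its own prefix
            }
        }
    }

    // Overwrite the placeholder now that the body length is known.
    let body_len = u32::try_from(out.len() - body_start)?;
    out[len_pos..body_start].copy_from_slice(&body_len.to_be_bytes());
    Ok(())
}
```

On error the buffer is left with a partially written table, so a caller would want to truncate it back to its previous length or discard it.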

One thing I did find on the decoding side is that I often need to take n bytes, where n is dynamic and depends on prior decoding, e.g.:

let key_length = u8::decode(decoder)?;

let mut string_vec = vec![];
for _ in 0..key_length {
    string_vec.push(u8::decode(decoder)?);
}
let key = String::from_utf8(string_vec).unwrap(); // panics on invalid UTF-8

I wonder what the general thoughts of the maintainers would be on adding an additional function to the Decoder trait, decode_n_bytes:

    fn decode_n_bytes(&mut self, n: usize) -> Result<Vec<u8>, DecodeError>;

This would make implementing custom decoders a bit easier. Happy to look into this myself and try to raise a PR.
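
For illustration only, this is roughly what "take n bytes, where n comes from earlier input" looks like over a plain byte slice, outside bincode entirely. `take_n` and `decode_key` are hypothetical helpers, not part of any bincode trait, and invalid UTF-8 is reported as `None` instead of panicking.

```rust
/// Hypothetical helper: splits the first `n` bytes off `input`,
/// or returns `None` if fewer than `n` bytes remain.
fn take_n<'a>(input: &mut &'a [u8], n: usize) -> Option<&'a [u8]> {
    if input.len() < n {
        return None;
    }
    let (head, tail) = input.split_at(n);
    *input = tail; // advance past the bytes just taken
    Some(head)
}

/// Decodes a key stored as a u8 length followed by that many UTF-8 bytes.
fn decode_key(input: &mut &[u8]) -> Option<String> {
    let len = usize::from(take_n(input, 1)?[0]);
    let bytes = take_n(input, len)?;
    String::from_utf8(bytes.to_vec()).ok()
}
```

Borrowing the bytes from the input means the only allocation is the final `String`.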

VictorKoenders commented 5 months ago

Unfortunately we cannot provide a method that returns Result<Vec<_>, _>, as Vec is not available on no_std platforms without an allocator, and having a method that's only conditionally available sounds annoying.

We're considering switching to an API more similar to https://github.com/rust-lang/rust/issues/78485, but that has not stabilized yet.
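
One allocation-free shape for this, sketched under assumptions, is to have the caller supply the buffer, so the reading side never needs `Vec`. `SliceCursor` and `UnexpectedEnd` are hypothetical names made up for this example. bincode 2's `Reader` trait appears to take a similar shape (a `read` method that fills a `&mut [u8]`), though that is worth checking against the docs for the version in use.

```rust
/// Hypothetical cursor over a byte slice; not a bincode type.
struct SliceCursor<'a> {
    remaining: &'a [u8],
}

/// Hypothetical error for running out of input.
#[derive(Debug, PartialEq)]
struct UnexpectedEnd;

impl<'a> SliceCursor<'a> {
    /// Fills `buf` completely, or fails without consuming anything.
    fn read_into(&mut self, buf: &mut [u8]) -> Result<(), UnexpectedEnd> {
        if self.remaining.len() < buf.len() {
            return Err(UnexpectedEnd);
        }
        let (head, tail) = self.remaining.split_at(buf.len());
        buf.copy_from_slice(head);
        self.remaining = tail;
        Ok(())
    }
}
```

A caller with a known upper bound (for example 255 bytes for a u8-prefixed key) can keep a fixed array on the stack and pass `&mut storage[..len]`, so the runtime length never has to appear in a return type.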

Sammo98 commented 5 months ago

Ah okay, that's interesting; I hadn't considered it in a no_std context. Could having it return [u8; n] (where n is not const) work? Otherwise, is there another way to read n bytes, or is my implementation above how one would be expected to do that?

VictorKoenders commented 5 months ago

Returning [u8; N] would only work with a const N: usize parameter.

With the read_buf feature I linked above, there are some possibilities to read exactly N bytes, even in no_std contexts.

Rendered version: https://github.com/rust-lang/rfcs/blob/master/text/2930-read-buf.md#summary
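
As a small illustration of the const-generic case: a helper returning `[u8; N]` compiles only when `N` is fixed at the call site, so it suits fixed-width fields such as the 4-byte table length but not a key whose length is read at runtime. `read_array` is a hypothetical helper over a plain slice, not a bincode function.

```rust
/// Hypothetical helper: reads exactly `N` bytes, with `N` known at compile time.
fn read_array<const N: usize>(input: &mut &[u8]) -> Option<[u8; N]> {
    if input.len() < N {
        return None;
    }
    let (head, tail) = input.split_at(N);
    *input = tail; // advance past the bytes just read
    let mut out = [0u8; N];
    out.copy_from_slice(head);
    Some(out)
}
```

For example, `read_array::<4>(&mut input)` followed by `u32::from_be_bytes` would recover a big-endian length prefix, whereas `read_array::<n>` with a runtime `n` is rejected by the compiler.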

Sammo98 commented 5 months ago

Okay great, thanks for the info, will track this closely!