sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization
Apache License 2.0

Is there a faster way to read large Vec<u8>? #462

Open MeguminSama opened 1 month ago

MeguminSama commented 1 month ago

At the moment, say I have a struct like this:

#[derive(Debug, DekuRead, DekuWrite)]
#[deku(ctx = "size: usize", ctx_default = "0")]
pub struct Rfc {
    #[deku(bytes_read = "size")]
    pub data: Vec<u8>,
}

The problem is that deku seems to loop through the reader once for each u8 in bytes_read, which makes it very slow on large vectors. #[deku(read_all)] and #[deku(count = "size")] are also very slow.

At the moment, we're using our own read function to do something like this:

match reader.read_bytes(size, &mut buf) {
    Ok(ReaderRet::Bytes) => Ok(Rfc { data: buf }),
    _ => {...}
}

Which is significantly faster.
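For readers following along, here is a minimal sketch of the difference being described, using only the standard library rather than deku's own reader. The function names `read_per_byte` and `read_bulk` are hypothetical and exist only for illustration; this is not deku's implementation, just the general pattern of one read per element versus one read for the whole buffer.

```rust
use std::io::{self, Read};

/// Reads `size` bytes by issuing one `read_exact` call per byte,
/// mirroring the per-element pattern described above.
/// (Hypothetical helper, not part of deku.)
fn read_per_byte<R: Read>(reader: &mut R, size: usize) -> io::Result<Vec<u8>> {
    let mut data = Vec::with_capacity(size);
    let mut byte = [0u8; 1];
    for _ in 0..size {
        reader.read_exact(&mut byte)?;
        data.push(byte[0]);
    }
    Ok(data)
}

/// Reads `size` bytes with a single `read_exact` into a pre-sized buffer.
/// (Hypothetical helper, not part of deku.)
fn read_bulk<R: Read>(reader: &mut R, size: usize) -> io::Result<Vec<u8>> {
    // Allocate the full buffer up front so the reader can fill it in one go.
    let mut data = vec![0u8; size];
    reader.read_exact(&mut data)?;
    Ok(data)
}
```

Both helpers return the same bytes; the bulk version simply avoids the per-element loop. Note that it allocates `size` bytes before reading anything, so a length taken from untrusted input should probably be bounded first.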

But I was wondering if there was a built-in way to do this with deku, instead of deku looping over each u8?

If this isn't a feature currently, I might consider implementing it if it's something you'd want in deku.

Thanks!

wcampbell0x2a commented 1 month ago

For read_all performance, check out the following MR: https://github.com/sharksforarms/deku/pull/441

Since deku makes small repeated reads, using a BufReader (https://doc.rust-lang.org/std/io/struct.BufReader.html) should reduce the read overhead.
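A rough, std-only sketch of what wrapping a source in a BufReader changes when the consumer makes many one-byte reads. `CountingReader`, `read_one_by_one`, `unbuffered_calls`, and `buffered_calls` are hypothetical names made up for this illustration, and nothing here uses deku itself.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical wrapper that counts how often `read` reaches the inner source.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Simulates a parser that pulls `n` bytes from the reader one at a time.
fn read_one_by_one<R: Read>(reader: &mut R, n: usize) -> io::Result<Vec<u8>> {
    let mut out = Vec::with_capacity(n);
    let mut byte = [0u8; 1];
    for _ in 0..n {
        reader.read_exact(&mut byte)?;
        out.push(byte[0]);
    }
    Ok(out)
}

/// Number of reads that hit the source when no buffering is used.
fn unbuffered_calls(src: &[u8]) -> io::Result<usize> {
    let mut reader = CountingReader { inner: src, calls: 0 };
    read_one_by_one(&mut reader, src.len())?;
    Ok(reader.calls)
}

/// Number of reads that hit the source when it is wrapped in a `BufReader`.
fn buffered_calls(src: &[u8]) -> io::Result<usize> {
    let mut reader = BufReader::new(CountingReader { inner: src, calls: 0 });
    read_one_by_one(&mut reader, src.len())?;
    Ok(reader.get_ref().calls)
}
```

With buffering, the source is hit roughly once per buffer refill instead of once per byte. This only reduces the cost of reaching the underlying source; any per-element work done above the reader is still paid for every byte.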

MeguminSama commented 1 month ago

Thanks for getting back so quickly!

At the moment, our reader is already using a BufReader. We tried doing this in an attempt to speed it up, but unfortunately it's still much too slow when reading the vectors compared to our own read function.

Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

wcampbell0x2a commented 1 month ago

Definitely try out the merge request, it's really slow without that.

> Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

Sure, I don't have the code in front of me, but I think we only store leftover as a u8, so we would need to store the leftovers in a Vec<u8> if needed. I'd like to use that only if you use read_buffer, since for embedded platforms you don't want allocations all the time.

wcampbell0x2a commented 1 month ago

I'm also not sure, but there could be room for improvement in our impl of Vec. I think it currently reads and evaluates one element at a time.

MeguminSama commented 1 month ago

I will take a look, thanks :)