Is there a faster way to read large Vec<u8>?

sharksforarms / deku

Declarative binary reading and writing: bit-level, symmetric, serialization/deserialization

Apache License 2.0

1.14k stars 55 forks source link

Is there a faster way to read large Vec<u8>? #462

Closed MeguminSama closed 1 month ago

MeguminSama commented 3 months ago

At the moment, say I have a struct like this:

#[derive(Debug, DekuRead, DekuWrite)]
#[deku(ctx = "size: usize", ctx_default = "0")]
pub struct Rfc {
    #[deku(bytes_read = "size")]
    pub data: Vec<u8>,
}

The problem, is that deku seems to loop through the reader for each u8 in bytes_read. This causes it to be very slow on large vectors. #[deku(read_all)] and #[deku(count = "size")] are also very slow.

At the moment, we're using our own read function to do something like this:

match reader.read_bytes(size, &mut buf) {
    Ok(ReaderRet::Bytes) => Ok(Rfc { data: buf }),
    _ => {...}
}

Which is significantly faster.

But I was wondering if there was a built-in way to do this with deku, instead of deku looping over each u8?

If this isn't a feature currently, I might consider implementing it if it's something you'd want in deku.

Thanks!

wcampbell0x2a commented 3 months ago

For read_all performance, check out the following MR. https://github.com/sharksforarms/deku/pull/441

Since deku makes small repeated reads, using a https://doc.rust-lang.org/std/io/struct.BufReader.html should reduce the read overhead.

MeguminSama commented 3 months ago

Thanks for getting back so quickly!

At the moment, our reader is already using a BufReader. We tried doing this in an attempt to speed it up, but unfortunately it's still much too slow when reading the vectors compared to our own read function.

Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

wcampbell0x2a commented 3 months ago

Definitely try out the merge request, it's really slow without that.

Would some kind of #[deku(read_buffer = "size")] attribute be something you'd consider? Or is this out of scope for deku?

Sure, I don't have the code in front of me, but I think we only store leftover as a u8, so we would need to store the leftovers in a Vec<u8> if needed. I'd like to use that only if you use read_buffer, since for embedded platforms you don't want allocations all the time.

wcampbell0x2a commented 3 months ago

I also don't know, it could be an improvement in our impl of Vec, I think it reads and evaluates one at a time currently.

MeguminSama commented 3 months ago

I will take a look, thanks :)

wcampbell0x2a commented 1 month ago

@MeguminSama see the latest commits on master, they should solve most or all of your performance problems