tokio-rs / bytes

Utilities for working with bytes
MIT License
1.86k stars 278 forks source link

Reuse capacity when possible in `<BytesMut as Buf>::advance` impl #698

Closed paolobarbolini closed 4 months ago

paolobarbolini commented 4 months ago

While thinking about usage of BytesMut in my own code I've realized sometimes the following pattern (explained by the following pseudocode) is encountered:

let mut buf = BytesMut::with_capacity(8196);

loop {
    read_into_bufmut(&mut buf);

    if some_condition {
        // consume the data and then discard it
        let consumed = process_data(&buf);
        buf.advance(consumed);
        // in most cases the above `advance` ends-up consuming the entire length of `buf`
    } else {
        // consume the data as `Bytes`
        ship_data(buf.split().freeze());

        // (this is just to explain that `VecDeque` wouldn't be efficient here)
    }
}

In this particular example it would be more efficient to buf.clear() when consumed == buf.len(), instead of calling advance and eventually ending-up going through the allocation machinery.

This PR makes advance do it automatically for cases in which it's safe to do (which is why I didn't put it inside advance_unchecked). I'm not sure the Buf API allows it, the only thing I've found is #[must_use = "consider BytesMut::advance(len()) if you don't need the other half"] on BytesMut::split().

paolobarbolini commented 4 months ago

Would it make more sense to move this check to advance_unchecked?

I'm not quite sure how to structure this because the thing with advance_unchecked is that it gets called by split_to and split_off. These functions expect advance_unchecked to actually advance the position, otherwise they would end up with the two instances of BytesMut having overlapping ranges.

Darksonn commented 4 months ago

The conflict with split_to and split_off is a good reason to not put it in advance_unchecked.

paolobarbolini commented 4 months ago

Good idea. Fixed!

camshaft commented 1 month ago

This change broke our usage of BytesMut in s2n-quic, where we have a similar expectation:

These functions expect advance_unchecked to actually advance the position, otherwise they would end up with the two instances of BytesMut having overlapping ranges.

In our case, we expect that advance actually reduces the overall capacity of the chunk. There's also no efficient way to work around this, since the only option is to call split_to and discard the result.

Darksonn commented 1 month ago

That's unfortunate, we may have to revert this change. Can you open a bug report with details about this regression?

camshaft commented 1 month ago

Opened #725