tokio-rs / bytes

Utilities for working with bytes
MIT License
1.86k stars 278 forks source link

Allow reclaiming the current allocation #686

Closed shahn closed 2 months ago

shahn commented 5 months ago

This is based on #680, where it was noted that it is hard to use BytesMut without additional allocations in some circumstances.

shahn commented 5 months ago

This is a draft to get some comments on whether such an API addition would be acceptable

braddunbar commented 5 months ago

This looks a lot like BytesMut::reserve with the allocating parts cut out, which sounds a lot like a special case of BytesMut::try_reserve(ORIGINAL_CAPACITY). What would y'all think of a more general implementation in that vein?

shahn commented 5 months ago

I think it is hard to know what the original capacity is, generally, as it could've been influenced by lots of hidden operations before. In addition, try_reclaim also doesn't move any bytes internally, with the goal of being very cheap to call. To me, the allocation behaviour of the entire crate is sometimes a bit hard to predict, so a separate function for it seems sensible

braddunbar commented 5 months ago

I think it is hard to know what the original capacity is, generally, as it could've been influenced by lots of hidden operations before.

That's true, but presumably you know how much data you'll need to reserve regardless (otherwise dynamic allocation seems like the best strategy). So it doesn't particularly matter if you can reclaim the entire buffer, only if it can fit the data you're receiving. Or am I missing the point here?

In addition, try_reclaim also doesn't move any bytes internally, with the goal of being very cheap to call.

Very cheap relative to what though? Based on your description in https://github.com/tokio-rs/bytes/issues/680, your only other alternative would be to allocate anyway, right? And that's definitely expensive relative to moving bytes. If that's not the alternative, what is?

shahn commented 5 months ago

My usecase is that I have a couple of BytesMuts and I would only allocate more memory for the current BytesMut if I couldn't reclaim the other. Basically, switch between the buffers, filling them up, while the data is processed on another thread. Once all the handles have been dropped I could reclaim the allocation and start filling the other buffer, etc.

shahn commented 3 months ago

rebased this on top of current master, to make it mergable again. But I am also not sure if it is deemed merge-worthy or not :)

There is currently a test case failure because BytesMut::advance() changed from its documented behaviour, but if this is generally acceptable I would try to reconcile that.

braddunbar commented 3 months ago

In my mind, reclaiming memory is always in service of then writing a specific amount of data so I would prefer a try_reserve approach here that allows both at once.

shahn commented 3 months ago

Wouldn't that just be reserve? Or maybe more specifically, where would the try_ part come in?

braddunbar commented 3 months ago

reserve allocates if it can't reclaim enough space for the requested number of bytes. try_reserve would return false rather than allocate, giving you an opportunity to swap buffers instead.

shahn commented 3 months ago

Ah, I misunderstood. Yeah, that'd work as well for me! I will update the branch when I get a chance

Darksonn commented 3 months ago

Naming nit: The name try_reserve is currently used on Vec for an operation that reallocates but returns an error on allocation failure instead of calling abort. So we shouldn't add a method of that name that does something else.

shahn commented 3 months ago

Implemented the suggestion, I kept the name try_reclaim based on Darksonn's suggestion to avoid try_reserve. Thanks a lot for the review and comments

shahn commented 3 months ago

Thanks for the reviews :) Added a commit implementing copying bytes if that is cheap according to reserve. This also simplified the implementation because I could mostly re-use reserve_inner(). Also I added some more testing to test the behaviour when copying bytes.

conradludgate commented 2 months ago

We should fix this expect when allocate is false: https://github.com/shahn/bytes/blob/26c14f000d42f86b8642c75604a4a93dbe603e1b/src/bytes_mut.rs#L669 currently b.try_reclaim(usize::MAX) would panic. while it's maybe an extreme scenario, it's probably wrong to panic.

conradludgate commented 2 months ago

Under the current strategy of inlining, there's only one large instance of reclaim_inner in the resulting binary. Does it maybe make sense to add an intermediate inline target for try_reclaim_inner that can optimise to a much smaller function?

Eg:

#[inline]
pub fn try_reclaim(&mut self, additional: usize) -> bool {
    let len = self.len();
    let rem = self.capacity() - len;

    if additional <= rem {
        return true;
    }

    self.try_reclaim_inner(additional)
}

fn try_reclaim_inner(&mut self, additional: usize) -> bool {
    // inlines
    self.reserve_inner(additional, false)
}
shahn commented 2 months ago

Thank you! Implemented a fix to no longer panic when trying to reclaim large amounts of storage. I am not sure about the inlining, tbh - we could also inline the helper function at the expense of making try_reclaim a bit larger. Did you play with it already to see what would make most sense?

conradludgate commented 2 months ago

Did you play with it already to see what would make most sense?

I only briefly scanned some generated assembly. No benchmarks or profiling. I think keeping what is currently here is fine and won't make a considerable different to performance. If you only use try_reclaim, a smaller function might be better on the instruction cache, but if you use both try_reclaim and reserve, then that optimisation is eradicated. I don't see there being a considerable different in the runtime of either case.