Open Firstyear opened 1 year ago
A Bytes is reference counted, so an allocation is freed once all slices into the allocation are dropped. Even a small slice will keep the entire allocation alive.
When a BytesMut is resized to make more space for writing, it can reclaim the parts given out with split_off if the reference count is 1. Otherwise, if other handles are still alive, it makes a new allocation and releases its reference to the previous allocation.
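To make the two cases above concrete, here is a rough std-only model. It is not how the bytes crate is implemented: `View`, `make_room`, and `demo` are hypothetical names, with an `Arc<Vec<u8>>` standing in for the shared allocation, `View` for a `Bytes` slice, and `make_room` for a `BytesMut` that needs more space.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical stand-in for `Bytes`: a small range into a shared allocation.
struct View {
    buf: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl View {
    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

/// Hypothetical stand-in for a `BytesMut` that wants more room.
/// Returns true if the existing allocation was reused, false if a fresh
/// one had to be made because other handles still point at the old one.
fn make_room(buf: &mut Arc<Vec<u8>>, extra: usize) -> bool {
    match Arc::get_mut(buf) {
        // Reference count is 1: we are the only owner, so reuse the storage.
        Some(vec) => {
            vec.clear();
            vec.reserve(extra);
            true
        }
        // Other views are alive: leave the old allocation to them and
        // start a new one, releasing our reference to the old one.
        None => {
            *buf = Arc::new(Vec::with_capacity(extra));
            false
        }
    }
}

/// Walks through both cases and reports what happened.
fn demo() -> (usize, usize, bool, bool) {
    let mut buf = Arc::new(vec![0u8; 1024]);
    // A 4-byte view that shares the 1024-byte allocation.
    let view = View { buf: Arc::clone(&buf), range: 0..4 };

    // The view is alive, so the writer cannot reuse the allocation.
    let reused_while_shared = make_room(&mut buf, 1024);
    let visible_len = view.as_slice().len();
    // The view alone still pins all 1024 bytes of the old allocation.
    let pinned_len = view.buf.len();

    // Dropping the last view frees the old allocation.
    drop(view);
    // Now the writer is the sole owner of its buffer and can reuse it.
    let reused_when_unique = make_room(&mut buf, 1024);

    (visible_len, pinned_len, reused_while_shared, reused_when_unique)
}
```

In this model, a 4-byte view pins 1024 bytes until it is dropped, and the writer only reuses its storage once no view remains.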
So what is the correct way to handle this to prevent leaking or over-allocating in tokio with a framed codec? The codec docs and the BytesMut docs show the BytesMut being advanced (https://docs.rs/tokio-util/latest/tokio_util/codec/index.html), but it's not clear whether this is the correct way to ensure that memory is freed or, at the least, reused.
This is why I think it would be good for the docs to describe how this works since freeing/reusing the memory is just as important here as the allocations :)
I'm always happy to see more documentation added.
I can't add docs because I don't know how it works ... else I'd be happy to contribute docs. Can you see why I'm a bit stuck here and raised an issue?
That's fine. I agree that adding this documentation would be nice, and I can put it on my list, but it will take me a while to get around to actually doing it.
As for some tips for your own project, the most important guideline is this:
When you use `split_off` to get a `Bytes` view into your slice and return the `Bytes` slice, make sure that these `Bytes` pieces are not kept around for a long time.
Violating this guideline is the main way that you can end up using a lot more memory than you actually need, because even a single small `Bytes` will keep the full allocation it came from alive, even if that allocation is much larger than the `Bytes` itself.
As long as you follow the above rule, using `advance` and `reserve` in your codec will generally do the right thing and will not result in unbounded memory growth.
As for the `MyStringDecoder` example in the docs, that example has no risk of running into problems. That is because it uses `.to_vec()` to copy the data from the `BytesMut` into a separate `Vec<u8>` (which is then converted to a `String`). Because of this copy, it avoids giving out `Bytes` slices into the codec's `BytesMut`, so there is no risk of violating the rule no matter how the returned strings are used.
Hi there,
tokio_util uses BytesMut in codecs. After one request has been decoded, some excess bytes belonging to the next request may remain in the BytesMut, and these are preserved.
The codec needs to advance past and free the bytes that have now been consumed, but from the BytesMut docs it's hard to know when this might actually occur or what the right way to do it is.
For example, if we call `split_off` and we drop the handle, did the capacity associated with that BytesMut get freed? Or does it remain somehow accessible to the other part of the bytes?
I think it would be an improvement to these docs to document when bytes are freed by operations, and under what circumstances, so that authors can know the correct way to handle a long-lived BytesMut struct.
Thanks!