tokio-rs / bytes

Utilities for working with bytes
MIT License
1.91k stars 288 forks source link

Export `BytesMut::from_vec(vec)` #615

Open al8n opened 1 year ago

al8n commented 1 year ago

Hi, can we expose BytesMut::from_vec? Which can help avoid copying when building a BytesMut from Vec<u8>

Darksonn commented 7 months ago

That seems reasonable to me. We can add it as a From impl.

braddunbar commented 6 months ago

The comments on from_vec call out the fact that the allocation strategy may change at some point, making this a more expensive operation because it would then involve a copy. I'm not sure how likely that is and I don't have any opinions about exposing it, but I wanted to call it out here since this would be a departure from that stance.

https://github.com/tokio-rs/bytes/blob/327615e5d4ba27e9647734d83ef9ad88d7dd8a38/src/bytes_mut.rs#L824-L829

Darksonn commented 6 months ago

Yes, good point, thanks.

hackerbirds commented 4 months ago

While it's true that the allocation behind Bytes could change in the future and suddenly make the conversion slower, as a regular user who just needs to work with libraries using Bytes, I don't think it's a good decision to not have a From<Vec<u8>> implementation.

Some libraries use Bytes, others use Vec<u8>, and right now the easiest way I found to convert from a Vec<u8> into Bytes is to deref it into &[u8], and then use From<&[u8]>, which calls to_vec() internally and makes a clone/copy anyway. Given that there is no real other choice (at least to my knowledge, feel free to tell me if there's a better way), I suspect this is what most users also end up doing.

I don't see how that's any better than letting us have a From<Vec<u8>> which would be faster at least today, even in the future where this operation ends up slow. Sometimes we just have no choice to convert back and forth between the types to use foreign libraries. If the sudden performance hit really a concern, then perhaps let from_vec public instead of making a From, and document its behavior publicly (and not privately).

EDIT: Meant BytesMut, sorry.

demosdemon commented 6 days ago

+1 for From<Vec<u8>> and maybe From<String>.

But I'm curious why From<Vec<u8>> is explicitly called out but the documentation for From<Bytes> says it avoids a copy if the Bytes is unique.

https://github.com/tokio-rs/bytes/blob/2d996a2b419ad1a29d65ba6c7e82cfe9c733595c/src/bytes.rs#L1030-L1050

And, there is a From<Vec<u8>> (and From<String>) on the Bytes. The From<Vec<u8>> for Bytes impl makes no such promise, but the API exists and performs a zero-copy instantiation of Bytes thus the contract has been set.

https://github.com/tokio-rs/bytes/blob/2d996a2b419ad1a29d65ba6c7e82cfe9c733595c/src/bytes.rs#L964-L997

So, transitively, From<Vec<u8>> for BytesMut should be simply BytesMut::from(Bytes::from(vec)) and ultimately zero-copy.