BurntSushi / bstr

A string type for Rust that is not required to be valid UTF-8.

Support for the Bytes crate #184

Open Sytten opened 1 week ago

Sytten commented 1 week ago

Hi!

Just an idea: I am starting to move code from Arc<Vec<u8>> to the Bytes crate (https://github.com/tokio-rs/bytes). That means losing a lot of what bstr provides, since traits like ByteSlice always return &[u8], but they could just as easily return Bytes objects. Let me know what you think; I'm happy to pitch in on the implementation. It should be fairly similar to ByteSlice, just without the references.

Thanks

BurntSushi commented 1 week ago

Can you say more about what kinds of changes you are proposing? API signatures would be helpful.

BurntSushi commented 1 week ago

It doesn't have to be complete.

BurntSushi commented 1 week ago

And please also show the alternative. What does code look like if support isn't added to this crate? And why was existing support okay for Arc<Vec<u8>> but not for Bytes? After all, the existing APIs don't return Arc<Vec<u8>> either.

In general, you are proposing a major feature addition to this library. I need to see it spelled out more.

Sytten commented 1 week ago

I probably misspoke. A common way to deal with large buffers is to use something like Arc<[u8]> or Arc<Vec<u8>> to minimize cloning. The Bytes crate is interesting because it lets you hold multiple references to an underlying buffer without a lifetime. It uses custom reference counting, similar to Arc, to achieve that. This is super useful in async code, where in many cases data needs to be 'static.

An example of what it would look like:

use bytes::Bytes;

pub trait ByteBytes {
    /// Splits at the first occurrence of `splitter`, returning owned halves.
    fn split_once_str<B: ?Sized + AsRef<[u8]>>(
        &self,
        splitter: &B,
    ) -> Option<(Bytes, Bytes)>;
}

The idea is that you can return new Bytes objects whose lifetime isn't tied to the parent.
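To make the shape of that proposal concrete, here is a minimal std-only sketch. `SharedBytes` is a hypothetical stand-in for `Bytes` (an `Arc<[u8]>` plus a range), not the bytes crate's actual implementation, and the naive `windows` search is only a placeholder for a real substring searcher. It shows the two halves of a split outliving the parent and moving into a `'static` context.

```rust
use std::ops::Range;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `Bytes`: a shared buffer plus the window this handle covers.
#[derive(Clone, Debug)]
struct SharedBytes {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedBytes {
    fn new(data: &[u8]) -> Self {
        SharedBytes { buf: Arc::from(data), range: 0..data.len() }
    }

    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// New handle over `sub` (relative to this handle); bumps the refcount, copies no bytes.
    fn slice(&self, sub: Range<usize>) -> SharedBytes {
        assert!(sub.start <= sub.end && sub.end <= self.range.len());
        SharedBytes {
            buf: Arc::clone(&self.buf),
            range: self.range.start + sub.start..self.range.start + sub.end,
        }
    }

    /// Owned analogue of the proposed `split_once_str`: splits at the first `splitter`.
    fn split_once_str<B: ?Sized + AsRef<[u8]>>(
        &self,
        splitter: &B,
    ) -> Option<(SharedBytes, SharedBytes)> {
        let hay = self.as_slice();
        let needle = splitter.as_ref();
        // Naive search (assumption); `windows(0)` would panic, so handle the empty needle.
        let at = if needle.is_empty() {
            0
        } else {
            hay.windows(needle.len()).position(|w| w == needle)?
        };
        Some((self.slice(0..at), self.slice(at + needle.len()..hay.len())))
    }
}

fn main() {
    let request = SharedBytes::new(b"Host: example.com\r\n\r\nhello");
    let (head, body) = request.split_once_str("\r\n\r\n").expect("separator present");
    drop(request); // the halves keep the buffer alive on their own

    // `thread::spawn` requires `'static` data, which the owned handle satisfies.
    let len = thread::spawn(move || body.as_slice().len()).join().unwrap();
    assert_eq!(len, 5);
    assert_eq!(head.as_slice(), b"Host: example.com");
}
```

The trade-off in this sketch is one reference-count increment per returned half, in exchange for results that carry no borrow of the parent.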

BurntSushi commented 1 week ago

I'm familiar with the bytes crate itself.

It seems like you would run into the same problem with Arc<Vec<u8>>.

Sytten commented 1 week ago

Not really. Let's take a common example: I need to split the body of an HTTP request. With Arc<Vec<u8>> I simply can't right now, since a split returns a slice. With Bytes, I get a new Bytes object that is not tied to the lifetime of the original, so I can use it independently of the original Bytes object.
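For comparison, one possible workaround on the Arc<Vec<u8>> side is sketched below using only std. `split_once_ranges` and `body_len_on_worker` are hypothetical helpers, not bstr APIs. Because a borrowed half cannot be moved into a spawned task, the caller has to carry an `Arc` clone and an index range separately, which is the bookkeeping a single owned handle would fold into one value.

```rust
use std::ops::Range;
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: like a slice-returning split, but reports index ranges instead.
fn split_once_ranges(hay: &[u8], needle: &[u8]) -> Option<(Range<usize>, Range<usize>)> {
    if needle.is_empty() {
        return Some((0..0, 0..hay.len()));
    }
    // Naive search; stands in for a real substring searcher.
    let at = hay.windows(needle.len()).position(|w| w == needle)?;
    Some((0..at, at + needle.len()..hay.len()))
}

/// Hypothetical helper: measures the request body on another thread.
fn body_len_on_worker(request: Arc<Vec<u8>>) -> Option<usize> {
    // A borrowed `&request[..]` half cannot be moved into `thread::spawn`,
    // so ship an `Arc` clone together with the range instead.
    let (_head, body) = split_once_ranges(&request, b"\r\n\r\n")?;
    let shared = Arc::clone(&request);
    thread::spawn(move || shared[body].len()).join().ok()
}

fn main() {
    let request = Arc::new(b"Host: example.com\r\n\r\nhello".to_vec());
    assert_eq!(body_len_on_worker(request), Some(5));
}
```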