tokio-rs / bytes

Utilities for working with bytes
MIT License

Consider providing Rewind-style traits #330

Open seanmonstar opened 5 years ago

seanmonstar commented 5 years ago

Sometimes when encoding or decoding, it can be desirable to "rewind" a read or write operation. Some examples:

See also https://github.com/tokio-rs/bytes/issues/299

cc @sfackler

seanmonstar commented 5 years ago

An idea I've thought about for a couple of hours is to have a separate trait, in case there are buffer types that couldn't support this (do they exist?).

trait Rewind {
    fn position(&self) -> usize;
    fn rewind(&mut self, pos: usize);
}

Ralith commented 5 years ago

A convenient way to implement this would be a transaction-style adapter that stores the position at which it was constructed and can be either aborted or committed.
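A minimal sketch of such an adapter, built on the Rewind trait proposed above. The names SliceReader and Transaction, and the choice to abort on drop unless commit is called, are assumptions made for illustration; none of them exist in bytes.

```rust
/// The trait proposed earlier in this thread.
trait Rewind {
    fn position(&self) -> usize;
    fn rewind(&mut self, pos: usize);
}

/// Hypothetical slice-backed reader, used only to exercise the adapter.
struct SliceReader<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> SliceReader<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        SliceReader { bytes, pos: 0 }
    }

    /// Read one byte and advance, or return None at the end of the slice.
    fn get_u8(&mut self) -> Option<u8> {
        let b = *self.bytes.get(self.pos)?;
        self.pos += 1;
        Some(b)
    }
}

impl Rewind for SliceReader<'_> {
    fn position(&self) -> usize {
        self.pos
    }

    fn rewind(&mut self, pos: usize) {
        assert!(pos <= self.bytes.len(), "rewind past end of buffer");
        self.pos = pos;
    }
}

/// Hypothetical adapter: records the position at construction and rewinds
/// to it on drop unless `commit` was called first.
struct Transaction<'a, B: Rewind> {
    buf: &'a mut B,
    start: usize,
    committed: bool,
}

impl<'a, B: Rewind> Transaction<'a, B> {
    fn new(buf: &'a mut B) -> Self {
        let start = buf.position();
        Transaction { buf, start, committed: false }
    }

    /// Access the wrapped buffer for reads or writes.
    fn buf(&mut self) -> &mut B {
        &mut *self.buf
    }

    /// Keep everything consumed so far.
    fn commit(mut self) {
        self.committed = true;
    }
}

impl<B: Rewind> Drop for Transaction<'_, B> {
    fn drop(&mut self) {
        if !self.committed {
            self.buf.rewind(self.start);
        }
    }
}
```

With this shape, a decoder that runs out of input can return early and the buffer goes back to where the transaction started, while the success path calls commit() explicitly.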

carllerche commented 4 years ago

As an added note: peeking util fns would be useful to add here.

carllerche commented 4 years ago

Also, instead of "rewinding" it probably should be generalized as scannable or indexable? I imagine buffers backed by linked lists would not implement this trait, but a rope could (despite O(log(n)) complexity).

That said, a linked list backed buffer could implement a strategy where the current position is saved by cloning the "iterator".

carllerche commented 4 years ago

Now I wonder if the trait should be specifically for the case of a buf backed by a single slice. Then there could be helpers like “get_line”

carllerche commented 4 years ago

@seanmonstar points out that a Buf backed by a single slice is "just" a Cursor<&[u8]>. However, we cannot add util fns to this type directly.

An alternate strategy would be for bytes to define its own Cursor type backed explicitly by a &[u8]:

struct SliceBuf<'a> {
    bytes: &'a [u8],
    pos: usize,
}

This type could implement miscellaneous helpers such as "get_line", "peek", ... Other types could then provide:

fn as_slice_buf(&self) -> SliceBuf<'_> {
    // ...
}
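To make the idea concrete, here is a rough sketch of what "peek" and "get_line" might look like on SliceBuf. The method names new and remaining, and the exact semantics (a line ends at b'\n', and nothing is consumed when no newline is buffered), are assumptions rather than an agreed design.

```rust
/// Cursor over a single contiguous slice, as proposed above.
struct SliceBuf<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> SliceBuf<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        SliceBuf { bytes, pos: 0 }
    }

    /// The bytes that have not been consumed yet.
    fn remaining(&self) -> &'a [u8] {
        &self.bytes[self.pos..]
    }

    /// Look at the next byte without consuming it.
    fn peek(&self) -> Option<u8> {
        self.bytes.get(self.pos).copied()
    }

    /// Return the next line without its trailing b'\n' and advance past it.
    /// Returns None and consumes nothing if no newline is buffered yet.
    fn get_line(&mut self) -> Option<&'a [u8]> {
        let rest = self.remaining();
        let end = rest.iter().position(|&b| b == b'\n')?;
        self.pos += end + 1;
        Some(&rest[..end])
    }
}
```

Because the type is backed by one slice, both helpers are plain index arithmetic, which is the main argument for keeping them off the general Buf trait.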

thoughts? @sfackler @seanmonstar

vincentdephily commented 4 years ago

I've been struggling with this while trying to convert an MQTT decoder to take a dyn Buf instead of a Bytes. An MQTT packet is 1 "header" byte, then 1-4 "length" bytes, then length "payload" bytes (if length > 0). The codec API requires that I consume the bytes for exactly one such packet.

I could use buf.bytes() to peek at the first 1-5 bytes without advancing, but this only works if the buf uses contiguous memory. I could use buf.bytes_vectored() instead, but it is depressingly complicated for the task at hand, and requires std.

I'm guessing that peek_*() functions can be implemented by all backends (?), which seems preferable to adding a RewindableBuf trait (although the latter seems to cover more use cases).
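As a rough illustration of what peeking over non-contiguous memory involves, the sketch below reads the MQTT fixed header from a list of chunks without consuming anything. It uses only std; the function name peek_mqtt_header, the MalformedLength error, and the representation of the buffer as a slice of slices are assumptions for the example, not part of Buf.

```rust
/// Returned when the fourth length byte still has its continuation bit set.
#[derive(Debug, PartialEq)]
struct MalformedLength;

/// Peek at an MQTT fixed header that may be split across chunks.
/// Returns Ok(Some((header_len, payload_len))) when the header is complete,
/// Ok(None) when more bytes are needed, and Err on a malformed length.
fn peek_mqtt_header(chunks: &[&[u8]]) -> Result<Option<(usize, usize)>, MalformedLength> {
    // Walk the bytes of every chunk in order; nothing is consumed.
    let mut bytes = chunks.iter().flat_map(|c| c.iter().copied());

    // First byte: packet type and flags. Only its presence matters here.
    if bytes.next().is_none() {
        return Ok(None);
    }

    // Next 1-4 bytes: variable-length integer, 7 bits per byte,
    // least significant group first, high bit means "more bytes follow".
    let mut len = 0usize;
    for i in 0..4 {
        let b = match bytes.next() {
            Some(b) => b,
            None => return Ok(None),
        };
        len |= ((b & 0x7f) as usize) << (7 * i);
        if b & 0x80 == 0 {
            return Ok(Some((2 + i, len)));
        }
    }
    Err(MalformedLength)
}
```

Once the header is known, the decoder can check that header_len + payload_len bytes are buffered and only then advance, which avoids any need to rewind.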

The transaction is tempting too (it'd be great if slice() was a Buf method, it'd also neatly solve my "don't read past the first packet" issue), but preventatively opening a transaction to cover a rare worst-case scenario seems a bit wasteful.

little-dude commented 1 week ago

The use case for me would be to rewind to populate a checksum field.

+---+----------+---+---+---+------------+
| A | checksum | B | C | D | payload ...
+---+----------+---+---+---+------------+
^                          ^

The data header contains a bunch of fields. The checksum is computed over the payload and part of the header, so it's the last field I'll set.
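For comparison, this is roughly how the pattern looks today when the output is a plain Vec<u8>, where "rewinding" is just indexing back into the vector. The field widths (one byte each for A, B, C, D and a two-byte big-endian checksum) and the toy wrapping-sum checksum are assumptions made up for the example.

```rust
/// Hypothetical layout: A (1 byte), checksum (2 bytes, big endian),
/// B, C, D (1 byte each), then the payload.
fn encode(out: &mut Vec<u8>, a: u8, bcd: [u8; 3], payload: &[u8]) {
    out.push(a);

    // Remember where the checksum goes and write a placeholder for now.
    let checksum_pos = out.len();
    out.extend_from_slice(&[0, 0]);

    // Everything from here on is covered by the checksum.
    let covered_from = out.len();
    out.extend_from_slice(&bcd);
    out.extend_from_slice(payload);

    // Toy checksum: wrapping 16-bit sum of the covered bytes.
    let sum = out[covered_from..]
        .iter()
        .fold(0u16, |acc, &b| acc.wrapping_add(b as u16));

    // "Rewind" to the placeholder and fill it in.
    out[checksum_pos..checksum_pos + 2].copy_from_slice(&sum.to_be_bytes());
}
```

A Rewind-style trait on the write side would let the same code work against a generic BufMut instead of requiring a concrete Vec<u8>.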