rust-lang / libs-team


Enable accessing written data in a `BorrowedCursor` #367

Closed: a1phyr closed this 2 months ago

a1phyr commented 3 months ago

Proposal

Problem statement

Quoting the documentation of BorrowedCursor:

Once data is written to the cursor, it becomes part of the filled portion of the underlying BorrowedBuf and can no longer be accessed or re-written by the cursor.

However, doing so can be very useful, for example in Read wrappers that need to read back the data just read by the inner reader. With the current API, read_buf can only be implemented either by initializing the whole buffer and forwarding to read, or by using unsafe code to craft a new BorrowedCursor.

Motivating examples or use cases

A CRC32 checker example, simplified from the zip crate (original source):

pub struct Crc32Reader<R> {
    inner: R,
    hasher: Hasher,
    check: u32,
}

impl<R> Crc32Reader<R> {
    fn check_matches(&self) -> bool {
        self.check == self.hasher.clone().finalize()
    }
}

impl<R: Read> Read for Crc32Reader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let count = self.inner.read(buf)?;
        if count == 0 && !buf.is_empty() && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        self.hasher.update(&buf[..count]);
        Ok(count)
    }

    fn read_buf(&mut self, mut cursor: BorrowedCursor<'_>) -> io::Result<()> {
        let written = cursor.written();
        self.inner.read_buf(cursor.reborrow())?;
        if cursor.written() == written && cursor.capacity() != 0 && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        // We can't write this line
        // self.hasher.update(cursor.written_data());
        Ok(())
    }
}

In this code, a specialized read_buf implementation that forwards to self.inner.read_buf() is desirable, but not really possible without unsafe code.
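For reference, the read path of the wrapper above can be exercised on stable Rust. The following is only a sketch: it swaps the Hasher used in the example for a small hand-rolled bitwise CRC-32 (the `Crc32` type is an assumption made to keep the block self-contained), and `read_checked` is a hypothetical helper that is not part of the zip crate.

```rust
use std::io::{self, Read};

/// Minimal bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320).
/// Stand-in for the `Hasher` used in the issue; an assumption of this sketch.
#[derive(Clone)]
struct Crc32 {
    state: u32,
}

impl Crc32 {
    fn new() -> Self {
        Crc32 { state: 0xFFFF_FFFF }
    }

    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.state ^= u32::from(byte);
            for _ in 0..8 {
                // All ones when the low bit is set, zero otherwise.
                let mask = (self.state & 1).wrapping_neg();
                self.state = (self.state >> 1) ^ (0xEDB8_8320 & mask);
            }
        }
    }

    fn finalize(self) -> u32 {
        !self.state
    }
}

struct Crc32Reader<R> {
    inner: R,
    hasher: Crc32,
    check: u32,
}

impl<R: Read> Read for Crc32Reader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let count = self.inner.read(buf)?;
        // A zero-length read into a non-empty buffer means EOF: verify the checksum.
        if count == 0 && !buf.is_empty() && self.check != self.hasher.clone().finalize() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"));
        }
        // Here the data can be read back because `buf` is a plain slice.
        self.hasher.update(&buf[..count]);
        Ok(count)
    }
}

/// Hypothetical helper: read everything from `data`, failing on a checksum mismatch.
fn read_checked(data: &[u8], check: u32) -> io::Result<Vec<u8>> {
    let mut reader = Crc32Reader { inner: data, hasher: Crc32::new(), check };
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}
```

With a plain `&mut [u8]` the wrapper can hash `&buf[..count]` after the inner read, which is exactly the step that has no safe equivalent for a BorrowedCursor today.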

Solution sketch

Add a new method to BorrowedCursor that creates a BorrowedBuf from it, which would allow reading back the written data (not tested):

impl BorrowedCursor<'_> {
    fn unfilled_buf(&mut self) -> BorrowedBuf<'_> {
        // Note: this function can already be written using only public (unsafe) APIs.
        let init = self.buf.init - self.buf.filled;

        BorrowedBuf {
            buf: unsafe { self.as_mut() },
            filled: 0,
            init,
        }
    }
}

With this, the read_buf function from the previous example could be written as:

impl<R: Read> Read for Crc32Reader<R> {
    fn read_buf(&mut self, mut cursor: BorrowedCursor<'_>) -> io::Result<()> {
        let mut buf = cursor.unfilled_buf();
        self.inner.read_buf(buf.unfilled())?;

        if buf.len() == 0 && buf.capacity() != 0 && !self.check_matches() {
            return Err(io::Error::new(io::ErrorKind::Other, "Invalid checksum"))
        }
        self.hasher.update(buf.filled());
        let init = buf.len();
        cursor.advance(init);
        Ok(())
    }
}
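The borrow flow above (derive a fresh buffer from the cursor, let the inner reader fill it, read the new bytes back, then advance the outer cursor) can be modeled on stable Rust. This is a simplified sketch, not the real API: `ModelBuf` and `ModelCursor` are hypothetical stand-ins that assume the whole buffer is already initialized, so they track only the filled length and need no unsafe code.

```rust
/// Hypothetical stand-in for `BorrowedBuf`; tracks only the filled length.
struct ModelBuf<'data> {
    buf: &'data mut [u8],
    filled: usize,
}

impl<'data> ModelBuf<'data> {
    fn new(buf: &'data mut [u8]) -> Self {
        ModelBuf { buf, filled: 0 }
    }

    fn filled(&self) -> &[u8] {
        &self.buf[..self.filled]
    }

    fn len(&self) -> usize {
        self.filled
    }

    fn unfilled<'this>(&'this mut self) -> ModelCursor<'this, 'data> {
        ModelCursor { buf: self }
    }
}

/// Hypothetical stand-in for `BorrowedCursor`: a write-only view of the unfilled part.
struct ModelCursor<'this, 'data> {
    buf: &'this mut ModelBuf<'data>,
}

impl<'this, 'data> ModelCursor<'this, 'data> {
    fn capacity(&self) -> usize {
        self.buf.buf.len() - self.buf.filled
    }

    fn append(&mut self, data: &[u8]) {
        assert!(data.len() <= self.capacity());
        let start = self.buf.filled;
        self.buf.buf[start..start + data.len()].copy_from_slice(data);
        self.buf.filled += data.len();
    }

    fn advance(&mut self, n: usize) {
        assert!(n <= self.capacity());
        self.buf.filled += n;
    }

    /// Model of the proposed method: a fresh buffer covering only the unfilled region.
    fn unfilled_buf(&mut self) -> ModelBuf<'_> {
        let start = self.buf.filled;
        ModelBuf { buf: &mut self.buf.buf[start..], filled: 0 }
    }
}

/// Returns (bytes visible through the derived buffer, bytes in the outer buffer).
fn demo() -> (Vec<u8>, Vec<u8>) {
    let mut storage = [0u8; 8];
    let mut outer = ModelBuf::new(&mut storage);
    let mut cursor = outer.unfilled();
    cursor.append(b"ab"); // earlier data, not visible through the derived buffer

    let mut inner = cursor.unfilled_buf();
    inner.unfilled().append(b"cde"); // what an inner reader would do
    let seen = inner.filled().to_vec(); // read back only the newly written bytes
    let n = inner.len();
    cursor.advance(n); // commit the new bytes to the outer buffer
    (seen, outer.filled().to_vec())
}
```

In this model, `demo()` sees only the bytes written through the derived buffer, while the outer buffer ends up holding both the earlier and the new bytes once the cursor is advanced. The real types additionally track which bytes are initialized, which this sketch deliberately leaves out.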

Alternatives

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

joshtriplett commented 2 months ago

The approach of turning a BorrowedCursor into a new BorrowedBuf seems like a good one. We discussed this in this week's libs-api meeting and agreed that we want to approve this.

I also think with_unfilled_buf is a good helper method that will be less error-prone. The documentation for unfilled_buf should point to that as the preferred alternative.