rust-lang / libs-team


Enable specialisation of `std::io::copy` for non stdlib types #419

NobodyXu closed this 1 month ago

NobodyXu commented 2 months ago

Proposal

Problem statement

std::io::copy contains a specialisation for File: it looks up the file type and then infers whether an optimization (splice/sendfile/copy_file_range) can be applied.

Currently this optimization cannot be applied to types outside of stdlib, because std::io::copy does not know whether their Read/Write implementation uses the fd or not.

This is a missed opportunity.

Motivating examples or use cases

For example, crates might wrap an stdlib I/O type to provide their own abstraction. These abstractions cannot use the specialisation in std::io::copy, and so they perform surprisingly differently from using the underlying stdlib I/O type directly.

This would confuse users: Rust promises zero-cost abstractions, so one would assume that the performance loss is caused by something wrong in one's own implementation.
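To make the use case concrete, here is a minimal sketch of such a wrapper. `LoggedFile` and `copy_through_wrapper` are hypothetical names invented for illustration, not from any real crate. The wrapper forwards every read to its inner File, but because std only recognises its own types, `io::copy` should take the generic userspace-buffer path here instead of the fd-based one.

```rust
use std::fs::File;
use std::io::{self, Read};

/// Hypothetical third-party wrapper: forwards reads to the inner `File`
/// and only counts the bytes that pass through.
struct LoggedFile {
    inner: File,
    bytes_read: u64,
}

impl Read for LoggedFile {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes_read += n as u64;
        Ok(n)
    }
}

/// Writes `payload` to a temporary file, copies it to a second temporary
/// file through the wrapper, and returns what ended up in the destination.
fn copy_through_wrapper(payload: &[u8]) -> io::Result<Vec<u8>> {
    let dir = std::env::temp_dir();
    let src_path = dir.join(format!("io_copy_wrapper_src_{}", std::process::id()));
    let dst_path = dir.join(format!("io_copy_wrapper_dst_{}", std::process::id()));
    std::fs::write(&src_path, payload)?;

    let mut reader = LoggedFile { inner: File::open(&src_path)?, bytes_read: 0 };
    let mut writer = File::create(&dst_path)?;

    // `reader` is not a std type, so `io::copy` has no way to know that it
    // is a thin layer over an fd; the data goes through `LoggedFile::read`.
    let copied = io::copy(&mut reader, &mut writer)?;
    assert_eq!(copied, reader.bytes_read);
    drop(writer);

    let out = std::fs::read(&dst_path)?;
    std::fs::remove_file(&src_path)?;
    std::fs::remove_file(&dst_path)?;
    Ok(out)
}
```

The result is still correct; the only difference from copying File to File directly is which code path does the work.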

Solution sketch

The reason these specialisations cannot be applied to non-stdlib types is that we cannot guarantee that their Read/Write implementation uses the fd returned by AsRawFd.

So we could add new unsafe traits that assert that the Read/Write implementation operates directly on that fd:

/// Asserts that the Read implementation reads from the fd returned by [`AsRawFd::as_raw_fd`] directly,
/// without any processing or buffering
unsafe trait ReadFromFd: Read + AsRawFd {}

/// Asserts that the Read and BufRead implementations read from the fd returned by [`AsRawFd::as_raw_fd`] directly,
/// without any processing, but possibly with buffering.
///
/// Only for types that do not implement ReadFromFd.
unsafe trait BufReadFromFd: BufRead + AsRawFd {
    /// Returns the data currently held in the internal read buffer.
    fn get_buffer(&mut self) -> &[u8];
}

/// Asserts that the Write implementation writes to the fd returned by [`AsRawFd::as_raw_fd`] directly,
/// without any processing. Buffering is allowed as long as [`Write::flush`]
/// flushes all buffered data.
unsafe trait WriteFromFd: Write + AsRawFd {}

We could go one step further, and add methods for zero-copy:

fn zero_copy<R, W>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: ReadFromFd,
    W: WriteFromFd;

fn zero_copy_buf<R, W>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: BufReadFromFd,
    W: WriteFromFd;

which will help fix rust-lang/rust#128300
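As a rough illustration of how the proposed traits could be used, here is a sketch that can be tried outside of std. None of this exists in std: the traits are declared locally, `Wrapper` is a hypothetical third-party type, and this `zero_copy` is only one possible stand-in. It assumes that temporarily viewing each fd as a File and calling `io::copy` is an acceptable way to reach std's existing specialisation, which is exactly the kind of trick the unsafe trait contract is meant to make sound.

```rust
use std::fs::File;
use std::io::{self, Read, Write};
use std::mem::ManuallyDrop;
use std::os::fd::{AsRawFd, FromRawFd, RawFd};

// The marker traits proposed above, declared locally (NOT part of std).
unsafe trait ReadFromFd: Read + AsRawFd {}
unsafe trait WriteFromFd: Write + AsRawFd {}

/// Hypothetical third-party wrapper that forwards straight to its `File`.
struct Wrapper(File);

impl Read for Wrapper {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}

impl Write for Wrapper {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.0.flush()
    }
}

impl AsRawFd for Wrapper {
    fn as_raw_fd(&self) -> RawFd {
        self.0.as_raw_fd()
    }
}

// SAFETY: `Wrapper` reads from and writes to the fd it reports, with no
// processing and no buffering of its own.
unsafe impl ReadFromFd for Wrapper {}
unsafe impl WriteFromFd for Wrapper {}

/// Sketch of the proposed `zero_copy`, built on top of `io::copy`.
fn zero_copy<R, W>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where
    R: ReadFromFd,
    W: WriteFromFd,
{
    // The writer may buffer, so push pending data to the fd first.
    writer.flush()?;
    // SAFETY: both fds stay open while `reader` and `writer` are borrowed,
    // and `ManuallyDrop` ensures the temporary `File`s never close them.
    let mut src = ManuallyDrop::new(unsafe { File::from_raw_fd(reader.as_raw_fd()) });
    let mut dst = ManuallyDrop::new(unsafe { File::from_raw_fd(writer.as_raw_fd()) });
    // File-to-File copies are what `io::copy` already specialises on.
    io::copy(&mut *src, &mut *dst)
}
```

A `zero_copy_buf` variant would additionally need to write out the contents of `get_buffer()` before handing the fds over, which is omitted here.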

Alternatives

Maybe stabilise min_specialization and expose more of the internals of std::io::copy?

Related

#202

NobodyXu commented 2 months ago

cc @the8472 I think we could fix rust-lang/rust#128300 while also enabling zero-copy on non stdlib types?

Amanieu commented 1 month ago

We discussed this in the libs-api meeting today. We believe that the ability to efficiently copy data from one FD to another is useful, but don't think that this interface is the right way to expose this. Instead we would like to see a unix-specific function that takes 2 BorrowedFds and automatically finds the most efficient way to transfer the contents of one into the other.
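For readers wondering what such a function might look like from the caller's side, here is a minimal sketch using only safe, stable APIs. The names `copy_fd` and `copy_between` are hypothetical, and this is not the function the team has in mind: it simply duplicates both fds into File handles and delegates to `io::copy`, whereas a real implementation would presumably choose between the syscalls directly and avoid the extra dup calls.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsFd, BorrowedFd};

/// Hypothetical unix-specific helper: copies everything readable from
/// `src` into `dst` and returns the number of bytes transferred.
fn copy_fd(src: BorrowedFd<'_>, dst: BorrowedFd<'_>) -> io::Result<u64> {
    // Duplicating keeps the caller's fds open after the copy. The
    // duplicates share the file offset with the originals, so the
    // caller observes the advanced position afterwards.
    let mut src = File::from(src.try_clone_to_owned()?);
    let mut dst = File::from(dst.try_clone_to_owned()?);
    // `io::copy` on two `File`s may use copy_file_range, sendfile or
    // splice where the platform and file types allow it.
    io::copy(&mut src, &mut dst)
}

/// Convenience wrapper so that any fd-backed type can be passed directly.
fn copy_between<A: AsFd, B: AsFd>(src: &A, dst: &B) -> io::Result<u64> {
    copy_fd(src.as_fd(), dst.as_fd())
}
```

Because the interface is expressed in terms of fds rather than Read/Write, a third-party wrapper only needs to implement AsFd to take part, and no unsafe marker trait is required.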