rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.93k stars 12.53k forks source link

Tracking issue for vectored IO support #58452

Closed sfackler closed 5 years ago

sfackler commented 5 years ago

This covers Read::read_vectored, Write::write_vectored, IoVec, and IoVecMut.

abonander commented 5 years ago

@sfackler is this for an open PR or pending RFC? I can't find anything relevant.

jonas-schievink commented 5 years ago

Introduced in https://github.com/rust-lang/rust/pull/58357

withoutboats commented 5 years ago

I'd like to at least investigate the DST solution before we stabilize to see what would be the blockers. It would be a pleasant outcome to have these types written &'a IoVec and &'a mut IoVec instead of having the two lifetime-parameterized types.

withoutboats commented 5 years ago

Looked into what making this into a DST would entail; it looks like what these need is some way to specify the metadata type, because on some platforms the metadata is not the same ABI as a usize (specifically on 64-bit windows it seems to be a u32).

sfackler commented 5 years ago

You also need control over the ordering - the length field comes before the pointer in a WSABUF.

withoutboats commented 5 years ago

Oh jeez, I'm not sure that we can accomplish having a T in which the first word of &T is not the address.

sfackler commented 5 years ago

Yeah. The iovec crate hacked around this by having the DST be a slice where the length was secretly the pointer and the pointer was secretly the length, but it's then unsound to have an empty iovec. It just barely happens to work. If we targeted a slightly weirder platform with e.g. a 32 byte length and 4-byte aligned pointers we'd be SOL.

withoutboats commented 5 years ago

I mean isn't that hack broken as soon as someone writes as *const _ and they get the first word? If we had language support we'd fix that, but I suspect we will discover that people are relying on the first word being the address in ways we can't automatically patch up.

jethrogb commented 5 years ago

There is a test in tcp:

let a = [];
let b = [10];
let c = [11, 12];
t!(s1.write_vectored(&[IoVec::new(&a), IoVec::new(&b), IoVec::new(&c)]));

let mut buf = [0; 4];
let len = t!(s2.read(&mut buf));
// some implementations don't support writev, so we may only write the first buffer
if len == 1 {
    assert_eq!(buf, [10, 0, 0, 0]);
} else {
    assert_eq!(len, 3);
    assert_eq!(buf, [10, 11, 12, 0]);
}

However, the first buffer is the empty buffer, so some platforms don't write anything at all! From src/libstd/sys/sgx/net.rs:

    pub fn write_vectored(&self, buf: &[IoVec<'_>]) -> io::Result<usize> {
        let buf = match buf.get(0) {
            Some(buf) => buf,
            None => return Ok(0),
        };
        self.write(buf)
    }

The documentation for write_vectored leaves something to be desired:

Data is copied to from each buffer in order, with the final buffer read from possibly being only partially consumed. This method must behave as a call to write with the buffers concatenated would.

I'm not exactly sure what the expected behavior here is.

sfackler commented 5 years ago

Oh, oops, I forgot to fix the SGX implementation. It should find the first nonempty buffer: https://github.com/rust-lang/rust/blob/master/src/libstd/io/mod.rs#L535-L537

sfackler commented 5 years ago

59009

carllerche commented 5 years ago

I plan on issuing breaking change releases of a bunch of crates around the time async / await is stabilized. This will include crates that currently depend on iovec. It would be nice to be able to switch to the types in std. That would require iovec hitting stable around the same time as Future.

alexcrichton commented 5 years ago

@rfcbot fcp merge

I continue to at least personally be confident in this approach for implementing vectored I/O in the standard library, so I'd like to propose that we stabilize this. With https://github.com/rust-lang/rust/pull/59852 I believe we've also implemented it for all necessary types and such in the standard library.

rfcbot commented 5 years ago

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

SimonSapin commented 5 years ago

As of this writing, these are the APIs in master that point here:


pub struct IoVec<'a>(…);
pub struct IoVecMut<'a>(…);

impl<'a> fmt::Debug for IoVec<'a> {…}
impl<'a> fmt::Debug for IoVecMut<'a> {…}

impl<'a> IoVec<'a> {
    pub fn new(buf: &'a [u8]) -> Self {…}
}
impl<'a> IoVecMut<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {…}
}

impl<'a> Deref for IoVec<'a> {
    type Target = [u8];
}
impl<'a> Deref for IoVecMut<'a> {
    type Target = [u8];
}
impl<'a> DerefMut for IoVecMut<'a> {…}

pub trait Write {
    fn write_vectored(&mut self, bufs: &[IoVec<'_>]) -> Result<usize> {…}
}
pub trait Read {
    fn read_vectored(&mut self, bufs: &mut [IoVecMut<'_>]) -> Result<usize> {…}
}
rfcbot commented 5 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

withoutboats commented 5 years ago

I wish we could have these types be written as DSTs, but it doesn't seem realistic (given that the pointer is the second value on some platforms) or worth blocking on.

withoutboats commented 5 years ago

@rfcbot concern name

I want to raise a question actually: is iovec the right name for these types? We've just taken the name used by UNIX, which is something in the past we moved away from (renaming the various fs functions for example). In particular, I think IoVec is somewhat confusing because in Rust its pretty established that Vec specifically means our heap allocated buffer type. Perhaps it would be wise to rename them?

alexcrichton commented 5 years ago

That's true! @withoutboats do you have ideas for alternative names? I think one possibility is IoBuf but that doesn't jive well with PathBuf which is allocated on the heap as well. We could go with longer names as well like IoSlice or VectoredSlice as well perhaps?

Amanieu commented 5 years ago

IoGather/IoScatter? And rename the functions to write_gather/read_scatter.

sfackler commented 5 years ago

WsaBuf!

But yeah, something that indicates that it's a slice seems like a good idea.

SimonSapin commented 5 years ago

+1 for IoSlice, as it converts to and from a slice (of bytes).

I’m not sure how Vectored would be meaningful: as I read it the vec in iovec refers to a vector of bytes, not to the fact that you typically use more than one of them at a time. In the man page:

The pointer iov points to an array of iovec structures

… it’s each item of the “outer” array that’s called iovec, not the array itself.

sfackler commented 5 years ago

IoSlice/IoSliceMut or GatherBuf/ScatterBuf seem reasonable to me.

alexcrichton commented 5 years ago

I've personally always been slightly confused with the gather/scatter terminology although if I sit down and think hard about it then it makes sense. I'm warming up personally to IoSlice and IoSliceMut, and I think that'd jive well with the types themselves as we in theory want the APIs to take &[&[u8]] and &mut [&mut [u8]], but we need a platform-specific representation of the inner elements hence the Io part, but they're still slices

withoutboats commented 5 years ago

Yea I was thinking either IoSlice or IoBuf and the PathBuf thing puts it over the edge to the former I think.

I'd propose we change the name to ioslice and also make sure the docs for the two types are clear that they correspond to iovec on unix and wsabuf on windows.

sfackler commented 5 years ago

👍

The docs do already mention the iovec/WSABUF correspondence, but it could probably be called out better: https://doc.rust-lang.org/std/io/struct.IoVec.html

withoutboats commented 5 years ago

@rfcbot resolve name

rfcbot commented 5 years ago

:bell: This is now entering its final comment period, as per the review above. :bell:

carllerche commented 5 years ago

If the type does not use Vec in the name, should the fns use a different suffix than vectored?

sfackler commented 5 years ago

read_scatter and write_gather would probably be the other option.

alexcrichton commented 5 years ago

I would disagree that we need to rename the _vectored suffix on these methods because "vectored" is pretty different than Vec which is very well known in Rust. AFAIK the terminology for this form of I/O is either "vectored I/O" or "scatter/gather I/O", so having a suffix of "vectored" I don't think will lead to undue confusion that Vec should be involved somehow, especially when we already have read_to_vec which hasn't been historically confused for vectored I/O

rfcbot commented 5 years ago

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

sfackler commented 5 years ago

So it sounds like the changes we want to make are IoVec/IoVecMut -> IoSlice/IoSliceMut, and we'll retain the current name of the methods. Is that correct?

I'll make a stabilization PR if so.

alexcrichton commented 5 years ago

That was my read on this as well! And thanks @sfackler!

RReverser commented 4 years ago

Would it be worth to add write_vectored_all or is it out of scope of stdlib API?