rust-lang / futures-rs

Zero-cost asynchronous programming in Rust
https://rust-lang.github.io/futures-rs/
Apache License 2.0
5.43k stars 628 forks source link

Add AsyncRead::poll_read_buf based on RFC 2930 #2209

Open taiki-e opened 4 years ago

taiki-e commented 4 years ago

Add the poll_read_buf method to AsyncRead based on the API proposed by https://github.com/rust-lang/rfcs/pull/2930 (tracking issue: rust-lang/rust#78485).

pub trait AsyncRead {
    fn poll_read_buf(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<io::Result<()>>;
}

pub struct ReadBuf<'a> {
    buf: &'a mut [MaybeUninit<u8>],
    filled: usize,
    initialized: usize,
}

impl<'a> ReadBuf<'a> {
   // ...
}

This replaces the existing read-initializer API, but it's not yet stable, so maybe it needs to added as an unstable feature.

Nemo157 commented 4 years ago

Also poll_read_buf_vectored, and a bunch of utilities (and update a bunch of the existing utilities to use it where possible).

Personally I think this should wait for an unstable ReadBuf available in std, and definitely done as an unstable feature (I doubt the viability of exposing this stably now with a local ReadBuf and updating to use std's ReadBuf in the future without breaking changes).

Here's an existing, very quick, probably buggy, implementation from a while ago (so has an out-of-date ReadBuf API).

dignifiedquire commented 4 years ago

The RFC was merged, so seems it makes sense to start getting this in now?

Nemo157 commented 4 years ago

I still think there's no point in copy-pasting ReadBuf and we may as well wait to use std::io::ReadBuf once that's merged (to avoid multiple breaking changes). Tokio 0.3 already has the ecosystem experimenting with this design which will hopefully flush out enough issues with it.

EDIT: I also see an in-progress implementation, so I don't think it'll be too long till it lands on nighly.

dignifiedquire commented 4 years ago

tokio 0.3 already has the ecosystem experimenting with this design which will hopefully flush out enough issues with it.

Isn't the version from Tokio inconsistent with the rfc, as they use ReadBuf directly in the read method?

dignifiedquire commented 4 years ago

I still think there's no point in copy-pasting ReadBuf and we may as well wait to use std::io::ReadBuf once that's merged (to avoid multiple breaking changes).

Well that means we can only ship it on nightly and not on stable, which seems unfortunate but fine I guess.

Nemo157 commented 4 years ago

Isn't the version from Tokio inconsistent with the rfc, as they use ReadBuf directly in the read method?

std::io::Read must maintain backwards compatibility with existing implementations by adding a second method for ReadBuf, which will result in a relatively confusing API requiring you to delegate in every implementation. There's no restriction like that on AsyncRead so it can go with only a ReadBuf based API (it's possible to implement and call it the same as &mut [u8] with just a tiny bit of ceremony).

yoshuawuyts commented 4 years ago

This is likely material for a meeting, but the premise that we may issue breaking changes to AsyncRead does not mean that we should. Not only would this proposal be hugely disruptive; parity between Read and AsyncRead enables us to solve issues in both traits in a uniform way. From the RFC:

Users shouldn't be required to manually write a version of read that delegates to read_buf. We should be able to eventually add a default implementation of read, along with a requirement that one of read and read_buf must be overridden.

If we expect we can implement this for Read we should be able to implement this for AsyncRead. After which the difference in the way the traits are are used would forever remain an unfortunate gotcha. Even if we expect the final solution to require as-of-yet-unspecified language additions, we could always introduce lints against incorrect usage before then.

For clarity, this is what I'm currently discussing:

// ✔ The `Read` baseline, `read_buf` extends the base trait
trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
    fn read_buf(&mut self, buf: &mut ReadBuf<'_>) -> io::Result<()>;
}

// ✔ Similar to `Read`, `read_buf` extends the `AsyncRead` base trait
trait AsyncRead {
    fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut [u8]) -> Poll<Result<usize>>;
    fn poll_read_buf(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<Result<usize>>;
}

// ❌ Breaks `AsyncRead`, usage patterns no longer shared with `Read`
trait AsyncRead {
    // unclear how to read into `&mut [u8]`
    fn read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<Result<usize>>;
}

As what we're designing here is expected to serve as the foundation for future stdlib extensions, we should emphasize long-term consistency over limited short-term convenience. The more usage patterns align, the easier it'll be to evolve the stdlib over time. And for Rust programmers to internalize how to use it. Which is why I am strongly in favor of introducing poll_buf_read as an additive change only.

dignifiedquire commented 4 years ago

I agree with @yoshuawuyts and really hope we can keep the existing AsyncRead apis. Changing these will break large parts of the async ecosystem and cost us a lot of good will from folks that suddenly need to change their implementations.

I think it is great that the RFC already figured out how to do this in a non breaking way, so we should leverage that as much as we can.

algesten commented 4 years ago

I also agree with @yoshuawuyts. I believe the goal is to get AsyncRead into std, in which case parity between Read and AsyncRead is important.

Nemo157 commented 4 years ago

This is likely material for a meeting, but the premise that we may issue breaking changes to AsyncRead does not mean that we should. Not only would this proposal be hugely disruptive

We have one guaranteed breaking change coming up, when AsyncRead moves into std. I suppose what you're saying is that you want this to be a simple s/futures::io/std::io breaking change rather than one that requires implementation changes? My hope would be to get this API available on nightly (and minimize breaking changes to it on nightly so that it can be experimented with relatively reliably) and then coordinate with the eventual async IO RFC to have any actual switchover occur together as part of that one breaking change.

It could even be a two-step process with a transition phase where we add a second default-implemented method as you outlined, during the time std::io::ReadBuf is stable and std::io:AsyncRead is unstable. That would give time for implementors to move to implementing AsyncRead::poll_read_buf and they just drop their AsyncRead::poll_read implementation during the breaking change.

For clarity my expectation would be to keep the naming consistent (this is something I consider to be a bug in Tokio 0.3's translation, I don't see any discussion of it in the PR and it might be related to unfortunate conflicts with bytes::Buf related methods):

trait AsyncRead {
    fn poll_read_buf(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<Result<()>>;
}

The answer to the "unclear how to read into &mut [u8]" would be the same as now, AsyncReadExt::{read, read_exact} depending on if you want to fully fill the slice (and to read into a ReadBuf within async context, AsyncReadExt::read_buf). If necessary we could also have AsyncReadExt::poll_read for non-async context, but I think it should be avoided, it's possible to very easily translate usage using only the safe APIs of ReadBuf (which can still give the performance advantages of uninit memory).

[...] parity between Read and AsyncRead enables us to solve issues in both traits in a uniform way. From the RFC:

This is a "Future Possibility", there's no guarantee that it is going to be implemented. If we just avoid the issue in the first place then there's no need to solve it at all.

One of my hopes is that as soon as Read::read_buf is stabilized we get a Clippy lint recommending to implement it preferentially over Read::read, I could even see eventually deprecating it from being implemented somehow. I have found a ReadBuf style API just such a better one to work with that I think we should go all in on it.

yoshuawuyts commented 4 years ago

@Nemo157 thanks for clarifying! — I'm glad we're in agreement on the shape and name of the API. A deviation in usage patterns between the two APIs was my primary concern, and that's been alleviated.

On the question of which method should be implementable by default on AsyncRead I feel less strong – good arguments seem to exist for either direction. I think overall you lay out a pretty convincing argument to default to poll_read_buf.

We have one guaranteed breaking change coming up, when AsyncRead moves into std. I suppose what you're saying is that you want this to be a simple s/futures::io/std::io breaking change rather than one that requires implementation changes?

@Nemo157 I suspect we could take the same approach here as what we're looking to do for Stream: at compile-time detect whether std::stream::Stream is reachable in a build.rs file. For older Rust versions that don't define Stream we can provide our own definition. I think this could even be applied to resolve the overlapping definitions of Stream::next and StreamExt::next. Which if accurate would mean upgrading the ecosystem to use std::stream::Stream could be done without any breaking changes at all. So I'm not sure if the transition of AsyncRead to be part of the stdlib will require any breaking changes unless we decide to.

taiki-e commented 4 years ago

I think this could even be applied to resolve the overlapping definitions of Stream::next and StreamExt::next.

No, this is breaking change. If you have code that imports only StreamExt, doing this will render .next() unusable.

Nemo157 commented 4 years ago

So I'm not sure if the transition of AsyncRead to be part of the stdlib will require any breaking changes unless we decide to.

If this is possible I'm on board with going the non-breaking route, just if we have an upcoming breaking change I think we should consider hard on whether it's worth going to poll_read_buf-only at the same time.

taiki-e commented 4 years ago

(Marked as blocked until implemented in std. The API proposed in RFC has some issues and is likely to change when it is actually implemented in std.)

erickt commented 2 years ago

FYI, it looks like the initial implementation of ReadBuf landed a few days ago in nightly https://github.com/rust-lang/rust/pull/81156. The tracking ticket https://github.com/rust-lang/rust/issues/78485 is still open though, so no idea when it'll stabilize.

oxalica commented 1 year ago

I found that std's read_buf is not (and will not) be designed to be compatible with completion based I/O. https://github.com/rust-lang/rust/issues/78485#issuecomment-1115818564 If we just copy the borrowed ReadBuf type from sync read_buf, it may cause troubles in the future when io_uring is more and more wildly used.