I have a partial answer from @dwrensha to: does `eventual_io` pay any costs beyond `gj` for its greater abstraction via eventual? It'd be interesting if `gj` and `eventual_io` could be merged by making the Cap'n Proto RPC use `eventual_io` or something.
It appears `gj` wanted single-threaded performance with `Rc`s while `eventual` wants multi-thread safety with `Arc`s and `Send` bounds. At least this particular difference could be resolved with higher-kinded polymorphism.
@burdges: it's not just performance. Sharing `Rc` data among tasks on a single thread is less error-prone than sharing `Arc` data among multiple threads, because the yield points are explicit.
Interesting. I briefly searched for `Rc` vs `Arc`, but found little and didn't try that hard. I probably should've searched the RFCs specifically. Thanks!
I created the crate await as a placeholder for what we would like to have in the core. I also included a partial survey of what's currently out there in Rust, and what is built on top of what.
After looking at the abstractions the crates provided and the dependencies between them, it seems we have a crate for each kind of abstraction @chpio enumerated (and there seem to be no arguments against that list).
The first thing that stands out is that mio, or rather, event loops, are definitely an important building block throughout the other crates.
There also seem to be a few "shiny" + "shiny-io" crate pairs going around. Ideally all those "*-io" crates would become unnecessary with the "proper" core abstractions in place.
Contributions are welcome, encouraged, and very much needed!
@icorderi: I too believe that an executor (what you call event loop) is an important first step.
Recently my employer/client wanted to build a library that doesn't spawn hundreds of threads to maintain a few connections and asked me to design a new solution.
I started taking a look at current Rust abstractions. Pure mio is horrible because it doesn't compose. You cannot create different crates abstracting different protocols because they need to collaborate with each other (e.g. token clashing, message types to mio...). If you create crates low-level enough, you need too much boilerplate to integrate them.
Abstractions wrapping mio were too incomplete, too intrusive, or mixed up too many concepts within a single library (which breaks the single-responsibility principle and makes it unclear to reason about what happens, as there are no clear boundaries).
Then I started researching executor design (an executor is to function execution as an allocator is to allocation) and I liked the design of this particular proposal:
An executor that allows you to `post` functions is important because it lets you guarantee that your asynchronous operations' initiating functions always return immediately. This is a base guarantee for avoiding problems like unfairness, jitter, and starvation. An example of such unfairness is given in the paper http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0114r0.pdf (4.5, "When scheduling logic is embedded in the language").
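To make that guarantee concrete, here is a minimal single-threaded sketch of the idea - not Kohlhoff's actual design, and all names are illustrative. `post` only queues the closure, so the initiating call always returns immediately; the queued work runs later, inside `run`:

```rust
use std::collections::VecDeque;

struct Executor {
    queue: VecDeque<Box<dyn FnOnce()>>,
}

impl Executor {
    fn new() -> Self {
        Executor { queue: VecDeque::new() }
    }

    // `post` never runs `f` inline, so the initiating call returns at once;
    // this is the property that avoids the unfairness/starvation problems
    // cited above.
    fn post<F: FnOnce() + 'static>(&mut self, f: F) {
        self.queue.push_back(Box::new(f));
    }

    fn run(&mut self) {
        while let Some(task) = self.queue.pop_front() {
            task(); // run the queued work in FIFO order
        }
    }
}

fn main() {
    let mut ex = Executor::new();
    ex.post(|| println!("first"));
    ex.post(|| println!("second"));
    ex.run(); // prints "first" then "second"
}
```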
I started designing/experimenting with a mio wrapper that provides an executor, a timer queue, and a socket reactor: https://github.com/vinipsmaker/asio-rs/tree/master/examples
My colleague @canndrew was working on a design based on http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4045.pdf to allow a model that can be adapted to any async paradigm (callbacks, futures, coroutines...), but he eventually said he'd need higher-kinded types.
@vinipsmaker (shameless plug) I'm currently working on an async rewrite of tarpc built on top of mio. It allows registering multiple different tarpc services on a single event loop, and multiple clients on a single event loop. Here's a contrived example of a client calling a server which subsequently makes 2 calls to another server and waits until both calls are available before sending a reply to the client.
I'd be interested to hear if that suits your needs. It currently only works on nightly due to some feature flags being flipped on. Hopefully that won't be the case for long.
@tikue: The model you propose isn't a large improvement over mio's. I can already use pure mio (if I build a monolithic crate and forget about mio's problem of composing interfaces). There are other mio wrappers, and I don't need to build my own. My comment was more of a contribution toward a future standardized abstraction within the Rust core library. Within this discussion, I cannot see a hacky way around mio's current API as a good solution.
The first thing about mio is that it uses a reactive/passive model rather than a proactor/active model.
An active model makes the user the active party. This is an elegant way to answer questions like "how can I defer acceptance of new connections during server high-load scenarios?"
I also must say from my experience that reactive APIs are harder to design tests for (you need workarounds to identify which reaction maps to which action from the test).
There are also concerns about coroutine integration. Sure, coroutines are not the solution to everything, but many times blocking code is more readable just because you can keep using the language constructs (`if`, `while`...) for control flow. Coroutines are the only solution to asynchronous programming that keeps this property, as far as I know. More than one developer has reported to me that Boost.Asio's ability to start an operation within a certain async paradigm (e.g. coroutines) and then move to another model (e.g. callbacks) for the same I/O objects was quite useful.
EDIT:
There's a little inaccuracy around the example of deferring acceptance. mio is capable of that. However, I don't see the same property in your abstraction.
EDIT2:
A little documentation around your design would be good too. And I know it's unfair on my part to ask for documentation of your design when I haven't done the same for mine.
@vinipsmaker there's no fundamental reason tarpc can't support deferring acceptance. It's just not a high priority at the moment, and there's a lot of design space to consider. Regarding documentation, feel free to open issues where you believe it's lacking. As the async branch is still undergoing a lot of churn (it hasn't even merged to master yet), writing documentation isn't super-high priority at the moment.
@tikue: you're right. There is no fundamental reason why tarpc can't support deferring acceptance. My previous argument wasn't good and didn't explain the problem.
The problem is how much detail and how many customization points you need to put in your abstractions. If it's a reactive interface, then you start with a small abstraction and build complexity on top over time. "_I need a pool to recycle objects, let me add a `PoolFactory` and a `set_factory`... now I want to use objects from this array allocated on the stack and no more than X requests being handled at once, let me add... now I want deferred acceptance... and now..._". You'll add an interface for everything you think of, and the interface will never be enough. With an active style, the problem is solved and you don't need to do anything. You don't need to justify with "it's not a priority at the moment" because it's done from the moment the abstraction is usable.
The active style also maps very naturally to some other paradigms like futures and coroutines (an `accept` which returns a future or suspends the current coroutine).
cc @carllerche
It's dubious that anyone understands this design space well enough right now. Are there any proofs of deadlock freedom for a subset of coroutine flexibility? Just because I need complex shutdown logic do I need to forgo the advantages of restricting to futures? etc.
It sounds like the first question should be: what language features do projects like eventual, gj, rotor, and mioco need to better integrate with one another? At minimum, the lack of higher-kinded polymorphism contributed to the fork between eventual and gj, and appears to be limiting others as well.

> It sounds like the first question should be: what language features do projects like eventual, gj, rotor, and mioco need to better integrate with one another?

This would be the executor - exactly what async IO needs: one thread being shared by multiple tasks. The executor does that; it controls execution within this thread.

> At minimum, the lack of higher-kinded polymorphism contributed to the fork between eventual and gj, and appears to be limiting others as well.

This is my understanding as well.
@burdges I'm trying to answer exactly that.
The gist of the current state is:

```rust
/// Abstracting the ability to be resumed when something completes
trait Await<T> {
    fn await(ctx: &mut Context) -> T;
}
```
The `Context` is an abstraction related to what @vinipsmaker referred to as the "executor". The "executor" is essentially a `Context` factory that knows how to create `Thread`s or mioco coroutines or something else.
```rust
/// An execution context
///
/// This is a `Thread` on a threaded model, or a **mioco** `Coroutine`
trait Context {
    // Currently playing with what should go in here...
    // The primitives to do signaling and cross-context communication should be here
}
```
Ideally this would be backed by some sugar and allow us to write code like this:
```rust
async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8; 1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}
```
The desugared form can be based on continuations or a state machine, with the `Context` flowing through.
```rust
/// Example of desugaring the **await** syntax using continuations
pub fn foo_async_desugared<C: Context>(ctx: &mut C) -> impl Await<(String, TcpStream)> {
    TcpStream::bind_async("localhost:8123", ctx)
        .and_then(move |mut s, ctx| {
            let mut buff = [0u8; 1024];
            s.read_async(&mut buff, ctx)
                .map(move |c| (String::from_utf8(buff[..c].to_vec()).unwrap(), s), ctx)
        }, ctx)
}
```
@icorderi Is there a good reason that the syntax sugar for making async convenient should be separate from the syntax for something like a Python generator? As I understand it, generators are something at least some people want (including me, kind of).
I know Python ultimately decided that the answer is yes, but I'm not sure what their justifications were or if they apply to Rust. I knew at one point--I think it was mainly that Python is dynamic and you could accidentally return stuff you shouldn't--but haven't looked at the PEP in forever.
Perhaps they need to desugar to different things for zero cost abstraction.
@camlorn, yes, the sugar should not be the same; the two are semantically different.
Think of it without the "asynchronous" bit. [exaggerating] You are essentially saying that because a `Vec` can hold a single value, we should make all functions just return `Vec`.
An `Await<T>` is parallel to returning a `T`, while an `Iterator<T>` is parallel to `Vec<T>`.
On the other hand, the desugaring of `yield x` is quite "similar" to `let x = await foo()`, in the sense that both would likely come down to continuations or state machines.
The compiler's ability to break down a function into continuations / states can be used to support both features.
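As a rough illustration of the state-machine option (my sketch, not a committed design), here is a two-`await` function hand-compiled into an enum of its suspension points; `try_connect` and `try_read` are hypothetical nonblocking primitives that return `None` when not ready:

```rust
fn try_connect() -> Option<()> { Some(()) }
fn try_read() -> Option<String> { Some(String::from("response")) }

enum Foo {
    Connecting, // suspended at the first await
    Reading,    // suspended at the second await
    Done,
}

impl Foo {
    /// Advance the machine; `None` means "suspended, poll me again later".
    fn step(&mut self) -> Option<String> {
        loop {
            match *self {
                Foo::Connecting => match try_connect() {
                    Some(()) => *self = Foo::Reading, // first await resolved
                    None => return None,
                },
                Foo::Reading => match try_read() {
                    Some(s) => {
                        *self = Foo::Done;
                        return Some(s); // second await resolved; function returns
                    }
                    None => return None,
                },
                Foo::Done => return None, // already finished
            }
        }
    }
}
```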
@icorderi:

> /// Abstracting the ability to be resumed when something completes

It's pretty obvious that this is VERY tightly coupled to the concept of coroutines. Therefore, you're not answering the question "what are the core features that we need to better integrate all these async crates?", because you assume coroutine concepts right from the beginning.
While you advocate "awaitable concepts", I advocate "sharing the thread among multiple jobs/tasks/closures/..." (i.e. the executor).
And if you want to propose a coroutine implementation, you do not even need to integrate it with a thread-sharing/executor/... mechanism. The integration should be done by a third piece.
"What are the core features that we need to better integrate all this async crates?"
This is something that Christopher Kohlhoff (Boost.Asio author) has done for C++. He not only provided the foundation, but also provided all these "crates" (callbacks, coroutines, futures and the integration among them). And his model is not too tightly coupled like most would do, but rather extensible and open to new paradigms (you could implement blocking-style if you want or something new you think of). His library is really flexible and you could adopt several models for a project (single-threaded, one executor per thread each processing multiple clients, one thread per client...).
He is now proposing papers to have these abstractions integrated in the core C++ itself and the one thing to notice is that his abstractions are separate abstractions and not a big blob of something that can only be used together. executors, network interfaces, foundation for extensible (i.e. it supports callbacks, futures, coroutines...) async abstractions and stackless coroutines.
If we gonna discuss coroutines, I wish we would discuss coroutines only as these features are building blocks for other abstractions and not a suportive tool you'll only use glued with network code. For instance, in Python you have generators. Some folk from the C++ land are implementing even channels inspired by Go channels on top of coroutine interfaces.
@vinipsmaker,

> It's pretty obvious that this is VERY tightly coupled to the concept of coroutines.

I'll have to strongly disagree with that.
Being able to reason abstractly about a function call that will not return immediately/synchronously does not imply or require coroutines.
We do not need a coroutine to accomplish the following:
```rust
async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8; 1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}
```
`bind_async` and `read_async` operate on top of select/poll constructs. No coroutines involved.
Now, you are obviously not wrong either. If we had the ability to speak abstractly about asynchronous computations, we might as well give them the possibility of being scheduled on an execution context that is not just pure threads. Otherwise, we would be wasting a lot of potential. This is where coroutines come into play.
But I don't expect the `std` to have coroutines, just an abstraction for an execution context (thread, coroutine) so that external libraries can implement concrete models. The `std` would also `impl Context for Thread`.
For this to work, we need to make sure that pretty much everything under `std::sync` can be implemented on top of this `Context` abstraction (or has `*_async()` versions of the blocking calls), so that RwLocks, Mutexes, Barriers, and channels don't break other implementations of `Context` by blocking an entire `Thread`.
With this in place, `asyncio` can be an independent crate in the nursery, implemented on top of an `epoll/kqueue` crate (quite probably also in the nursery).
`mioco` can then be written on top of that to provide a `CoroutineContext` that allows us to run `Await`s on it.
`futures`/`eventual`/`gj`/etc. can all be written on top of `Await` and used on either `mioco` or `Thread`s; it all depends on whether you schedule the initial async call on a `Thread` or on a `Coroutine`.
@icorderi:
I quoted "abstracting the ability to be resumed when something completes" when I stated "it's very tightly coupled to the concept of coroutines". I restate the same phrase again, as you haven't shown me how "resumable functions" aren't closely related to the concept of coroutines.
And in your example:

```rust
async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8; 1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}
```
You mentioned `bind_async` and `read_async`, but what is interesting is the `await` construct. What does it do? Suspend the function to be resumed later? If that's what it does, then this is a coroutine. So yes, coroutines are involved.
Also, I'm not sure I understand your `Context` proposal. Could you explain it more, please?
@vinipsmaker
Take the following two snippets:

```rust
std::thread::Thread::spawn_async(foo_async)
mioco::Coroutine::spawn_async(foo_async)
```
In the `Thread` snippet, a new `Thread` gets created to execute `foo_async`; the `await`s inside `foo_async` will `park` the `Thread`, and it will stay there until the IO completes. In this case, the code should behave the same as the blocking version:
```rust
fn foo_async() -> String {
    let s = TcpStream::bind("example.com:8080").unwrap();
    let mut buff = [0u8; 1024];
    let c = s.read(&mut buff).unwrap();
    String::from_utf8(buff[..c].to_vec()).unwrap()
}
```
In the `Coroutine` snippet, the `await`s will `park` the `Coroutine`, allowing other coroutines to use the thread that freed up. Whenever the `await` inside `foo_async` completes, it will `unpark` the coroutine. Now a few things can happen here, and they are all design choices in how `mioco` wakes up a coroutine. If we assume a cooperative scheduler, then whenever the currently running code in the `Thread` that was backing the `Coroutine` yields/pauses/completes, `foo_async` will resume (assuming no other work was pending scheduling on that coroutine).
Maybe it's subtle, but abstracting a computation that will return at some later point in time (`Await`) is not the same as user threads (`Coroutine`s).
Although in reality, a world with just `Await` and no `Coroutine`s is kinda boring. On the other hand, a world with `Coroutine`s and no `Await` is the mess we live in right now, where we have no common abstraction to describe something that will return at some point in time.
Yes, `Await` and `Coroutine` are related in the sense that having one without the other is boring or an API mess. But they are not the same concept, nor does one require the other.
We already have `Coroutine`s. I'm not proposing a coroutine library, nor am I saying that the `std` should have a default coroutine implementation.
What I'm proposing is to abstract a computation that will return at some later point in time (`Await`) and to abstract an execution `Context`. The only execution `Context` in the std is a `Thread`: the same, very real 1:1 thread we have today.
We can have a zoo of `Coroutine` implementations with cooperative scheduling, preemptive scheduling, or whatever other flavor people want to publish on crates.io.
The important thing is that any code written to be async using `Await` will run on all the async libraries, and any `Await + Send` created inside some `Coroutine-X` crate can be tossed over to some `Coroutine-Y` crate and it will just work.
This also allows us to write an async library backed by coroutines that will continue to make forward progress even if someone decides to take a lock on a mutex, because that lock will freeze the `Context`, which might be a thread or a coroutine. The same thing applies to channels.
This is precisely the goal: to be able to interoperate between all these async libraries.
@vinipsmaker, did it make sense?
@icorderi:
Thanks for the explanation. Your proposal is much clearer now.
> abstracting a computation that will return at some later point in time (`Await`) is not the same as user threads (`Coroutine`s).

This is what futures & promises are for. In this case, the `await` is just syntax sugar for that [possibly blocking code]. There is a crate in Rust just for that: https://en.wikipedia.org/wiki/Futures_and_promises
If your proposal is not about futures & promises, could you explain the differences (aside from naming and other irrelevant differences)?
> Yes, `Await` and `Coroutine` are related in the sense that having one without the other is boring or an API mess. But they are not the same concept, nor does one require the other.

Your `Await` is your `Await` (i.e. a new concept). You're the authority on what it is and what it is not. I'm trying to understand what it is, and I'm trying to relate your explanation to things I know, using more standardized terms than "new" concepts.
Your example:
```rust
async fn foo_async() -> impl Await<String> {
    let s = await TcpStream::bind_async("example.com:8080");
    let mut buff = [0u8; 1024];
    let c = await s.read_async(&mut buff);
    String::from_utf8(buff[..c].to_vec()).unwrap()
}
```
There are two possibilities here:

1. `await TcpStream::bind_async("example.com:8080")` blocks. Then we're discussing non-async abstractions and I'm on the wrong thread.
2. `await TcpStream::bind_async("example.com:8080")` suspends the function (i.e. local variables are saved, and this is a coroutine no matter whether you're spawning it from another thread or the same thread). Coroutines are subroutines with two more operations, suspend and resume (and this is what I see here).

Now, your code has `await` (the keyword) and `Await` (the "object"/class/...). `await` is very related to coroutines. Please spend a little more time explaining if I'm wrong. And I'm aware that this is just syntax sugar for `Await`, and that `Await` should be usable without `await`.
If you want to pursue this route, you'll need to provide a solution for "futures islands" (futures with multiple implementations cooperating together). I'm aware of no such solution, and I'm aware of endless discussions on the Boost mailing list trying to come up with one. And it seems to me that you're trying to unify async code so threads and coroutines can share the same algorithms; therefore, I'm inclined to believe you'll need to solve this problem (futures islands) if you want to follow this route.
All in all, this doesn't solve one single problem: sharing a single thread among multiple jobs/tasks/closures/... This is the executor. We have mio, but mio alone is a mess to integrate with different crates (token clashing, only one possible implementation of `Handler` running...).
> and to abstract an execution `Context`

I'm still unsure about what this "execution `Context`" is.
> The important thing is that any code written to be async using `Await`

Not all problems are solved. We still need to "spawn" a few things and know in which threads they'll run. I'm not convinced until these other pieces are described well enough too.
> This is precisely the goal: to be able to interoperate between all these async libraries.
Agreed.
A little text of mine covering the topic of asynchronous code (including Kohlhoff's solution for interoperating multiple async paradigms): https://vinipsmaker.wordpress.com/2015/11/26/multitasking-styles-event-loops-and-asynchronous-programming/
@vinipsmaker thanks for taking the time to go back and forth on this. I appreciate the discussion and you raise very valid questions. I hope the following long post will provide you with answers.
"abstracting a computation that will return at some later point in time (Await), is not the same as user threads (Coroutines)." - @icorderi
This is what futures & promises are for. In this case, the await is just syntax sugar for that [possibly blocking code]. There is a crate in Rust just for that: https://en.wikipedia.org/wiki/Futures_and_promises
I'll have to disagree with you again. Just because it quacks, and with a nudge in the right direction it swims, that doesn't make it a duck.
A (`Future`, `Promise`) pair is a different abstraction than an `Await` + `Context`.
You can hold a `Future`, check if it's completed, wait for it to complete. Once it is completed, you can access it all you want. A `Promise` is used to fulfill / set / send the value to the corresponding `Future` it was created alongside.
You can think of an `Await` as an asynchronous `FnOnce`: once it's consumed, that's the value it produced. You can't really call it again; it doesn't cache the result for you, it doesn't notify you of completion, and you can't ask it if it's done (it will just finish when it finishes).
A `Future`/`Promise` pair is closer to a `channel()` than to an `Await`.
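To illustrate that comparison (a sketch using only `std`; the thread's `Promise`/`Future` types themselves are hypothetical), a oneshot use of `std::sync::mpsc::channel` behaves much like a promise/future pair - the sender fulfills, the receiver waits:

```rust
use std::sync::mpsc::channel;
use std::thread;

fn main() {
    // The sending half plays the role of the Promise,
    // the receiving half plays the role of the Future.
    let (promise, future) = channel::<i32>();

    thread::spawn(move || {
        promise.send(42).unwrap(); // fulfill the promise
    });

    let value = future.recv().unwrap(); // wait on the future
    println!("{}", value);
}
```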
Of course, if someone gives you a `Future` and you need the value now, you can `let x = await the_future`.
@vinipsmaker, when I say abstraction x is not y, I don't mean that you couldn't get to y from an x. I simply mean that semantically, they are not the same concept.
Let's look at some async options, from the user's perspective.

**Active**

- `Lazy::new<T, A: Await<T>>(a: A) -> Lazy<T>`.
- An `Await`? Then use something like `Future::from_async(..)` (this does not produce a `Promise`; the return of the `Await` is an implicit "set").
- `Future::new() -> Promise`.
- `let x = await foo()` will do the trick.
- `let (tx, rx) = channel()`

**Passive**

- `Event`
@vinipsmaker, I'll try to explain the `Context` based on the following quote from you:

> ...it seems to me that you're trying to **unify async code** so threads and coroutines could have the same algorithms.

That sentence, particularly the part I bolded, is precisely why the `Context` abstraction exists and what the `Context` abstraction is for.
Since we agree that a `Thread` and a `Coroutine` are not the same thing, a `Context` and a `Coroutine` cannot be one and the same, because a `Thread` is also a `Context`.
`Context` unifies `Thread`s, `Coroutine`s, etc. under a single abstraction.
By the way, what you state as the "futures islands" problem is exactly what gets solved with the `Await` and `Context` abstractions.
> All in all, this doesn't solve one single problem: sharing a single thread with multiple jobs/tasks/closures/...

Correct, this is not intended to solve that. This is about what basic abstractions should go into the `std` to support interoperability between different async crates.
I'll maintain my position: I don't believe we need `Coroutine`s in the `std`. We just need an abstraction that allows the community to create different jobs/tasks/closures/`Await` schedulers, schedulers that create `Context`s to execute these things on.
> Not all problems are solved. We still need to "spawn" a few things and know in which threads they'll run. I'm not convinced until these other pieces are described well enough too.
Fair enough.
As far as scheduler choices go, the choice should be made by the `main()` crate. That's the guy that is about to consume a big pile of `Await` stuff and can't really defer the decision any further.
```rust
fn main_async() -> impl Await<()> {
    // lots of awaits calling more awaits, spawning even more awaits
}

fn main() {
    ThreadPool::with_capacity(8).run_synchronously(main_async).unwrap();
    // or
    GreenPool::new(8).run_synchronously(main_async).unwrap();
}
```
In this example, the two schedulers, `ThreadPool` and `GreenPool`, are very different, and the `Context`s they produce are also different things.
A `ThreadPool` is created with 8 worker threads. A `Context` is still a very real `Thread` for that guy, so it could very much run out of `Context`s if all threads are waiting on something. If you queue 9 things, 1 will wait until another completes.
On the other hand, the `GreenPool` is also created with 8 backing threads, but its `Context`s would be `GreenThread`s. You can have a million `Await`s running in parallel (yes, only 8 will be making progress at any given time). With cooperative scheduling, every time one pauses waiting for something, another one takes a turn.
> Note: Libraries (as opposed to binaries) making a choice of a concrete scheduler should be something rare, and they should have a good reason for it (i.e. a library that will have thousands upon thousands of long-running `Await`s would not be happy with the `ThreadPool`).
@vinipsmaker, I hope the proposed abstractions - how they interact with each other, as well as how crates benefit from them to interoperate - start making more sense.
@vinipsmaker I think @icorderi's proposal could be implemented with a plain state machine, where each "function" maps to a state machine represented by a current-state variable and a struct/tuple of state-specific variables. In this case, something that appears syntactically as a call is actually the composition of nested state machines.
I think this is a generalization of a coroutine: in a coroutine, the state tuple is the current frame, the nesting is the call stack, and the current state is the program counter. But I wouldn't consider the general state-machine concept a "coroutine".
@icorderi I'm curious how this might work at the type-system level. E.g. (I see you already gave the definition of the `Await` trait above), what do you think the body of `trait Context` might look like, even if just as an example?
@glaebhoerl, I haven't been able to play with `Context` enough to answer that with certainty.
In order for this to work, `Context` should have the building blocks needed for coding the internals of the `std::sync` types (at least the `await`able versions).
The design of `Context` should allow a `Coroutine` implementation to never truly block, regardless of the code being executed by the coroutine (or stuff that code calls) using Locks, Barriers, Receivers, SyncSenders, etc.
For now, I mirrored most of the free functions in `std::thread` that operate on the current `Thread`, as well as `unpark()` from `Thread`.
```rust
/// Execution `Context` for `Await`'ables
pub trait Context {
    /// Atomically makes the handle's token available if it is not already.
    fn unpark(&self);
    /// When _awaited_, blocks unless or until the current context's token is made available.
    /// This method returns immediately.
    fn park(&mut self);
    /// When _awaited_, blocks unless or until the current context's token is made available
    /// or the specified duration has been reached (may wake spuriously).
    /// This method returns immediately.
    fn park_timeout(&mut self, dur: Duration);
    /// When _awaited_, puts the current context to sleep for the specified amount of time.
    /// This method returns immediately.
    fn sleep(&mut self, dur: Duration);
    /// When _awaited_, cooperatively gives up a timeslice to the scheduler.
    /// This method returns immediately.
    fn yield_now(&mut self);
}
```
I haven't gotten around to reading all the source code of `std::sync` to get a complete abstraction in my head. An initial dive into the `std::sync` code shows that it's built on top of `sys::common`, which contains thin wrappers over the OS synchronization structures.
Those are the base structures the `std::sync` module is built on, and they will block the thread - which means the `Context` abstraction needs to include them somehow.
It could be as trivial as:
```rust
pub trait Context {
    fn mutex(&mut self) -> Mutex;
    fn rwlock(&mut self) -> RwLock;
    fn condvar(&mut self) -> Condvar;
    // ...
}
```
Perhaps the `Mutex`/etc. constructors should be able to get the internal implementation from the `Context`:
```rust
impl<T> Mutex<T> {
    /// Creates a new mutex in an unlocked state ready for use.
    #[stable(feature = "rust1", since = "1.0.0")]
    pub fn new(t: T) -> Mutex<T> {
        Mutex {
            inner: box Context::current().mutex(), // was: StaticMutex::new()
            data: UnsafeCell::new(t),
        }
    }
}
```
A good exercise to get a tighter definition of `Context` is to start a `sync2` crate with `Mutex` and `Receiver` working with `Await`. The `Thread`ed version should have no extra overhead and end up doing the same thing the current implementations do. A `Coroutine` version should never block the underlying `Thread`.
@glaebhoerl, does this help you understand how it would work at the type system level?
@icorderi That's helpful, thanks.
One thing I suspect would be an issue with that approach is that it implies a closed universe of supported synchronization primitives -- the ones currently in `std`. Whereas I suspect the attitude of the people involved would be "those are just the ones we chose to include in `std`; anyone should be able to roll their own and put it up on crates.io", which no longer works so well if they all have to be centrally enumerated up-front in a trait. (This feels like the expression problem.)
But I'll leave that to people who are more knowledgeable about this stuff.
One thing I'd like to bring up is the C++ proposal N4453.
It defines an exceedingly low-level primitive, "resumable functions," which are sufficient to implement async/await, promises/futures, or cooperative green threading.
The proposal uses the "resumable" keyword in three places, which could be implemented independently if Rust were to take this approach:

- `resumable T` in function return types, like `constexpr T`. This declares that the function may suspend itself for later resumption.
- `break resumable;`, the actual suspension point.
- `resumable T` in variable declarations. This declares that the variable holds the result/state of a resumable expression.

However, all of the above are syntax sugar. What is actually required to implement the feature is an interface for resumable callables, much as `Fn`/`FnMut`/`FnOnce` are an interface for callables that do not suspend.
The interface would only require one method:
```rust
trait Resumable {
    type Output;
    fn resume(&mut self) -> Option<Self::Output>;
}
```
This `Resumable` trait describes the execution state of a resumable function. The `resume()` method produces `None` until the resumable computation is complete, at which point it produces `Some`. Interestingly enough, this is a near-perfect dual of `Iterator` - even up to the lack of guarantees should `resume()` be called again after it returns `Some`.
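For illustration, here is a hand-rolled instance of the `Resumable` trait above - a hypothetical toy computation that needs two resumptions before producing its result:

```rust
// A toy computation that suspends twice before completing.
struct TwoStep {
    step: u8,
}

impl Resumable for TwoStep {
    type Output = &'static str;

    fn resume(&mut self) -> Option<Self::Output> {
        self.step += 1;
        if self.step < 3 {
            None // still suspended; call resume() again later
        } else {
            Some("done") // computation complete
        }
    }
}
```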
A resumable function would then return some `R: Resumable`, `await` would be a trivial expansion using `match`, and suspension points would be simple syntax sugar in the same way closures are. The last could be left for later (pending compiler support), while the former parts require no special support whatsoever.
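As a sketch of that expansion (my reading, not the proposal's exact wording; it assumes `expr: impl Resumable<Output = T>` and elides the suspension mechanics):

```rust
// `let x = await expr;` could expand to roughly:
let x = {
    let mut r = expr;
    loop {
        match r.resume() {
            Some(v) => break v, // inner computation finished
            None => {
                // suspend the *enclosing* resumable function here;
                // this whole block is itself a suspension point
            }
        }
    }
};
```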
We had some discussions in Orlando last year, where we agreed on state-machine generators on top of something like mio for high-efficiency network servers.
Every time I see mentions of coroutines (as in "green threads"; Python muddled the distinction through what I think is terminology misuse) or boxed callbacks in async IO discussions, it makes me sad, because those actively impose limits on what you can actually do at any given time without running out of memory, not to mention slowdowns (although some virtual-call costs wouldn't be noticeable).
cc @pcwalton who knows more about the state machines + mio approach.
@eternaleye You're describing the other half of a generator, for which a minimal API would be:
```rust
pub enum Yield<T, R> {
    Value(T),
    Return(R),
}

pub trait Generator {
    type Value;
    type Return;
    fn next(&mut self) -> Yield<Self::Value, Self::Return>;
}
```
That is necessary to support both `yield expr` and `return expr`, although it's not enough to actually give input back to the generator, which introduces several complications.
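For concreteness, here is a hand-written instance of this `Generator` trait (a toy example, not from the proposal): it yields a countdown and then returns the running total, exercising both `Yield::Value` and `Yield::Return`:

```rust
// Counts down from `n`, yielding each value, then returns the total.
struct Countdown {
    n: u32,
    sum: u32,
}

impl Generator for Countdown {
    type Value = u32;
    type Return = u32;

    fn next(&mut self) -> Yield<Self::Value, Self::Return> {
        if self.n > 0 {
            let v = self.n;
            self.n -= 1;
            self.sum += v;
            Yield::Value(v) // corresponds to `yield v`
        } else {
            Yield::Return(self.sum) // corresponds to `return sum`
        }
    }
}
```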
@eddyb That's a very odd trait for `Generator`, IMO. I mean, your definition is just a `Result`-ified `Iterator`.
Meanwhile, I see real uses for `Iterator`s where `Item: Resumable` - in particular, if each value is expensive to compute and latency is a concern.
@eternaleye In a way, `Yield` itself is `Result`, but it makes more sense when your final return value is `io::Result<R>`, where `Iterator<Item = Result<(), io::Result<R>>>` seems very confusing and has three possible after-the-end behaviors (`None`, `Err(Ok(()))`, or panic).
Before hearing @pcwalton explain how simple mio can be with generators - where you never have to pass buffers in and out; you just try to read repeatedly and effectively `yield` on "would block" - I wrote this crazy contraption as a model for generators with multiple entry points, based on the types they expect to get back as input.
In a comment there is this playpen link for an example using the manually desugared implementation to show off "copying a file in chunks, with progress notification".
This model hits a wall when you have some construct that executes multiple `await` requests in parallel, because it would need a way to know which one succeeded, whereas in the mio model you just retry all of them and at least one will succeed.
Based on what I now know, I would probably implement the mio model with helpers for non-network IO to be fair to multiple network clients, in which case reading from a file could just sometimes result in an artificial "would block", causing different actions to also be attempted.
@eddyb: My point is that your definition of `Generator` is quite different from what I see almost anywhere else; the common definition seems to be admirably captured by `Iterator`.
Rather, I see `Iterator` and `Resumable` as duals in a very real, meaningful way:

- `Iterator` expands itself into multiple values by repeatedly invoking the same computation
- `Resumable` collects itself into a single computation by sequentially invoking potentially-distinct computations

A perfect example of something that maps well to `Resumable` is `Iterator::fold()` - each step of that maps perfectly to a call to `Resumable::resume()`, with the final step producing `Some`.
`Resumable` as described wouldn't work with generators, whereas @eddyb's `Yield` would.
@tikue: I'd suggest reading the linked C++ proposal more deeply - not only can resumables work with generators, the document contains an implementation of generators using resumables.
Finally gave the paper a read, and it's pretty much what I expected: a dirty value-stashing trick (which @eternaleye called "extending the iterator" - confusing out of context):
```rust
fn fib(mut n: usize, out: &mut Option<i32>) -> impl Iterator<Item = ()> {
    let mut a = 0;
    let mut b = 1;
    while n > 0 {
        n -= 1;
        *out = Some(a); yield (); // yield(a)
        let next = a + b;
        a = b;
        b = next;
    }
}

fn main() {
    let mut val = None;
    for _ in fib(5, &mut val) {
        println!("{}", val.take().unwrap());
    }
}
```
While this is uglier than actually having generators in the language (not to mention somewhat less efficient), it also poses borrowing problems: the generator object would borrow `val`, making it impossible under current borrow-checking to `.take()` from it in the loop.
Actually solving this problem requires knowledge of the generator object and/or its `Iterator::next` implementation, as in other cases mutating a mutably borrowed value can result in memory unsafety (iterator invalidation, really, if you replace the generator object with a `Vec` iterator).
@eternaleye I don't understand how `Iterator` and `Resumable` are supposed to be duals when their definitions are alpha-equivalent?
The actual dual to iterator is -- allegedly -- the observer pattern. (Linking the Google cache because the original seems to be broken.)
So @eddyb and I talked further on IRC, and I'm going to try and summarize (if I get anything wrong, just let me know).
IMO, the optimal form of asynchronous execution is one in which:

1. by default, calling an asynchronous function behaves just like calling a synchronous one;
2. the programmer can take explicit control of the asynchrony when needed;
3. synchronous code can call asynchronous code without viral keywords.

Implicit in this framing is that case 1 is more common than case 2 - most of the time, the programmer should not have to care whether something is synchronous.
Consider the following (strawman syntax):
```rust
async fn fib() -> Yield<i64> {
    let mut prev = 1i64;
    yield prev;
    let mut cur = 1i64;
    yield cur;
    while cur > 0 {
        let (old, next) = (cur, prev + cur);
        prev = old;
        cur = next;
        yield cur;
    }
}

async fn fib_prime() -> Yield<i64> {
    for i in fib().filter(is_prime) {
        yield i;
    }
}
```
Note that `fib()` is called as a perfectly normal function in `fib_prime()` - this is due to point (1); requiring a keyword here would clutter the majority of code.
`fib_prime()` may potentially run for quite some time before it yields something, which would cause responsiveness problems. This can be resolved by having `fib_prime()` suspend even when it would not yield - specifically, it suspends any time an `async fn` it calls would suspend. Here, that's `fib()` - this limits its worst-case execution between suspensions to one iteration of `fib()` and one call to `is_prime()`, rather than an unbounded number of the same.
Of course, there are times when one needs to explicitly drive a Generator. Here's such a case:
```rust
async fn foo() {
    let x = proc fib();
    for (i, v) in x.into_iter().enumerate() {
        println!("{}", i);
        if i % 2 == 0 {
            suspend;
        }
    }
}
```
This function has side effects and needs to ensure it's not interrupted between any pair of operations. To do this, it uses the `proc` keyword to denote that, rather than being interested in the value of the `fib()` function (regardless of execution semantics), it wishes `x` to hold "the `proc`ess by which `fib()` produces the values". It then converts the `Generator` into an `Iterator` and processes it accordingly - suspending itself at a safe time to do so.
This satisfies (2) - the ability to control asynchrony when needed.
Finally, for (3), the tools have already been introduced:
```rust
fn baz() {
    let x = fib_prime();
    for i in x {
        println!("{}", i);
    }
}
```
As a synchronous function has no suspension points, the behavior from (1) of suspending when `fib_prime()` suspends would be nonsensical - thus, calling an asynchronous function from a synchronous function returns the generator, in the same manner as using the `proc` keyword within an asynchronous function. This, together with the default behavior in (1), avoids the need for viral keywords.
In addition, as demonstrated in (2), generators convert cleanly to iterators. As a result, synchronous code that calls asynchronous code has essentially no syntactic overhead whatsoever.
However, in order to support this, the value returned from a Generator needs to support three possible cases, not just two (as @eddyb's suggestion had):
```rust
pub enum Yield<T> {
    Suspend,
    Value(T),
    Done,
}

pub trait Generator {
    type Value;
    fn next(&mut self) -> Yield<Self::Value>;
}
```
> Note: @eddyb's design has `Return` carry a value `R`, but it may not be worth it: with suspending, an async function is zero-or-more `Suspend`s followed by exactly one `Value`, while an async iterator is zero or more sequences of (zero-or-more `Suspend`s followed by zero-or-more `Yield`s) - the return value complicates conversion to an `Iterator` (if `R` is not `()`) or a `Fn{,Mut,Once}` (if `T` is not `()`).
In the case illustrated with (1), the "outer `Generator`" (`fib_prime()`) would drive the "inner `Generator`" (`fib()`) in lockstep - but when the inner `Generator` produces `Value`, the outer `Generator` will capture it and emit `Suspend` instead. On the next iteration, it will feed the captured value into its own update process.
In the case illustrated with (2), the outer `Generator` is wholly unaware of the inner `Generator` - the update process handles driving it entirely on its own.
In the case illustrated with (3), conversion to an `Iterator` simply wraps the `Generator` in a newtype, which is trivial and loops on receiving `Suspend`.
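A minimal sketch of that newtype, assuming the three-state `Yield`/`Generator` definitions above (note the `Suspend` arm simply keeps driving the generator, which is only sensible when re-polling can make progress):

```rust
struct GenIter<G: Generator>(G);

impl<G: Generator> Iterator for GenIter<G> {
    type Item = G::Value;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            match self.0.next() {
                Yield::Suspend => continue, // keep driving until progress is made
                Yield::Value(v) => return Some(v),
                Yield::Done => return None,
            }
        }
    }
}
```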
May I mention -- coroutines/generators are closely related to the `IO` monad, and `await` syntax is just another form of do notation. See #243, in particular this comment by @Eilie.
@bugaevc I really wish someone had made a nice blog post about this, because rehashing the argument every time is tiresome and pointless: monads are a poor fit for Rust's low-level and imperative semantics.
Among other things:

- `Option` wants `FnOnce`; anything which calls the closure more than once wants `FnMut` or `Fn`, depending
- `Monad::bind` has to be generic over the closure type, but that means the implementer can't keep the closure around without boxing it (unlike, say, the `map` adapter on iterators, which gets static dispatch)

@dwrensha tried to explain how generators can be used in their blog post, but there's still lots of work to do to refine the details of the various approaches and then compare them.
Until someone can actually show how to implement useful monad-like abstractions in Rust that can compete with generators + `?`/`catch`, I remain unconvinced about the value of that front.
@eddyb Sure, I understand monads don't fit into Rust as easily as they do into Haskell.
I really like `?`/`catch` and can't wait for them to reach stable. And yeah, they were relatively easy and straightforward to implement without having to dive into this lambda / state-machine nightmare.
I have some experience with async stuff in Python, so I do understand both how generators work and how I/O coroutines can be implemented on top of them.
What I want to emphasize is that implementing await/yield syntax is essentially the same as implementing do notation (with `FnOnce`s).
`?` was a trivial special case that could possibly have been implemented as do notation, but it was easy enough to implement the straightforward way. This time it is the general case. There is no simple way to do it without plunging into monads and do notation, and it has exactly the same set of problems.
Not really, since it's limited to `FnOnce`s, and that gets away from most of the problems -- but does Rust really need the totally general case? It seems to me that `FnOnce`s would be enough.
Sure, the implementor can't keep the closure around without boxing it -- but neither can the I/O loop (or whatever machinery implements async I/O) keep around the implicit closure that implements `trait Generator` (the coroutine/generator continuation). It's a one-to-one mapping; that's what I'm trying to say.
@bugaevc But a generator has N states in a single type, whereas monads/CPS have one closure type per state.
> but neither can the I/O loop (or whatever machinery implements async I/O)

But I can allocate an array of a million instances of a single generator ahead of time, with no indirection. This is the future of massively scalable services: minimal computing overhead, maximal throughput.
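To illustrate the point (my sketch; `Conn` is a hypothetical stand-in for a compiler-generated generator state machine), a million instances can sit in one flat allocation and be driven in place:

```rust
// Stand-in for a compiler-generated state machine: one concrete type,
// a few bytes per instance, no boxing.
#[derive(Clone, Copy)]
enum Conn {
    Idle,
    Reading { bytes_read: usize },
    Done,
}

fn main() {
    // One contiguous allocation holding a million state machines,
    // with no per-instance indirection or pointer chasing.
    let mut conns = vec![Conn::Idle; 1_000_000];

    // Drive every machine in place.
    for c in conns.iter_mut() {
        *c = match *c {
            Conn::Idle => Conn::Reading { bytes_read: 0 },
            Conn::Reading { bytes_read } if bytes_read >= 1024 => Conn::Done,
            other => other,
        };
    }
}
```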
@eddyb can you elaborate? I can't really picture when you'd want to run a million copies of the same service. I actually think running many different services on a single event loop is going to be the more common use case. Any service that delegates to another service for part of its work will need to do this. This is actually something I'm struggling with in tarpc, where I box services on the event loop to enable heterogeneity.
@eddyb: I don't want to sidetrack the discussion at hand, but someone did write an excellent blog post about it (monads and Rust) some time ago.
@krstoff That addresses only my first point and pretends it can eschew the second (which is not true, not while keeping the monad consistently typed). And it's already pretty complex, sporting a huge feature-requirement list.
First, this thread has been very, very interesting to follow, and I really hope Rust can bring a standard solution to async (or at least something that lets the many crates out there play nicely together). I can't help but take a stab at this stuff myself (maybe yet another crate will show up from me?).
This thread seems to have wandered around for a while, I think because everyone already has in their own mind how they want the abstraction of async to look - whether it's callbacks, coroutines, or what have you. None of those are strictly required for async; they are good abstractions over something that can be confusing to work with at times, but they are not required for low-level interoperability.
The issue is this: we want some operation to begin, and we don't want to "block" the code we're in. There are lots of ways to solve this. Here's (another) summary:
Coroutines: These require compiler support, and/or really, really fancy macros, to break the linear flow of execution. When we either `yield` or `await` on something, we want to suspend the current coroutine and jump to code elsewhere. When the value is ready, we ought to jump back to where we were and resume execution.
Question for those already thinking about these: for an expression `let x = await y;`, what is a valid type for `y` such that it can be "awaited" on? We can do some trickery to optimize coroutines calling other coroutines (to "flatten" them), but otherwise we need some sort of representation of a future value... which leads to...
Futures/promises/awaitables (there are many names for similar concepts): Basically like a `Result<T, E>`, but with three states:
```rust
pub enum Awaitable<T, E> {
    Pending,
    Ok(T),
    Err(E),
}
```
Of course this implementation won't work as-is, because the awaitable must be mutated from `Pending` to `Ok` or `Err` once the value becomes available, but this definition shows the core idea. This concept isn't completely at odds with other paradigms, unless you need to bypass an actual `Awaitable` for the sake of performance (as in `yield from`).
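One minimal way to make that mutation work (a sketch, not a proposed API; all names here are hypothetical) is to share the state between the consumer's handle and whatever completes the operation:

```rust
use std::cell::RefCell;
use std::rc::Rc;

enum State<T, E> {
    Pending,
    Ok(T),
    Err(E),
}

// Consumer-facing handle; the event loop holds a clone of `state`.
struct Awaitable<T, E> {
    state: Rc<RefCell<State<T, E>>>,
}

impl<T, E> Awaitable<T, E> {
    fn new() -> Self {
        Awaitable { state: Rc::new(RefCell::new(State::Pending)) }
    }

    // Called by the event loop when the operation finishes.
    fn complete(&self, value: T) {
        *self.state.borrow_mut() = State::Ok(value);
    }

    fn is_pending(&self) -> bool {
        matches!(*self.state.borrow(), State::Pending)
    }
}
```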
Callbacks: Everyone seems to agree that callbacks are ugly; I do as well. They quickly become a hot mess to work with and cause code to become really messy. However, callbacks have the advantage of being approach-agnostic (with a few exceptions). For example, `Awaitable`s could use callbacks to handle the completion of the value (adapting a previous example):

```rust
let mut buff = [0u8; 1024];
TcpStream::bind_async("example.com:8080").result(|s| {
    s.read_async(&mut buff)
}).map(|c| {
    String::from_utf8(buff[..c].to_vec()).unwrap()
})
```
Not the cleanest, but not all that different from working with a bunch of `Result`s. Not being able to use pattern matching might be annoying, but we could minimize the callbacks with a signature like `fn result(then: FnOnce(Result<T, E>) -> Awaitable<T, E>) -> Awaitable<T, E>;` and work with the `Result` directly.
As noted by those knowledgeable in the Python world, `async`/`await` was introduced because of the ever-so-slightly different semantics from generators, and to avoid muddying coroutines and generators together. We may want to keep them separate right off the bat to prevent this problem before it starts.
I think `await` would be the best solution, but I'd rather take advantage of Rust's honest-typing precedent and avoid the `async` keyword. Instead of
```rust
async fn do_something<R: Read>(r: R) -> Result<String, ?> {
    (await r.read_async(1024)).map(|buf| String::from(buf))
}
```

why not this?

```rust
fn do_something<R: Read>(r: R) -> Awaitable<String, ?> {
    await r.read_async(1024).map(|buf| String::from(buf))
}
```
Food for thought.
@icorderi's proposal seems almost useless to me, unless I am missing something?
```rust
trait Await<T> {
    fn await(ctx: &mut Context) -> T;
}
```
For this to be useful at all requires some sort of hidden semantics capable of suspending the execution context, in a rather preemptive manner. This signature promises no such suspension and is misleading. The only way for this to work is to:

- block inside the `await()` function call until the value arrives. Doesn't this defeat the point of trying to introduce async? Just do a regular blocking call and save the overhead.

@eternaleye's trait definitions for generators seem the most realistic to me, but again, do we want to muddy generators and coroutines together? If so, that's an acceptable stance, just not one I particularly agree with.
Sources: My own experience in implementing async over the years and working hard to bring it to PHP. Sorry for the wall of text.
> do we want to muddy generators and coroutines together? If so, that's an acceptable stance, just not one I particularly agree with.

It's all just control flow... what's the rationale for segregating them? (Honest question for my own edification.)
I much prefer the terminology I've seen used during ES Harmony, before ES6 chose a generator design:

- expression `yield` (i.e. passing a value into the generator step function) is an optional feature
- a coroutine is a `yield`-using function in a resumable call stack, which can also contain regular functions

This makes "stackless coroutine" either an oxymoron or an interprocedural function -> generator pass.
What Python did, IIUC, was call generators "coroutines" when they also had expression `yield`.
However, expression `yield` has no measurable implementation cost in the stackful cases, and none in stackless state-machine transforms performed on CFGs (which is what MIR is), only in AST transforms.
We can waste our time either by using misguided terminology variants or by being very explicit every single time (e.g. "stackless resumable functions with inputs/outputs"), agree on less-bad terms, or invent completely new ones. But, you know, naming things is a hard problem and too much of a bikeshed.
I was talking to eddyb about this last night. A few things that occurred to me:

- `trait Cycle3<T, U, V>: FnOnce(T) -> (U, impl Cycle3<V, T, U>) {}` should work.
should work.@Ericson2314 I'm not entirely sure how cycling works, I haven't yet looked into that. However: I should point out that the complexity would be mostly at the type-level, not the state-machine transform. We can provide both session types and an iterator interface at the same time, using the same MIR-based transformation, but different type-level handling.
@eddyb you said you don't like https://gist.github.com/eddyb/822c658190ccf18058db so much anymore? What is your latest MIR-transformation idea?
Rust currently only includes synchronous IO in the standard library. Should we have async IO as well, or leave that to an external crate? What would the design look like?
Moved from https://github.com/rust-lang/rust/issues/6842
Related: https://github.com/rust-lang/rfcs/issues/388