rust-lang / wg-async

Working group dedicated to improving the foundations of Async I/O in Rust
https://rust-lang.github.io/wg-async/
Apache License 2.0

Actor-system related questions #90

Open uazu opened 3 years ago

uazu commented 3 years ago

Brief summary

Some stuff can't be written in a blocking model, even a non-blocking blocking model like async/await. By blocking I mean that when you do an async call, your coroutine blocks, and the called object is also blocked on that one piece of work until it returns asynchronously. An obvious example that doesn't fit this is a network stack layer where events come in from both below and above and also from timers. All these events have to be responded to immediately. Blocking (or logically blocking) just won't work.

Doing some kind of a "select" on the calling side solves the "only one outgoing call" problem, and the "being called blocks an object" (i.e. "only one incoming call") problem can probably be solved by having multiple proxy objects for your main object, so you don't block the main object. But this is all a very roundabout way of getting the required behaviour.

So this is where the actor model comes in. I don't know whether you want to discuss the actor model in this review, but the subject keeps on coming back. As the author of the Stakker crate, I am very happy to contribute to the discussion if it is of interest. Here are some subjects you might wish to cover in your review:

Different models of actor system in relation to async/await:

Impedance mismatch between async/await and actor model:

So I guess these are the questions this raises:

For example, could we make async/await suitable for actor-like tasks? The fundamental problem is that the state self is locked during the .await. If more than one coroutine could access self at the same time (i.e. interleaved at yield points) then the problem of blocking the actor queue would be solved. (If this could be done with only static checks, i.e. no runtime RefCells or whatever, so much the better.) However, maybe this is just completely incompatible with the async/await model and so not possible, in which case an external actor system is the only way to handle these kinds of problems.

For example, stuff of interest related to async/await for my own low-level actor system (Stakker):

Optional details

Tell me if you want me to write this up, i.e. whether this (or any parts of it) are subject areas of interest, and where in your framework for this review it should fit.

uazu commented 3 years ago

I think the best way I can contribute right now is to try to add async/await support to Stakker, with Stakker working as an executor. (This was already planned.) Then write it up as a status quo, I guess for some fictional runtime, assuming I get enough of it done within the time limits for this review. Since Stakker was implemented before async/await stabilised, it wasn't designed around the same assumptions, so it should be a reasonable test-case.

One question I have is: How big is the executor-independent async/await ecosystem that I can expect to be able to interface to? If there was (in future) a way for crates to advertise that they support running (partially or fully) across runtimes, e.g. some tag or fixed phrase, then that would be useful.

Stuff I should look at supporting:

Anything else?

nikomatsakis commented 3 years ago

I'm trying to think how to turn this into a story -- I'd love to read more about it.

uazu commented 3 years ago

I have a couple of suggestions for stories:

I don't want to sound critical, since I've been coding with the never-blocking actor model for 10 years and it's natural to me. Sequential coding is much more familiar: just look at the relative popularity of Go vs Pony. So I can totally understand the motivation for Rust's async/await. But maybe these are some of the trade-offs for that familiarity. In the first 3-4 pages of my Stakker design notes I briefly go over how I ended up back with the actor model again, even though I was trying something different in Rust. The main thing is never-blocking, so there can't ever be a deadlock, and the borrow checker forcing shallow stacks, which means it's impossible to construct code with a reentrant call. So I guess whatever I do, I keep coming "home" to the actor model, although that isn't planned.

I don't think shallow stacks are the only way, but to avoid reentrant calls you need a way to defer a call to a queue. The actor model gives you the wiggle-room to do that, because all the inter-actor calls are defined as async.
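As a rough illustration of "defer a call to a queue", here is a minimal std-only sketch. The names `Counter`, `Queue`, `defer` and `deferred_demo` are invented for this example and are not Stakker's API. A handler that would otherwise call itself re-entrantly pushes a closure onto a FIFO queue instead, so each call runs from the top of the run loop with exclusive access to the state.

```rust
use std::collections::VecDeque;

/// Hypothetical actor state, invented for this sketch.
struct Counter {
    value: u32,
    log: Vec<String>,
}

/// A deferred call: runs later with exclusive access to the actor, and
/// receives the queue so that it can defer further calls itself.
type Call = Box<dyn FnOnce(&mut Counter, &mut Queue)>;

/// FIFO queue of deferred calls (a stand-in for a runtime's main queue).
struct Queue {
    calls: VecDeque<Call>,
}

impl Queue {
    fn new() -> Self {
        Queue { calls: VecDeque::new() }
    }

    /// Queue a call instead of making it immediately.
    fn defer(&mut self, f: impl FnOnce(&mut Counter, &mut Queue) + 'static) {
        self.calls.push_back(Box::new(f));
    }

    /// Run queued calls one at a time until the queue is empty.
    fn run(&mut self, actor: &mut Counter) {
        while let Some(call) = self.calls.pop_front() {
            call(&mut *actor, &mut *self);
        }
    }
}

impl Counter {
    fn increment(&mut self, queue: &mut Queue) {
        self.value += 1;
        self.log.push(format!("increment -> {}", self.value));
        if self.value < 3 {
            // Rather than calling `self.increment(queue)` from inside
            // itself, defer the follow-up call to the queue.
            queue.defer(|actor, queue| actor.increment(queue));
        }
    }
}

/// Drives the sketch and returns the final value and the log.
fn deferred_demo() -> (u32, Vec<String>) {
    let mut actor = Counter { value: 0, log: Vec::new() };
    let mut queue = Queue::new();
    queue.defer(|actor, queue| actor.increment(queue));
    queue.run(&mut actor);
    (actor.value, actor.log)
}
```

Calling `deferred_demo()` runs three increments, each from the run loop rather than nested inside the previous one, so the stack stays shallow and no call ever sees the state mid-update.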

Regarding having N calls outstanding at the same time from an actor (or to an actor), I'm not sure how to make that into a story. I know that it really suits some areas of application, although perhaps it's not necessary for most. So you can fire off N calls (or equivalently send N messages) and trust that you'll get the responses back at some point. (Stakker guarantees that you'll get a response even if the called actor fails and the Ret is dropped.)
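One way to picture the "you always get a response" guarantee is a single-use reply handle that reports `None` when it is dropped without being used. The sketch below is only an illustration of that idea under assumed names (`Reply`, `send`, `replies_demo`); it is not Stakker's `Ret` implementation.

```rust
use std::sync::mpsc;

/// Hypothetical single-use reply handle, invented for this sketch.
struct Reply<T> {
    callback: Option<Box<dyn FnOnce(Option<T>)>>,
}

impl<T> Reply<T> {
    fn new(f: impl FnOnce(Option<T>) + 'static) -> Self {
        Reply { callback: Some(Box::new(f)) }
    }

    /// Deliver a real response, consuming the handle.
    fn send(mut self, value: T) {
        if let Some(cb) = self.callback.take() {
            cb(Some(value));
        }
    }
}

impl<T> Drop for Reply<T> {
    /// If the handle is dropped unused, the caller still hears back.
    fn drop(&mut self) {
        if let Some(cb) = self.callback.take() {
            cb(None);
        }
    }
}

/// Fires off three calls, answers two of them and drops the third.
fn replies_demo() -> Vec<Option<u32>> {
    // A channel is used here only to collect the responses in order.
    let (tx, rx) = mpsc::channel();
    let mut outstanding = Vec::new();
    for n in 1..=3u32 {
        let tx = tx.clone();
        let reply = Reply::new(move |response: Option<u32>| {
            tx.send(response).unwrap();
        });
        outstanding.push((n, reply));
    }

    // The "callee" side: call 2 fails, so its handle is simply dropped.
    for (n, reply) in outstanding {
        if n == 2 {
            drop(reply);
        } else {
            reply.send(n * 10);
        }
    }

    let responses: Vec<Option<u32>> = rx.try_iter().collect();
    responses
}
```

All three calls are outstanding at once, and the caller receives exactly three responses, with `None` standing in for the call whose handle was dropped.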

The alternative coroutine model which I hope to implement for Stakker, which I've called actor coroutines, also seems hard to turn into a story. I've been coding a long while using this model in Lua. This gives a sequential coding model, but for actors. The coroutine has direct access to the actor state (i.e. Self) when it runs, and can only live as long as the actor. Logically the 'resumes' are driven (behind the scenes) by "messages" received by the actor (although in reality these are just FnOnces on a queue like everything else). One significant difference from async/await is that since this is a never-blocking system, when the coroutine yields, normal actor behaviours and other actor coroutines for the same actor may run. So the coroutine has to give up its &mut Self reference on each yield. Some more notes here. (This is proving difficult to implement on top of async/await, since I want to avoid dynamic checks.)
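To make "give up the &mut Self reference on each yield" concrete, here is a hand-written state machine standing in for such a coroutine. This is only a sketch of the borrowing pattern, not how Stakker or the Lua version implements it, and `Actor`, `SumTwo`, `Step` and `coroutine_demo` are invented names. The coroutine stores only its own locals; the actor state is lent to it for the duration of one `resume` call, so other code can touch the actor between resumes with no runtime checks.

```rust
/// Hypothetical actor state, invented for this sketch.
struct Actor {
    total: u32,
    in_progress: bool,
}

/// Result of resuming the coroutine once.
#[derive(Debug, PartialEq)]
enum Step {
    Yielded,
    Done,
}

/// Coroutine that waits for two inputs and adds their sum to the actor.
/// It holds its own locals but never a reference to the actor.
enum SumTwo {
    Start,
    WaitingForSecond { first: u32 },
    Finished,
}

impl SumTwo {
    /// The actor is borrowed only for this call; the borrow ends when the
    /// coroutine yields or completes.
    fn resume(&mut self, actor: &mut Actor, input: u32) -> Step {
        match std::mem::replace(self, SumTwo::Finished) {
            SumTwo::Start => {
                actor.in_progress = true;
                *self = SumTwo::WaitingForSecond { first: input };
                Step::Yielded
            }
            SumTwo::WaitingForSecond { first } => {
                actor.total += first + input;
                actor.in_progress = false;
                Step::Done
            }
            SumTwo::Finished => Step::Done,
        }
    }
}

/// Resumes the coroutine twice, with ordinary actor code in between.
fn coroutine_demo() -> (Step, Step, u32) {
    let mut actor = Actor { total: 0, in_progress: false };
    let mut coroutine = SumTwo::Start;

    let first = coroutine.resume(&mut actor, 1);
    // While the coroutine is suspended, a normal actor behaviour can
    // mutate the same state, checked statically by the borrow checker.
    actor.total += 100;
    let second = coroutine.resume(&mut actor, 2);

    (first, second, actor.total)
}
```

The difficulty described above is that an `async` block cannot be written this way directly: a `&mut` passed into it is captured by the future and held across the `.await`, rather than being handed back at each suspension point.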

The other story which I hope to contribute to is:

This is pretty well-defined already. Someone at some point has to decide that some of these interfaces are mature enough and well-enough tested to pull into the standard library. I don't have a high-enough perspective to judge that, but perhaps I can give some more data. I'm looking forward to getting into the detail of this.

nikomatsakis commented 3 years ago

Something I've been thinking over -- that seems to be a latent theme in a few stories -- is "environmental state". That is, having access to some shared resources which are "released to the wild" during an await.

If we make this an &mut parameter in async await today, those resources get captured by the future, which isn't really what people want. You can do an Arc but then you need mutexes or ref-cells and that's not especially ergonomic. You really don't want the ref-cell to be locked over an await, either.
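A small std-only sketch of the `Arc` plus mutex workaround mentioned here, polled by hand so that no runtime is assumed. `YieldOnce`, `NoopWake`, `add_twice` and `shared_state_demo` are names invented for this example. Each task locks the shared state only for a single statement, so the guard is released before the `.await` and the other task can use the state in between.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns `Pending` once, then `Ready`.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

/// Touches the shared state before and after an await point. The lock
/// guard is a temporary, so it is dropped at the end of each statement
/// and is never held across the `.await`.
async fn add_twice(shared: Arc<Mutex<Vec<&'static str>>>, name: &'static str) {
    shared.lock().unwrap().push(name);
    YieldOnce(false).await;
    shared.lock().unwrap().push(name);
}

/// Interleaves two tasks by polling each of them twice.
fn shared_state_demo() -> Vec<&'static str> {
    let shared = Arc::new(Mutex::new(Vec::new()));
    let mut a = Box::pin(add_twice(Arc::clone(&shared), "a"));
    let mut b = Box::pin(add_twice(Arc::clone(&shared), "b"));

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // First poll of each task runs up to its await point.
    assert!(a.as_mut().poll(&mut cx).is_pending());
    assert!(b.as_mut().poll(&mut cx).is_pending());
    // Second poll of each task runs it to completion.
    assert!(a.as_mut().poll(&mut cx).is_ready());
    assert!(b.as_mut().poll(&mut cx).is_ready());

    let order = shared.lock().unwrap().clone();
    order
}
```

This works, but it shows the ergonomic cost being described: the state has to be wrapped, every access goes through a lock, and keeping the guard out of the await is a convention the programmer must follow rather than something the types enforce.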

I have to go digging, I feel like I've seen echoes of this theme in a few places.

It seems relevant to actors because I imagine the actor's state itself kind of fits this.

uazu commented 3 years ago

As I understand it, the requirement has been captured for generators in this issue.

uazu commented 3 years ago

I've documented the first part of implementing an executor on top of my actor runtime: https://uazu.github.io/blog/20210406.html

The different perspective might be interesting. I consider using GhostCell with futures. I don't have any conclusions yet as I've only done the basic stuff so far.

matklad commented 3 years ago

I want to add that “some stuff can't be written in a blocking model, even a non-blocking blocking model like async/await” is a somewhat profound fact, which isn’t really widely known. For example, only this year I was able to put my finger on a specific problem:

https://matklad.github.io/2021/04/26/concurrent-expression-problem.html

(Stakker design notes were instrumental to my understanding, thanks!)

uazu commented 3 years ago

For some reason people want to write Go in Rust, so they're just going to have to learn the hard way all the places where that's not a good idea. Yes, it is convenient for a certain class of problem. For other problems they are going to find it very hard to find a clean solution, and perhaps wonder why. Layering improvised actors on top of channels and an async/await runtime seems a very poor workaround to me, with its own unique problems that a low-level actor model system doesn't have. I can't be blogging all the time to communicate this, although I might try again at some point.

I haven't taken the async/await work for Stakker any further yet because I have another unrelated open-source crate that I'm trying to complete and get out the door in spare moments.