rkuhn / acto

Actor library for Rust

core design question #1

Open rkuhn opened 3 years ago

rkuhn commented 3 years ago

My goal with this repo is to experiment with actor formulations in the specific context of Rust’s language features in order to distill the essence of an actor API into such a minimal form that it may eventually become a standard that many executors/dispatchers/mailboxes implement. This may be regarded as naïve or arrogant; please see it only as an experiment with a decent failure probability.

Prior Art

So far I looked at

All these have in common that an actor

Exploring a new part of the design space

Over the years my thinking may have been influenced by reading too many session types papers, as these are often based on channels and processes, using the π-calculus or some variant thereof. I’m asking myself whether a combination may not be fruitful — as far as I know there are no practically relevant theoretically proven properties of the actor model that we could accidentally violate by such a union.

So how about:

Facilitating this requires us to give the actor access to its mailbox. This access can be used to convey awareness of its self-reference (for inclusion in sent messages) or its execution mechanism (for potentially spawning new actors onto the same set of resources). Rust’s type system is flexible enough to make these properties optional in the raw Actor API: an Actor could declare the need to perform such actions by requiring the Mailbox to implement additional traits.

In this view, an actor is nothing more than a stateful async function on top of a queue. An alternative formulation would be as an async handler function, which would even allow us to remove access to the mailbox (reintroducing a Context parameter) but which would require all state to be passed from one closure to the next — or async traits.
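To make the "stateful function on top of a queue" formulation concrete, here is a minimal sketch. It is a synchronous stand-in: `std::sync::mpsc` and a thread take the place of an async mailbox and executor, and the names `CounterMsg`, `counter_actor` and `demo_counter` are hypothetical, not part of acto.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Messages understood by the hypothetical counter actor.
pub enum CounterMsg {
    Add(u64),
    /// Ask for the current total; the reply goes to the enclosed sender.
    Get(Sender<u64>),
}

/// The actor: a plain function that owns its state and pulls from its mailbox.
/// It stops once every sender handle has been dropped.
pub fn counter_actor(mailbox: Receiver<CounterMsg>) {
    let mut total = 0u64; // actor state lives in a local variable
    while let Ok(msg) = mailbox.recv() {
        match msg {
            CounterMsg::Add(n) => total += n,
            CounterMsg::Get(reply) => {
                // Ignore the error if the asker has gone away.
                let _ = reply.send(total);
            }
        }
    }
}

/// Spawns the actor, sends a few messages, and returns the reported total.
pub fn demo_counter() -> u64 {
    let (actor_ref, mailbox) = channel();
    let handle = thread::spawn(move || counter_actor(mailbox));
    actor_ref.send(CounterMsg::Add(2)).unwrap();
    actor_ref.send(CounterMsg::Add(40)).unwrap();
    let (reply_tx, reply_rx) = channel();
    actor_ref.send(CounterMsg::Get(reply_tx)).unwrap();
    let total = reply_rx.recv().unwrap();
    drop(actor_ref); // closing the last sender lets the actor terminate
    handle.join().unwrap();
    total
}
```

In this form the sender half of the channel plays the role of the ActorRef, and the actor needs no Context parameter because it holds the mailbox itself.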


Before diving into API design, the above question should be explored in some detail. @huntc @huitseeker, I would be most interested in your opinions!

huntc commented 3 years ago

Thanks for including me in this conversation!

On this point:

an actor pulls the next message from its mailbox — which may be done selectively, based on mailbox implementation

I’m unsure why this would be useful. Can you please elaborate? Thanks.

huntc commented 3 years ago

On this:

In this view, an actor is nothing more than a stateful async function on top of a queue. An alternative formulation would be as an async handler function, which would even allow us to remove access to the mailbox (reintroducing a Context parameter) but which would require all state to be passed from one closure to the next — or async traits.

For reference, this is essentially what I’ve done with Stage. I’m using an FnMut for my closure so state isn’t passed from one closure to another.

rkuhn commented 3 years ago

Right, at that point it becomes a somewhat arbitrary syntactic choice whether the loop { ... } should be explicit within the actor or implied by the execution mechanism. The interplay of borrow checker, FnMut, and async/await may be a deciding factor; I’d want the API as well as the compiler errors to be as intuitive as possible.
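The two spellings can be put side by side in a small sketch. It is synchronous and uses a `VecDeque` as the mailbox purely for illustration; `run_handler`, `explicit_loop` and `demo_loops` are hypothetical names and do not come from Stage or acto.

```rust
use std::collections::VecDeque;

/// Implied loop: the execution mechanism owns the loop and calls the handler
/// once per message. `FnMut` lets the closure mutate the state it captured.
pub fn run_handler<M>(mailbox: &mut VecDeque<M>, mut handler: impl FnMut(M)) {
    while let Some(msg) = mailbox.pop_front() {
        handler(msg);
    }
}

/// Explicit loop: the same behaviour with the loop written inside the actor.
pub fn explicit_loop(mailbox: &mut VecDeque<u32>) -> u32 {
    let mut sum = 0;
    while let Some(msg) = mailbox.pop_front() {
        sum += msg;
    }
    sum
}

/// Runs both variants on the same input and returns both results.
pub fn demo_loops() -> (u32, u32) {
    let mut sum = 0; // state captured mutably by the handler closure
    let mut first: VecDeque<u32> = VecDeque::from(vec![1, 2, 3]);
    run_handler(&mut first, |msg| sum += msg);

    let mut second: VecDeque<u32> = VecDeque::from(vec![1, 2, 3]);
    (sum, explicit_loop(&mut second))
}
```

Both variants compute the same result; what differs is who owns the loop, and therefore where borrow checker errors will show up once `async` is added.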

If you asked about “selectively”: This was just a thought, exploring how much freedom we can afford a Mailbox implementation — PriorityMailbox is still somewhat frequently asked for in Akka. Erlang-style selective receive should become unnecessary — and better handled without message leaks — when the actor can basically suspend its “main mailbox” while waiting for some Future to return.
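As a rough illustration of how much freedom a Mailbox implementation could have, the sketch below puts a FIFO and a priority mailbox behind one trait. The trait and type names are assumptions made for this example; they are not Akka's or acto's API.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, VecDeque};

/// Hypothetical minimal mailbox interface; the actor only ever calls `pull`.
pub trait Mailbox<M> {
    fn push(&mut self, msg: M);
    fn pull(&mut self) -> Option<M>;
}

/// Plain first-in, first-out delivery.
pub struct FifoMailbox<M>(VecDeque<M>);

impl<M> Mailbox<M> for FifoMailbox<M> {
    fn push(&mut self, msg: M) {
        self.0.push_back(msg);
    }
    fn pull(&mut self) -> Option<M> {
        self.0.pop_front()
    }
}

/// Delivers the smallest message (by `Ord`) first.
pub struct PriorityMailbox<M: Ord>(BinaryHeap<Reverse<M>>);

impl<M: Ord> Mailbox<M> for PriorityMailbox<M> {
    fn push(&mut self, msg: M) {
        self.0.push(Reverse(msg));
    }
    fn pull(&mut self) -> Option<M> {
        self.0.pop().map(|Reverse(m)| m)
    }
}

/// An "actor" that records the order in which it sees messages.
pub fn run_actor<M>(mailbox: &mut impl Mailbox<M>) -> Vec<M> {
    let mut seen = Vec::new();
    while let Some(msg) = mailbox.pull() {
        seen.push(msg);
    }
    seen
}

/// Feeds the same messages (priority, label) into both mailboxes.
pub fn demo_mailboxes() -> (Vec<(u8, &'static str)>, Vec<(u8, &'static str)>) {
    let msgs: [(u8, &'static str); 3] = [(2, "log"), (0, "shutdown"), (1, "work")];
    let mut fifo = FifoMailbox(VecDeque::new());
    let mut prio = PriorityMailbox(BinaryHeap::new());
    for m in msgs {
        fifo.push(m);
        prio.push(m);
    }
    (run_actor(&mut fifo), run_actor(&mut prio))
}
```

The actor code in `run_actor` is identical for both; only the mailbox decides the delivery order.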

huitseeker commented 3 years ago

Thanks for starting this discussion!

Bathtor commented 3 years ago

@rkuhn I mean, if you are looking for a very minimalist model that gives you nice typing guarantees and a kind of channel-focused view, I think Reactors are still a pretty sweet model. They give you the option of a pull-based approach (as you said, it's just a question of explicit loop or not, essentially), they give you typing, composition, channel selection/suspension, different channel implementations, and you can pass channels into other functions to change behaviour, for example (think finite state-machines).

I suppose my only concern with exposing this much of the underlying functionality to your users is that for every power to change the execution flow you give them, you are removing your own powers to abstract away complex interactions. For example, Kompact can do the kind of blocking future you mentioned above where you suspend messaging queues while waiting for the result to be available. But to make that happen in a consistent and reliable way it's relying very tightly on the lifecycle model which is tied into the code that decides which channel to handle from next and when to reschedule a component, etc. It's a lot of tricky, concurrency- and performance-sensitive code there, that a more exposed model would essentially expect its users to write. So we are probably talking about a very narrow target group for this kind of API.

rkuhn commented 3 years ago

Great point(er)s!

Fowler’s Mixing Metaphors

Yes, I was aware of this paper and have now read it again; it is a deep and comprehensive description. My takeaway is that whole-system typing and guarantees are possible, it’s “just work”, which is a fine result! On the other hand, this is not what I am after. I find the actor model overwhelmingly useful for certain cases, but I don’t expect to find a completely actor-based environment into which I can embed this part of my program: my target environment has many different pre-existing libraries and components (like the Rust ecosystem) which are built using a variety of communication and synchronisation primitives.

The language & standard library offer one core set of features that benefit many of these implementations: the ability to efficiently wait for something (where efficiency is geared towards considering the per-OS-thread overhead to be significant).

One other observation (unless I skimmed too quickly) is that the whole-system view forces selective receive, and in order to avoid or mitigate type pollution we need ad-hoc sum types (anonymous unions of message types), which sadly are absent from Rust; only explicitly declared enums are available. What I have in mind doesn’t run into the type pollution problem because I view the actor as a unit of encapsulation for everything inside, including how it uses other components, be that sync or async, channel or future or what have you. This is consistent with my strengthened desire for local autonomy: an actor should be free to use whatever means it wants to compute its next behaviour.

Kompact / @Bathtor’s thesis

A component is an effectively single-threaded piece of logic that has an array of communication primitives at its disposal. What I am missing in that model is the ability to “stop the world until I have done this thing”. Formulated differently, I have seen that writing a single simple state machine is something programmers can grasp (if they are willing!), but state machines don’t naturally compose very well, so a state machine model that forces multiple machines to live inside one shared container and code module can easily reach overwhelming cognitive complexity. It also is usually very hard to test. My (unfounded) opinion on this topic is that simple state machines are an awesome tool, but capturing fine-grained data across multiple process steps quickly gets out of hand. The types also don’t solve this issue because they get even more complex; in contrast, my opinion on types is that they need to be significantly simpler than the value-level implementation to be useful.

Update after the above: So how does Kompact do this? I have only skimmed your thesis, so if this is in there a page number would be most helpful! As a user I might expect to be able to use communication over ports in order to compute an actor response, but how would I even describe this in plain English? “This particular interaction expects this and that response, please give me those and hold off on delivering anything else”? I could imagine doing that by modelling all communication paths in the infrastructure and then offering awaitable queries on their internal receive buffers. Messages would need correlation IDs, which probably need to be known to user code so that interactions are not unnecessarily restricted in their structure (e.g. with non-Kompact channel-based libraries).


To spell out a bit more of what I envision, I assume that there are pre-existing, established, well-working components and libraries that an actor may want to work with. The unifying interface is Future and async/await. I assume that all such “external” interactions can be implemented by just awaiting the right Future, which seems like a desirable interface also from a typed functional programming perspective, hiding every detail of the asynchronous function’s implementation. Actors participate in this ecosystem by way of the ask pattern, which means sending them an awaitable one-shot channel.
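The ask pattern with an awaitable one-shot channel can be sketched with the standard library alone. Everything below is an assumption for illustration: `oneshot`, `ReplyTo`, `Reply`, the tiny `block_on` executor and the `Query` message are hypothetical, and a real project would more likely take these pieces from an async runtime.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Shared slot of the one-shot channel: the value and the waiting task's waker.
struct Slot<T> {
    value: Option<T>,
    waker: Option<Waker>,
}

pub struct ReplyTo<T>(Arc<Mutex<Slot<T>>>); // sending half, goes into the message
pub struct Reply<T>(Arc<Mutex<Slot<T>>>); // awaitable half, stays with the asker

pub fn oneshot<T>() -> (ReplyTo<T>, Reply<T>) {
    let slot = Arc::new(Mutex::new(Slot { value: None, waker: None }));
    (ReplyTo(slot.clone()), Reply(slot))
}

impl<T> ReplyTo<T> {
    /// Consumes the handle, so at most one reply can ever be sent.
    pub fn send(self, value: T) {
        let mut slot = self.0.lock().unwrap();
        slot.value = Some(value);
        if let Some(waker) = slot.waker.take() {
            waker.wake();
        }
    }
}

impl<T> Future for Reply<T> {
    type Output = T;
    // Note: this sketch pends forever if the `ReplyTo` is dropped unused.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut slot = self.0.lock().unwrap();
        match slot.value.take() {
            Some(value) => Poll::Ready(value),
            None => {
                slot.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Minimal executor: parks the current thread until the future is woken.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        thread::park();
    }
}

/// The only message of the hypothetical actor: a number and where to reply.
pub enum Query {
    Square(u64, ReplyTo<u64>),
}

pub fn demo_ask() -> u64 {
    let (actor_ref, mailbox) = channel::<Query>();
    let actor = thread::spawn(move || {
        for Query::Square(n, reply) in mailbox {
            reply.send(n * n);
        }
    });
    let answer = block_on(async {
        let (reply_to, reply) = oneshot();
        actor_ref.send(Query::Square(7, reply_to)).unwrap();
        reply.await // the asker only sees a Future, not the actor behind it
    });
    drop(actor_ref);
    actor.join().unwrap();
    answer
}
```

From the asker's side the actor is indistinguishable from any other async function, which is the point of using Future as the unifying interface.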

Another thought regards the type pollution problem: with all “internal” interactions safely tucked away the only remaining pollution arises from multiplexing different services onto a single “business end”, which may be convenient at times. This issue can be solved with sum types in general; in Rust it requires tailored composition in the form of an enum and the ability to contra-map an ActorRef.
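A possible shape for "an enum plus the ability to contra-map an ActorRef" is sketched below. The `ActorRef` type, its `contramap` method and the message enums are hypothetical and only meant to show the idea; they are not the API of acto or of any existing library.

```rust
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};

/// Hypothetical handle: anything that can accept a message of type `M`.
pub struct ActorRef<M>(Arc<dyn Fn(M) + Send + Sync>);

impl<M: 'static> ActorRef<M> {
    pub fn new(f: impl Fn(M) + Send + Sync + 'static) -> Self {
        ActorRef(Arc::new(f))
    }

    pub fn tell(&self, msg: M) {
        (*self.0)(msg)
    }

    /// Builds a narrower reference that wraps each `N` into an `M` before sending.
    pub fn contramap<N: 'static>(
        &self,
        f: impl Fn(N) -> M + Send + Sync + 'static,
    ) -> ActorRef<N> {
        let inner = self.0.clone();
        ActorRef::new(move |n: N| (*inner)(f(n)))
    }
}

pub enum LogMsg {
    Line(String),
}
pub enum MetricMsg {
    Count(u64),
}
/// The "business end": one mailbox type multiplexing both services.
pub enum ServiceMsg {
    Log(LogMsg),
    Metric(MetricMsg),
}

pub fn demo_contramap() -> Vec<String> {
    let (tx, rx) = channel::<ServiceMsg>();
    let tx = Mutex::new(tx); // makes the sender shareable between references
    let service: ActorRef<ServiceMsg> = ActorRef::new(move |m| {
        let _ = tx.lock().unwrap().send(m);
    });
    // Clients only ever see the narrow message type they care about.
    let log_ref: ActorRef<LogMsg> = service.contramap(ServiceMsg::Log);
    let metric_ref: ActorRef<MetricMsg> = service.contramap(ServiceMsg::Metric);
    log_ref.tell(LogMsg::Line("started".into()));
    metric_ref.tell(MetricMsg::Count(3));
    drop((service, log_ref, metric_ref)); // drop all senders so `rx.iter()` ends

    rx.iter()
        .map(|m| match m {
            ServiceMsg::Log(LogMsg::Line(s)) => format!("log: {s}"),
            ServiceMsg::Metric(MetricMsg::Count(n)) => format!("metric: {n}"),
        })
        .collect()
}
```

The enum is the tailored composition; `contramap` keeps the wider `ServiceMsg` type from leaking to clients that only speak `LogMsg` or `MetricMsg`.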


I need to take a break here, will look into Ray and Riker later.

Bathtor commented 3 years ago

A component is an effectively single-threaded piece of logic that has an array of communication primitives at its disposal. What I am missing in that model is the ability to “stop the world until I have done this thing”. Formulated differently, I have seen that writing a single simple state machine is something programmers can grasp (if they are willing!), but state machines don’t naturally compose very well, so a state machine model that forces multiple machines to live inside one shared container and code module can easily reach overwhelming cognitive complexity. It also is usually very hard to test. My (unfounded) opinion on this topic is that simple state machines are an awesome tool, but capturing fine-grained data across multiple process steps quickly gets out of hand. The types also don’t solve this issue because they get even more complex; in contrast, my opinion on types is that they need to be significantly simpler than the value-level implementation to be useful.

Update after the above: So how does Kompact do this? I have only skimmed your thesis, so if this is in there a page number would be most helpful! As a user I might expect to be able to use communication over ports in order to compute an actor response, but how would I even describe this in plain English? “This particular interaction expects this and that response, please give me those and hold off on delivering anything else”? I could imagine doing that by modelling all communication paths in the infrastructure and then offering awaitable queries on their internal receive buffers. Messages would need correlation IDs, which probably need to be known to user code so that interactions are not unnecessarily restricted in their structure (e.g. with non-Kompact channel-based libraries).

Yes, I absolutely agree with what you just said. I simply didn't get around to adding "stop and wait until I have done X" before I finished my PhD. But it was literally the first thing I picked up last year once I started working as a Research Engineer. I've always found myself a very heavy-handed user of ask+future chains in Akka over the years and really wanted to bring this experience, just better, to Kompact.

This will feature in a paper I'm currently working on, but for now the best description I have is in the Kompact tutorial.

So basically I wanted to offer two ways of working with this:

  1. Handle futures as if they were just a way of handling messages with some additional context. This basically means that normal message handling continues in parallel, but whenever you make progress on a future, you have all the state from the last invocation available and you are guaranteed exclusive, mutable access to the parent component's internal state. This latter thing was something I really felt needed fixing from Akka, where future onComplete handlers from ask are run on their own actor and accidentally closing over the internal state of an Akka actor is a concurrency bug. The way I implemented this was basically to translate the future's wake handling into an actual internal message on the actor's internal control queue, and whenever a message arrives there, just execute the future instead of a normal message handler.
  2. Stop executing everything until a future is completed, essentially a block_on. While this is awful for performance-sensitive code, you are absolutely right that interleaving a bunch of parallel state machines with some shared state within a component can become incredibly unwieldy. This "blocking" interaction allows programmers to write code that basically looks and behaves as if it's sequential, without doing crazy stuff like blocking actual threads from doing other work in the meantime. This one I implemented by means of a new special lifecycle state called BLOCKING, which is entered when a future f is being "blocked on". At this point f is stored in an internal field and no other handler will be executed until it is completed and the component leaves the BLOCKING state. In this case wake events don't need to send any extra messages, since it's already clear which future is to be executed, so they will simply schedule the component to be executed.
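The second mode can be illustrated with a toy model. This is not Kompact's implementation: `Component`, `State`, `step` and `Countdown` are hypothetical names, scheduling is simulated by calling `step` in a loop, and the waker does nothing.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {} // the demo loop below re-schedules unconditionally
}

/// Hypothetical lifecycle: handling messages, or blocked on a single future.
enum State {
    Active,
    Blocking(Pin<Box<dyn Future<Output = String>>>),
}

pub struct Component {
    state: State,
    mailbox: VecDeque<String>,
    pub log: Vec<String>,
}

impl Component {
    pub fn new() -> Self {
        Component { state: State::Active, mailbox: VecDeque::new(), log: Vec::new() }
    }

    pub fn tell(&mut self, msg: &str) {
        self.mailbox.push_back(msg.to_string());
    }

    /// Enters the blocking state; the future is stored inside the component.
    pub fn block_on(&mut self, fut: impl Future<Output = String> + 'static) {
        self.state = State::Blocking(Box::pin(fut));
    }

    /// One scheduling step; returns false when there was nothing to do.
    pub fn step(&mut self) -> bool {
        let waker = Waker::from(Arc::new(NoopWaker));
        let mut cx = Context::from_waker(&waker);
        if let State::Blocking(fut) = &mut self.state {
            // While blocked, only the stored future runs; the mailbox is untouched.
            let polled = fut.as_mut().poll(&mut cx);
            if let Poll::Ready(result) = polled {
                self.log.push(format!("future done: {result}"));
                self.state = State::Active;
            }
            return true;
        }
        match self.mailbox.pop_front() {
            Some(msg) => {
                self.log.push(format!("handled: {msg}"));
                true
            }
            None => false,
        }
    }
}

/// Test future that returns `Pending` for the given number of polls.
struct Countdown(u32);
impl Future for Countdown {
    type Output = String;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<String> {
        if self.0 == 0 {
            return Poll::Ready("reply".to_string());
        }
        self.0 -= 1;
        cx.waker().wake_by_ref(); // ask to be scheduled again
        Poll::Pending
    }
}

pub fn demo_blocking() -> Vec<String> {
    let mut c = Component::new();
    c.tell("a");
    c.step();
    c.block_on(Countdown(2));
    c.tell("b"); // arrives while blocked and has to wait
    while c.step() {}
    c.log
}
```

Message "b" is only handled after the future has completed, which is the sequential-looking behaviour described above. The first mode would instead route wake-ups through a control queue and keep handling ordinary messages in between.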

Now you can see this whole thing is a bit tacked onto Kompact's primary event+message APIs, so the performance of this isn't quite as stellar as the core API's. But it's still working pretty well and allows you to write some really easy to understand code for certain use cases. I think if we were to design a programming model with this kind of interaction in mind first and foremost, performance could be a lot better by tailoring the design around this. For example, if everything is an asynchronous queue (I think Rust calls these Stream these days) internally, then non-blocking vs. blocking futures interaction is really no different from normal message queues vs priority message queues with FSM-like states, like I mentioned above, where you could simply ignore certain queues for a while until you leave that state.

rkuhn commented 3 years ago

Riker looks to be a rather close translation of Akka to Rust, interesting!


Thanks for elaborating, this is very helpful! I still have a few pieces to investigate, but so far my current picture is that an actor — defined as a named process with a mailbox — can be run by some runtimes with varying focus and performance tradeoffs, some with remote (network) messaging and some without. Kompact is an example of an implementation that includes another set of features (the component model), with a choice of ways in which the actor nature interacts with the component nature.

Taking a step back: Future + async/await is the standard Rust way of making something awaitable without blocking the current thread. If there were a similarly general incarnation of the actor model, what would it be? It would need to be as precise and minimalistic as Future, meaning an actor can

  • receive messages
  • create actors
  • manage its own state

I left out “send messages” because that is not a privilege only actors enjoy — we just need to have an ActorRef handle for that purpose. Being “named” does not require an explicit name property, it just means having an identity that is tied to its ActorRef.


So what is the scenario that I’d like to solve?

Let’s say I am in the situation that I know I want to use actors to implement something. So I write an actor (however that shall look) and then I need to arrange for that actor to be evaluated by some runtime support — for example using Kompact, Riker, or actix, depending on whatever is already being used in my project or team.

One core purpose of using an actor is to encapsulate some state, which can be done in three ways:

  1. make a struct and implement some trait to turn it into an actor
  2. pass the state as an argument to an async fn, which is the actor
  3. close over the state in a lambda, which is the actor

In contrast to doing this on the JVM, Rust’s aliasing control and auto traits make this safe in either case. Option 1 has the advantage that the type of the actor can be written down without resorting to generics — this can be very helpful and should be supported. Option 3 would support writing procedural actors like the following from Simon Fowler’s paper:

[image: code listing of a procedural actor from Simon Fowler’s paper]

This would require an explicit receive operation to which the closure needs to have access, presumably via one of its arguments. I note this use-case because actors are not always a loop, a lot of processes terminate after performing a given sequence of steps, and Rust’s async/await syntax makes it rather straight-forward to write this down in the obvious fashion and have the compiler worry about constructing the corresponding state machine under the hood. An actor will also benefit greatly from having access to its self reference, and it will need to be able to create other actors. The latter is particularly interesting because with Rust’s aliasing control it is actually safe and nice to delegate the completion of an asynchronous process to a newly created short-lived actor — instead of forcing this completion’s state to live among its parent actor’s state machines.
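A procedural, non-looping actor of the kind described above might look like the following sketch. It is synchronous (a blocking `recv` stands in for an awaited receive), it does not reproduce the listing from the paper, and `Step`, `session` and `demo_session` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

pub enum Step {
    Login(String),
    Payload(Vec<u8>),
}

/// Procedural actor: no loop, it expects a fixed sequence of messages and
/// terminates afterwards. Receiving is an explicit operation on the mailbox.
pub fn session(mailbox: Receiver<Step>) -> Result<String, String> {
    let user = match mailbox.recv() {
        Ok(Step::Login(name)) => name,
        _ => return Err("expected Login first".into()),
    };
    let size = match mailbox.recv() {
        Ok(Step::Payload(bytes)) => bytes.len(),
        _ => return Err("expected Payload second".into()),
    };
    Ok(format!("{user} sent {size} bytes"))
}

/// Runs one session on its own thread and feeds it the given steps.
pub fn demo_session(steps: Vec<Step>) -> Result<String, String> {
    let (tx, rx) = channel();
    let actor = thread::spawn(move || session(rx));
    for step in steps {
        // The actor may already have terminated, so send errors are ignored.
        let _ = tx.send(step);
    }
    drop(tx);
    actor.join().unwrap()
}
```

With async/await the two `recv` calls would become `.await` points, and the compiler would generate the state machine that has to be written by hand in a handler-based formulation.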

Option 2 would be the async receive function that is called by the surrounding framework whenever there is more work to be done, with the ability to ferry some state from one invocation to the next. This is probably isomorphic to option 1.
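The three ways of encapsulating state can be compared in a small synchronous sketch. The `Actor` trait, `Counter`, `receive` and `make_counter` are hypothetical names chosen for this example, and a real version would be async.

```rust
/// Option 1: a struct plus a (hypothetical) trait.
pub trait Actor<M> {
    fn receive(&mut self, msg: M);
}

pub struct Counter {
    pub total: u64,
}

impl Actor<u64> for Counter {
    fn receive(&mut self, msg: u64) {
        self.total += msg;
    }
}

/// Option 2: the state is passed in and handed back on every invocation.
pub fn receive(total: u64, msg: u64) -> u64 {
    total + msg
}

/// Option 3: the state is captured by a closure.
pub fn make_counter() -> impl FnMut(u64) -> u64 {
    let mut total = 0;
    move |msg| {
        total += msg;
        total
    }
}

/// Feeds the same messages to all three and returns the final totals.
pub fn demo_three_ways() -> (u64, u64, u64) {
    let msgs = [1u64, 2, 3];

    let mut by_struct = Counter { total: 0 };
    for m in msgs {
        by_struct.receive(m);
    }

    // The framework would ferry the state from one call to the next.
    let by_argument = msgs.iter().fold(0, |state, m| receive(state, *m));

    let mut by_closure = make_counter();
    let mut last = 0;
    for m in msgs {
        last = by_closure(m);
    }

    (by_struct.total, by_argument, last)
}
```

Note that `Counter` has a name that can be written down in signatures, while the closure from `make_counter` can only be referred to as `impl FnMut(u64) -> u64`, which is the trade-off mentioned for option 1.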


At this point, my conclusion is that it may be beneficial to design an actor representation that allows the above usage while keeping all execution concerns out of this part of the API. If an actor creates a thread pool onto which it deliberately spawns new actors then that is fine, but an actor should be able to say “I don’t care where this runs, just do it.”

What is not yet fully clear to me is whether the struct + trait approach should also control the receive invocation, or whether it should be limited to “loop actors”.

In any case, the actor code will need to have access to some Context structure that at least provides the self reference and the ability to spawn more actors. I think I’ll need to play around with this to get a better intuition for the nooks and crannies — do you (everyone) think that there is a chance that this will result in something that can be used to “write an actor and run it on X”? For Kompact (or any other library) it would mean adding a function that turns an actor into an ActorRef — which is the perfect place to customise message queues and execution mechanisms.
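One way such a Context and the "turn an actor into an ActorRef" function could fit together is sketched below, with threads standing in for whatever execution mechanism a library provides. `Context`, `ActorRef`, `spawn` and the `Parent` messages are assumptions for illustration only.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical handle for sending messages of type `M`.
pub type ActorRef<M> = Sender<M>;

/// What the actor gets to see: its mailbox and its own reference.
pub struct Context<M> {
    pub mailbox: Receiver<M>,
    pub me: ActorRef<M>,
}

impl<M> Context<M> {
    /// "I don't care where this runs, just do it": here, simply another thread.
    pub fn spawn<N: Send + 'static>(
        &self,
        actor: impl FnOnce(Context<N>) + Send + 'static,
    ) -> ActorRef<N> {
        spawn(actor)
    }
}

/// The runtime-specific part: turns an actor into an `ActorRef`.
pub fn spawn<M: Send + 'static>(
    actor: impl FnOnce(Context<M>) + Send + 'static,
) -> ActorRef<M> {
    let (me, mailbox) = channel();
    let ctx = Context { mailbox, me: me.clone() };
    thread::spawn(move || actor(ctx));
    me
}

pub enum Parent {
    Double(u64, Sender<u64>),
    Done(u64, Sender<u64>),
    Stop,
}

/// Delegates each job to a short-lived child that reports back via `me`.
pub fn parent(ctx: Context<Parent>) {
    for msg in ctx.mailbox.iter() {
        match msg {
            Parent::Double(n, reply) => {
                let me = ctx.me.clone(); // self reference handed to the child
                ctx.spawn(move |_child: Context<()>| {
                    let _ = me.send(Parent::Done(n * 2, reply));
                });
            }
            Parent::Done(result, reply) => {
                let _ = reply.send(result);
            }
            Parent::Stop => break,
        }
    }
}

pub fn demo_context() -> u64 {
    let parent_ref = spawn(parent);
    let (reply_tx, reply_rx) = channel();
    parent_ref.send(Parent::Double(21, reply_tx)).unwrap();
    let answer = reply_rx.recv().unwrap();
    parent_ref.send(Parent::Stop).unwrap();
    answer
}
```

A library such as Kompact would supply its own `spawn`, and that function is the single place where queue type and scheduling are decided; the `parent` function itself does not change.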

huntc commented 3 years ago

Taking a step back: Future + async/await is the standard Rust way of making something awaitable without blocking the current thread. If there were a similarly general incarnation of the actor model, what would it be? It would need to be as precise and minimalistic as Future, meaning an actor can

  • receive messages
  • create actors
  • manage its own state

My recommendation here is to stick with Carl Hewitt's definition of the actor model which I believe is reasonably represented on Wikipedia:

Sticking to the above definition should help avoid discussion on what an actor model's scope is. It is also unsurprisingly close to what you stated. :-)

I left out “send messages” because that is not a privilege only actors enjoy — we just need to have an ActorRef handle for that purpose. Being “named” does not require an explicit name property, it just means having an identity that is tied to its ActorRef.

I think it is still useful to include it in the actor's scope declaration so that we explicitly state its capabilities. We also don't want actors to be prevented from sending messages.

So what is the scenario that I’d like to solve?

Let’s say I am in the situation that I know I want to use actors to implement something. So I write an actor (however that shall look) and then I need to arrange for that actor to be evaluated by some runtime support — for example using Kompact, Riker, or actix, depending on whatever is already being used in my project or team.

This is also my goal with Stage. I want to be able to express actors so that they are able to run on any of the popular async runtimes. Note that although this has always been the goal with Stage, you've now caused me to go back and emphasise this goal in Stage's README. Thanks! :-)

One core purpose of using an actor is to encapsulate some state, which can be done in three ways:

  1. make a struct and implement some trait to turn it into an actor

I think this is quite intuitive to the Rust programmer. Most actor implementations for Rust seem to take this approach also.

  2. pass the state as an argument to an async fn, which is the actor

  3. close over the state in a lambda, which is the actor

Either 2 or 3 seem fine. I'll go for the one that achieves the goal of writing less code. I think that's presently 3, and that's what I've provided with Stage.

At this point, my conclusion is that it may be beneficial to design an actor representation that allows the above usage while keeping all execution concerns out of this part of the API. If an actor creates a thread pool onto which it deliberately spawns new actors then that is fine, but an actor should be able to say “I don’t care where this runs, just do it.”

I agree. By way of structure, I have stage_core with the primary types and traits, and then other crates for explicit runtime support e.g. stage_dispatch_tokio. To this end, I introduced a Dispatcher trait to express the runtime concerns. That should be familiar to you. :-)
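To illustrate the general idea of separating runtime concerns behind a trait, here is a minimal sketch. It is not Stage's actual `Dispatcher` trait, whose signature is not shown in this thread; the trait below and both implementations are hypothetical.

```rust
use std::sync::mpsc::channel;
use std::thread;

/// Hypothetical runtime abstraction: "run this piece of work somewhere".
pub trait Dispatcher {
    fn dispatch(&self, task: Box<dyn FnOnce() + Send>);
}

/// Runs every task on a fresh OS thread.
pub struct ThreadDispatcher;
impl Dispatcher for ThreadDispatcher {
    fn dispatch(&self, task: Box<dyn FnOnce() + Send>) {
        thread::spawn(task);
    }
}

/// Runs every task immediately on the caller's thread (handy for tests).
pub struct InlineDispatcher;
impl Dispatcher for InlineDispatcher {
    fn dispatch(&self, task: Box<dyn FnOnce() + Send>) {
        task();
    }
}

/// Code written against the trait does not know where the work executes.
pub fn run_sum(dispatcher: &dyn Dispatcher, inputs: Vec<u64>) -> u64 {
    let (tx, rx) = channel();
    dispatcher.dispatch(Box::new(move || {
        let _ = tx.send(inputs.iter().sum::<u64>());
    }));
    rx.recv().unwrap()
}
```

A crate per runtime would then only need to provide one more implementation of the trait, leaving `run_sum` and similar actor-side code untouched.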

In any case, the actor code will need to have access to some Context structure that at least provides the self reference and the ability to spawn more actors. I think I’ll need to play around with this to get a better intuition for the nooks and crannies — do you (everyone) think that there is a chance that this will result in something that can be used to “write an actor and run it on X”? For Kompact (or any other library) it would mean adding a function that turns an actor into an ActorRef — which is the perfect place to customise message queues and execution mechanisms.

I think there's a chance we can arrive at a common abstraction for actors on Rust. Looking at the ecosystem to date, I can already see a great deal of commonality. As a developer, and what caused me to start experimenting with Stage, I may be working with a no_std project one day, a tokio-based one on another day, and then an async-std one. Having the ability to express actors that can run anywhere should be useful to me. I'm on the fence as to whether these actor expressions need to be part of Rust's core. I think that's a stretch goal. However, it sure is great to be having this discussion here! Thanks again.

rkuhn commented 3 years ago

It turns out that with my Rust knowledge it is quite a thorny issue to implement a generic actor factory: #2. Improvement suggestions are very welcome!

Such library abstractions would really benefit from higher-rank lifetime bounds or higher-kinded types. With the code in that PR I wouldn’t dare push towards inclusion in std (yes, that’s really a stretch goal, but it is a useful north star), since the types to be implemented by library authors are just too daunting (yes, my example does work nonetheless).

@huntc I implemented API option 3 mostly, since it includes option 1: just close over some more structured state object.

rkuhn commented 3 years ago

Update: see https://github.com/Actyx/acto/pull/2#issuecomment-846605725 for a new idea with types that are a lot simpler, and using macro rules to hide some boilerplate code (so the user won’t have to spell out all the async move gymnastics and return Ok(()) to allow ? to be used inside the actor).
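The kind of boilerplate such a macro could hide is sketched below. This is not the macro from the linked PR: `actor!`, `block_on` and `demo_macro` are hypothetical, and the executor is a minimal stand-in built on the standard library.

```rust
use std::error::Error;
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Hypothetical macro: wraps the body in `async move` and appends `Ok(())`,
/// so that `?` can be used inside the body. Statements must end with `;`.
macro_rules! actor {
    ($($body:tt)*) => {
        async move {
            $($body)*
            Ok::<(), Box<dyn Error>>(())
        }
    };
}

/// Minimal executor: parks the current thread until the future is woken.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        thread::park();
    }
}

/// Parses `input` inside an actor body and adds it to a shared total.
pub fn demo_macro(input: &str) -> Result<u32, String> {
    let total = Arc::new(Mutex::new(0u32));
    let shared = total.clone();
    let fut = actor! {
        let n: u32 = input.parse()?; // `?` works because the macro adds `Ok(())`
        *shared.lock().unwrap() += n;
    };
    block_on(fut).map_err(|e| e.to_string())?;
    let value = *total.lock().unwrap();
    Ok(value)
}
```

Without the macro, the caller would have to write the `async move { ...; Ok::<(), _>(()) }` wrapper by hand and spell out the error type for every actor body.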