[Discussion] Add a way to wait/block the main thread on `ActorSystem`

shekohex commented 5 years ago

Currently there is no clear way to wait for the actor system to complete: creating the actor system does not block, so the main thread simply exits.

After discussing this issue with @leenozara, @riemass, and others, we came up with these proposals:

1 - Add a method to the ActorSystem, like .wait(), that returns the rx of a oneshot channel (which acts like a future). Calling block_on(sys.wait()) then waits until this future resolves, and it only resolves on the system shutdown event.

2 - Similar to the previous proposal, but @leenozara suggested naming it when_terminated(), like Akka's when_terminated; it would return Terminated (a type alias for Receiver<()> for now). I started to build on this idea by adding a way to get notified in other places in the application when the system is terminated. Also inspired by Akka's ActorSystem, sys.register_on_termination(fn) would register a callback to run after the termination event/message has been issued and all actors in this actor system have been stopped. Multiple callbacks could be added by calling this method multiple times. Note that the ActorSystem would not terminate until all the registered callbacks have finished (which makes a callback a good place to put a tx that sends you a notification when the system is terminated).

3 - Added by @leenozara and @riemass: we have channels that are PubSub, and we have ask, which is a temporary actor bridging the actor space and the main thread space as a future. One idea could be to combine the two: for example, .when_terminated() creates an ask that subscribes to some SystemTerminatedChannel. When shutdown happens, all the subscribed temporary ask actors fulfill their futures.
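For illustration, the callback half of proposal 2 could be sketched roughly as below. This is only a sketch under assumptions: the `ActorSystem` here is a hypothetical stand-in and not Riker's real type, `shutdown` and `demo_callbacks` are made-up names, and std's `mpsc` channel plays the role of the tx mentioned above.

```rust
use std::sync::{mpsc, Mutex};

type Callback = Box<dyn FnOnce() + Send>;

/// Hypothetical stand-in for Riker's `ActorSystem`; only termination hooks are modeled.
pub struct ActorSystem {
    callbacks: Mutex<Vec<Callback>>,
}

impl ActorSystem {
    pub fn new() -> Self {
        ActorSystem { callbacks: Mutex::new(Vec::new()) }
    }

    /// Registers a callback to run after termination; may be called many times.
    pub fn register_on_termination<F>(&self, f: F)
    where
        F: FnOnce() + Send + 'static,
    {
        self.callbacks.lock().unwrap().push(Box::new(f));
    }

    /// Runs every registered callback and returns only once all have finished.
    pub fn shutdown(&self) {
        // A real system would stop all actors here, before the callbacks run.
        let callbacks = std::mem::take(&mut *self.callbacks.lock().unwrap());
        for callback in callbacks {
            callback();
        }
    }
}

/// Registers two callbacks that each report through a channel.
pub fn demo_callbacks() -> Vec<&'static str> {
    let sys = ActorSystem::new();
    let (tx, rx) = mpsc::channel();
    let tx2 = tx.clone();
    sys.register_on_termination(move || tx.send("first").unwrap());
    sys.register_on_termination(move || tx2.send("second").unwrap());
    sys.shutdown();
    rx.try_iter().collect()
}
```

Callbacks run in registration order in this sketch; whether that ordering should be guaranteed is a design question the proposal leaves open.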

Any other ideas/proposals?

leenozara commented 5 years ago

Hi @shekohex, thanks for writing this up. I would advocate for starting small to solve the immediate problem, which seems to be that in smaller projects, where only Riker is being used and no other services, it would be convenient to block the main thread using the actor system.

The first option looks like it would work well. sys.when_terminated() could return a oneshot::Receiver<()>, maybe behind a type alias Terminated. The only drawback is that it limits use to one place, since there's only a single Receiver. However, this could be addressed easily in a next iteration and without any changes to the API.

I prefer this approach since it's a simple start and allows us to improve without any API changes. What are your thoughts? If we took this approach, how would we implement it? Maybe when the system starts it creates the Sender/Receiver pair and stores them as Options on the struct. Then, when when_terminated() is used, you do a self.shutdown_rx.take(). Something like that?
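A rough sketch of that idea, to make it concrete. Assumptions: std's `mpsc` channel stands in for a oneshot channel so the example needs no extra crates, the `ActorSystem` struct is a hypothetical stand-in, and `shutdown_tx`, `shutdown` and `demo_when_terminated` are illustrative names (only `shutdown_rx` and `when_terminated()` come from the discussion).

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// The alias suggested in the thread, here over std's channel instead of a oneshot.
pub type Terminated = Receiver<()>;

/// Hypothetical stand-in for the actor system.
pub struct ActorSystem {
    shutdown_tx: Option<Sender<()>>,
    shutdown_rx: Option<Terminated>,
}

impl ActorSystem {
    /// Creates the Sender/Receiver pair up front and stores both as Options.
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel();
        ActorSystem { shutdown_tx: Some(tx), shutdown_rx: Some(rx) }
    }

    /// Hands out the single receiver; any later call gets `None`.
    pub fn when_terminated(&mut self) -> Option<Terminated> {
        self.shutdown_rx.take()
    }

    /// Signals termination at most once.
    pub fn shutdown(&mut self) {
        if let Some(tx) = self.shutdown_tx.take() {
            let _ = tx.send(());
        }
    }
}

/// Blocks the calling thread until another thread shuts the system down.
pub fn demo_when_terminated() -> (bool, bool) {
    let mut sys = ActorSystem::new();
    let terminated = sys.when_terminated().expect("first call yields the receiver");
    let second_call_is_none = sys.when_terminated().is_none();
    let worker = thread::spawn(move || sys.shutdown());
    let signalled = terminated.recv().is_ok(); // blocks like block_on(sys.wait())
    worker.join().unwrap();
    (signalled, second_call_is_none)
}
```

Returning an `Option` makes the single-receiver limitation visible to the caller; a version that hides it would need the multi-consumer change discussed for the next iteration.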

shekohex commented 5 years ago

I agree this proposal is the simplest and easiest to implement.

> The only drawback with this is that it limits use to one place since there's only a single Receiver.

Yeah, but I think that could also be solved by sys.register_on_termination(fn). As you mentioned, though, in the next iteration we can improve this by using the spmc pattern instead of spsc.
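One way the multi-consumer variant might look, sketched with std only. Note this is a fan-out over one channel per waiter rather than a true spmc channel, and the `ActorSystem` stand-in, the `waiters` field and `demo_many_waiters` are all assumptions made for illustration.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical stand-in that can notify any number of waiters.
pub struct ActorSystem {
    waiters: Mutex<Vec<Sender<()>>>,
}

impl ActorSystem {
    pub fn new() -> Self {
        ActorSystem { waiters: Mutex::new(Vec::new()) }
    }

    /// Every call gets its own receiver, so the API shape stays the same.
    pub fn when_terminated(&self) -> Receiver<()> {
        let (tx, rx) = mpsc::channel();
        self.waiters.lock().unwrap().push(tx);
        rx
    }

    /// Notifies every waiter registered so far.
    pub fn shutdown(&self) {
        for tx in self.waiters.lock().unwrap().drain(..) {
            let _ = tx.send(()); // a waiter that dropped its receiver is ignored
        }
    }
}

/// Counts how many of three independent waiters were signalled.
pub fn demo_many_waiters() -> usize {
    let sys = ActorSystem::new();
    let receivers: Vec<_> = (0..3).map(|_| sys.when_terminated()).collect();
    sys.shutdown();
    receivers.iter().filter(|rx| rx.recv().is_ok()).count()
}
```

In this sketch a receiver requested after `shutdown()` has already run is never signalled, so a real implementation would also need to remember that termination has happened.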

> how would we implement? Maybe when the system starts it creates the Sender/Receiver pair and store as Options on the struct. Then when when_terminated() is used you do a self.shutdown_rx.take(). Something like that?

Yeah, that's how I would implement it.

By the way, I got another idea: we could offer low-level access where the user provides some Sender (or emitter) that we fire when the system is terminated; that way the user is free to use whatever they want. Maybe we need to introduce a Terminator trait with a terminate method that returns a Future that completes when the system is going to be terminated :smile: :+1:

riemass commented 5 years ago

The receiver being one-shot could actually be a benefit. Keep simple things simple, at least for now. How about implementing Drop for ActorSystem so that it awaits the full system shutdown? The public API would allow for detaching the threads, similar to how a JoinHandle works for threads. The user could:

- fully detach the system, which is roughly what happens now,
- block until system shutdown is called somewhere in the system,
- wait on the receiver Future/Channel and implement their own shutdown behavior.
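The first two options could be sketched like this. It is a minimal model under assumptions: the system is reduced to a single background thread, and `ActorSystem::start`, `detach` and `demo_drop_blocks` are hypothetical names, not existing Riker API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in: the whole system is modeled as one worker thread.
pub struct ActorSystem {
    worker: Option<JoinHandle<()>>,
}

impl ActorSystem {
    pub fn start<F>(work: F) -> Self
    where
        F: FnOnce() + Send + 'static,
    {
        ActorSystem { worker: Some(thread::spawn(work)) }
    }

    /// Gives up the handle so that dropping the system no longer blocks.
    pub fn detach(mut self) {
        drop(self.worker.take()); // dropping a JoinHandle detaches the thread
    }
}

impl Drop for ActorSystem {
    /// Blocks until the worker has finished, unless the system was detached.
    fn drop(&mut self) {
        if let Some(handle) = self.worker.take() {
            let _ = handle.join();
        }
    }
}

/// Shows that leaving the scope waits for the worker to finish.
pub fn demo_drop_blocks() -> bool {
    let done = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&done);
    {
        let _sys = ActorSystem::start(move || {
            thread::sleep(Duration::from_millis(20));
            flag.store(true, Ordering::SeqCst);
        });
    } // `_sys` is dropped here, which joins the worker
    done.load(Ordering::SeqCst)
}
```

One trade-off of blocking in `Drop` is that a system that never shuts down would hang the owning thread at scope exit, which is why an explicit `detach` matters.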

anacrolix commented 4 years ago

Something like 2 in https://gitter.im/riker-rs/Lobby?at=5efb9669405be935cdd12e47.

> 2) Wait until the user root has no child actors left ... make sure that every actor stops once it has finished its work and wait for the child count of the root actor to reach 0. But for this to work you have to use Strategy::Stop as supervision strategy and all direct children of the user root must live until your program has finished all its work.

I'm not sure how the specifics of that work, but for my use, I'm trying to emulate Pony's exiting when there are no more messages remaining to be sent in the system (i.e. nothing can happen).
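As a rough idea of the "wait for the child count to reach 0" part, here is a std-only sketch. `ChildTracker` and its methods are hypothetical and do not correspond to anything in Riker; plain threads stand in for child actors.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical tracker for the direct children of the user root.
#[derive(Default)]
pub struct ChildTracker {
    count: Mutex<usize>,
    all_stopped: Condvar,
}

impl ChildTracker {
    pub fn child_started(&self) {
        *self.count.lock().unwrap() += 1;
    }

    pub fn child_stopped(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.all_stopped.notify_all();
        }
    }

    /// Blocks the caller until no children are left.
    pub fn wait_until_empty(&self) {
        let mut count = self.count.lock().unwrap();
        while *count > 0 {
            count = self.all_stopped.wait(count).unwrap();
        }
    }

    pub fn remaining(&self) -> usize {
        *self.count.lock().unwrap()
    }
}

/// Starts four "children" and waits on the main thread until all have stopped.
pub fn demo_wait_for_children() -> usize {
    let tracker = Arc::new(ChildTracker::default());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            tracker.child_started(); // counted before the child begins running
            let t = Arc::clone(&tracker);
            thread::spawn(move || t.child_stopped())
        })
        .collect();
    tracker.wait_until_empty();
    for handle in handles {
        handle.join().unwrap();
    }
    tracker.remaining()
}
```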

hardliner66 commented 4 years ago

@anacrolix Section 2.6.7 of the paper A String of Ponies describes how Pony terminates. The paper mentions that Pony has language-level support for knowing whether an actor will send further messages. Pony can do this because it is a language, and the compiler can help determine how the system should behave. Also, Pony has no multithreading primitive other than actors.

Riker, on the other hand, is a library, and depending on the use case you might not want to terminate as soon as all messages are delivered.

That being said, I still think it could be a good idea to signal to the user when none of the actors are currently running and there are no messages in any mailbox, except maybe deadletter. The con of this method is that we need to track how many actors are currently running and how many messages are yet to be delivered.
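To give a feel for the bookkeeping involved, a minimal sketch follows. The `Activity` type and its methods are hypothetical; both counters sit behind one mutex so that the idle check sees a consistent snapshot, which is an assumption of this sketch and not something Riker does.

```rust
use std::sync::Mutex;

#[derive(Default)]
struct Counts {
    running: usize, // actors currently handling a message
    pending: usize, // messages enqueued but not yet handled
}

/// Hypothetical tracker for "is anything still happening?".
#[derive(Default)]
pub struct Activity {
    counts: Mutex<Counts>,
}

impl Activity {
    pub fn message_sent(&self) {
        self.counts.lock().unwrap().pending += 1;
    }

    pub fn handling_started(&self) {
        let mut counts = self.counts.lock().unwrap();
        counts.pending -= 1;
        counts.running += 1;
    }

    pub fn handling_finished(&self) {
        self.counts.lock().unwrap().running -= 1;
    }

    /// True only when no actor is running and no message is waiting.
    pub fn is_idle(&self) -> bool {
        let counts = self.counts.lock().unwrap();
        counts.running == 0 && counts.pending == 0
    }
}

/// Records the idle state at each stage of delivering one message.
pub fn demo_idle_detection() -> Vec<bool> {
    let activity = Activity::default();
    let mut seen = vec![activity.is_idle()];
    activity.message_sent();
    seen.push(activity.is_idle());
    activity.handling_started();
    seen.push(activity.is_idle());
    activity.handling_finished();
    seen.push(activity.is_idle());
    seen
}
```

The cost mentioned above shows up here as a lock taken on every send and every handled message.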

P.S.: The method you quoted will not be an emulation of Pony's behaviour. With this method, Riker shuts down as soon as all the direct children of the user root are terminated. Depending on your architecture, that might not be what you want.

This method works best if actor termination follows a LIFO order. This means that if your spawn order is A -> B -> C -> D, your shutdown order needs to be D -> C -> B -> A.
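The ordering rule amounts to treating spawned actors as a stack. A tiny sketch, where `SpawnStack` is a hypothetical helper and actors are represented only by their names:

```rust
/// Hypothetical registry that remembers the order actors were spawned in.
#[derive(Default)]
pub struct SpawnStack {
    spawned: Vec<String>,
}

impl SpawnStack {
    pub fn spawn(&mut self, name: &str) {
        self.spawned.push(name.to_string());
    }

    /// Stops actors newest-first and returns the order that was used.
    pub fn stop_all(&mut self) -> Vec<String> {
        let mut stopped = Vec::new();
        while let Some(name) = self.spawned.pop() {
            // A real system would stop the actor and wait for it here.
            stopped.push(name);
        }
        stopped
    }
}

/// Spawns A, B, C, D and reports the resulting shutdown order.
pub fn demo_lifo_shutdown() -> Vec<String> {
    let mut stack = SpawnStack::default();
    for name in ["A", "B", "C", "D"] {
        stack.spawn(name);
    }
    stack.stop_all()
}
```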

riker-rs / riker

[Discussion] Add a way to wait/block the main thread on `ActorSystem` #58