Akka Typed: How much Breakage is Desired?

rkuhn commented 8 years ago

akka-actor has accrued many features over the past seven years, and in many ways akka-typed offers the opportunity to start from the ground up: different package names and APIs make a direct automatic translation undesirable anyway, so adaptations and thinking will be required of the users. We should therefore aim to create exactly what we want future users to see, unencumbered by what past and current users have seen so far—akka-actor will not go away anytime soon, so not much harm is done.

With that said, here is a list of features that I think deserve critical justification (and absent that, removal):

reflective creation of mailboxes and dispatchers: there should be just 2 implementations each (bounded/unbounded and CPU/IO)
Routers: mixing concurrent code in the send path with an Actor identity is so troublesome that I don’t think it pulls its weight—we should only offer actor-based routing and non-actor-backed ActorRefs that handle their own thread-safety
deployment scope: in particular its only use—remote deployment—should go away and be replaced by normal messages that ask a remote service to create an actor
generic scheduling of Runnables (i.e. retain only deferred message sending, offered on the ActorContext)
Extensions: they should be Actors that are created under /system and accessible via ActorRefs—with their creation being async
Failure signals: including the Throwable means asking the supervisor to know more about its children than is healthy—a failure is by definition unforeseen and should be acted upon without knowledge of the cause (this is inspired from Ponylang); this change would allow us to offer comprehensive built-in tools for failure handling because only the timely aspects are relevant (i.e. if and when to restart)

Please @akka/akka-team @MichaelPNash and everyone else: let us settle on a plan and then execute it.

This ticket is about the scope only, the approach shall be discussed in the next ticket.

rkuhn commented 8 years ago

Oh, and I should have linked to @knutwalker’s Typed Actors—it would be interesting to explore this from that angle as well.

rkuhn commented 7 years ago

So, any opinions?

johanandren commented 7 years ago

Spontaneous reaction: I agree with the first 5 bullets, the last one not so much, I understand the reasoning but it has so many practical applications in the cases where you are ok with tight coupling of specific supervisors and their children. We could still provide generic built in tools for handling failures (anything NonFatal for example).

patriknw commented 7 years ago

reflective creation of mailboxes and dispatchers: there should be just 2 implementations each (bounded/unbounded and CPU/IO)

The old approach is heavily over-engineered. Especially mailbox selection is crazy. Some love it because of its configuration flexibility, but I don't think it's worth it. We have said that dispatchers and mailboxes must be defined in configuration because of serialization when doing remote deployment, but with that argument goes away when not supporting remote deployment.

I agree that we should simplify.

Routers: mixing concurrent code in the send path with an Actor identity is so troublesome that I don’t think it pulls its weight—we should only offer actor-based routing and non-actor-backed ActorRefs that handle their own thread-safety

I agree

deployment scope: in particular its only use—remote deployment—should go away and be replaced by normal messages that ask a remote service to create an actor

I totally agree

generic scheduling of Runnables (i.e. retain only deferred message sending, offered on the ActorContext)

I agree. I think we should provide scheduling facilities similar to FSM and GraphStage as add-on.

Extensions: they should be Actors that are created under /system and accessible via ActorRefs—with their creation being async

I'm not convinced that this will work out. We have used extensions successfully in many places, where API based on async message passing would be inconvenient.

Failure signals: including the Throwable means asking the supervisor to know more about its children than is healthy—a failure is by definition unforeseen and should be acted upon without knowledge of the cause (this is inspired from Ponylang); this change would allow us to offer comprehensive built-in tools for failure handling because only the timely aspects are relevant (i.e. if and when to restart)

I'm not sure. Is the proposal to let the actor itself handle all known exceptions? Would it also have the possibility to restart itself?

It's pretty clear that a failed input validation should not involve supervision, but what about an actor that may throw 1) DatabaseConnectionBrokenException (typically handled as a restart, potentially with a backoff) 2) unforeseen NPE (typically handled as stop)

drewhk commented 7 years ago

reflective creation of mailboxes and dispatchers: there should be just 2 implementations each (bounded/unbounded and CPU/IO)

Well, I am not sure one size fits all (i.e. streams might be served better with something different than most actors), but I agree that reflective creation should go. I only want to keep the door open for adding more options, but strictly from our side (i.e. no external implementations).

Routers: mixing concurrent code in the send path with an Actor identity is so troublesome that I don’t think it pulls its weight—we should only offer actor-based routing and non-actor-backed ActorRefs that handle their own thread-safety

+1

deployment scope: in particular its only use—remote deployment—should go away and be replaced by normal messages that ask a remote service to create an actor

+100

generic scheduling of Runnables (i.e. retain only deferred message sending, offered on the ActorContext)

I am neutral here. I don't see that much offense with Runnables, but I can live without them, too.

Extensions: they should be Actors that are created under /system and accessible via ActorRefs—with their creation being async

I am not sure about this. What is the gain here?

Failure signals: including the Throwable means asking the supervisor to know more about its children than is healthy—a failure is by definition unforeseen and should be acted upon without knowledge of the cause (this is inspired from Ponylang); this change would allow us to offer comprehensive built-in tools for failure handling because only the timely aspects are relevant (i.e. if and when to restart)

If I understood correctly, this change will basically make supervision a more fine-grained tool instead:

composable behaviors for local failure handling (no internal support is necessary for this)
supervision for the cases which are unforseen and needs to wipe out all state and do a clean restart (or pull the plug and stop)

It's pretty clear that a failed input validation should not involve supervision, but what about an actor that may throw 1) DatabaseConnectionBrokenException (typically handled as a restart, potentially with a backoff) 2) unforeseen NPE (typically handled as stop)

If I understood @rkuhn properly, then 1) will be handled by the much richer behavior support. The current supervision scheme is too limiting when it comes to backoff, failover or more "educated" failure scenarios. Especially wiping out all state can be too limiting (see streams supervision and backpressure state for example). OTOH 2) would cover the last resort option.

I am not sure here, so please @rkuhn elaborate :)

rkuhn commented 7 years ago

Okay, summing up: the first five points seem rather uncontentious, with reservations on Endre’s part concerning the extensibility of message queues (I’d love to hakk on that with you sometime—when are you in Germany?) and Patrik’s part concerning the asynchronous extensions mechanism (which I have not yet worked on at all).

Concerning the last point, this was meant to be somewhat provocative (and nevertheless it took a bit of prodding to get the discussion going ;-) ). What I have implemented in akka/akka#21128 is that failure is signaled with the Terminated message, but in a secondary argument list, transmitting either a Throwable or null. Failures are only ever sent to the parent, other watchers won’t see them. Making this a binary value wouldn’t change much in terms of internal complexity, so we might as well keep it this way.

It is important, though, that we keep validation error and the likes out of the supervision strategy, and that is achieved by not offering the ability to restart from within the supervisor any longer. This, BTW, shaved off 90% of the complexity of ActorCell because the suspend/resume mechanics are no longer needed.

In terms of Patrik’s two example cases, scenario 1 would indeed be handled by the Restarter behavior within the child actor, whereas scenario 2 would be an example of failures that are not handled by the Restarter and thereby kill the actor.

patriknw commented 7 years ago

I might need to see an example of how to use the Restarter behavior to be convinced. I would like to see something better than sprinking try-catch all over the place.

Restarting while maintaining stable ActorRef is one of the core features of Akka parental supervision. If we remove that feature what is really left of the supervision feature?

My +1 to extensibility of message queues, but not as crazy as in old. Perhaps setting the implementation in the Props would be sufficient?

drewhk commented 7 years ago

I might need to see an example of how to use the Restarter behavior to be convinced. I would like to see something better than sprinking try-catch all over the place.

An example would be nice indeed. As far as I imagine though, it could look rather similar to how it works now, but instead of defining a strategy at the parent, you would wrap the behavior to get a supervised behavior. An example from @rkuhn would help though.

Restarting while maintaining stable ActorRef is one of the core features of Akka parental supervision. If we remove that feature what is really left of the supervision feature?

This will stay as it is. The only difference is that the supervision logic would move from the parent right into the child actor. This removes the need for the complex parent-child protocol with suspension and resume. It also opens up the possibility for more "educated" composable failure behaviors, like circuit-breaker like things, soft and hard restarts, etc.

My +1 to extensibility of message queues, but not as crazy as in old. Perhaps setting the implementation in the Props would be sufficient?

Well, all I wanted here is just to keep the door open for us for future improvements. But yes, in general we should get rid of the unnecessary complexity that we have right now.

rkuhn commented 7 years ago

I have updated the PR to include a very simple Restarter (which is powerful enough to support all previously existing test cases), see the test and the implementation.

Concerning the extensions proposal I spent some thought as well, and my current understanding is that meaningful operations will most likely be asynchronous by their nature, e.g. initiating cluster actions or disseminating CRDT updates. Therefore it makes sense to expose a typed ActorRef interface for them, this should not be limiting. What would be slightly inconvenient, though, is to deal with asynchronously created ActorRefs (because there is no synchronous actorOf anymore, of course). This can be tackled in two ways:

add the capability of “flattening” a Future[ActorRef[T]] into an ActorRef[T]
provide and recommend suitable child actor behaviors that do the same thing

I tried out the former approach, see this implementation. The latter can be implemented trivially such that usage becomes:

val fwd = ctx.spawnAnonymous(forwarder[MyCommand](serviceLocator, GetReference(MyCommandKey, _)))
fwd ! MyCommand("do it!") // will be queued until ref is known

I have no strong preference, but I think either solution is suitable to allow extensions to be expressed as typed actors.

patriknw commented 7 years ago

@rkuhn Example and code says more than 1000 words. I'm convinced. It achieves what I'm looking for, which is separated error handling from the business behavior. Where it lives in runtime is irrelevant.

“flattening” a Future[ActorRef[T]] into an ActorRef[T] sounds like a very useful thing, in general.

Regarding extensions, how would you implement for example the Serialization extension with methods such as def serialize(o: AnyRef): Try[Array[Byte]]?

rkuhn commented 7 years ago

To me an extension is an active piece that extends the functionality of the ActorSystem. If serialization is regarded as a service that needs management in terms of resource consumption or state, then it should be an extension (and be asynchronous etc. to allow these aspects to be managed). If it is just a function that transforms X into Y, driven by some configuration, then a Serializer should be something that can easily and locally be created from a piece of config.

patriknw commented 7 years ago

ok, but one of the main usages of extensions is "singleton instance per actor system", and easy access to it from anywhere you have an actor system. E.g. Serialization has important "caching". We can leave current extensions for this purpose and let the ones that fall into your classification be represented as typed actors.

rkuhn commented 7 years ago

Is the scope for this “important caching” really the ActorSystem? Why is it not the remote transport or the JVM or the classloader?

patriknw commented 7 years ago

I don't think I have to explain why it's not jvm or classloader.

serialization can be accessed from anywhere, remoting, persistence, also user code like custom serializers, it's simply shared within the scope of the actor system

drewhk commented 7 years ago

It might also use ActorSystem resources, for example the Scheduler or ExecutionContext.

rkuhn commented 7 years ago

Okay: the purpose of asking these questions is to arrive at the essence of what it is that we need. With the actor-based approach we already have singletons (per system) that have a defined lifecycle, resource allocation, etc. What you are asking for is something with basically the same qualities, but with synchronous API. This does imply concurrency control, coordination, decoupling, … which makes me wonder why an Actor is not the better choice. My intuition still is that there are two cases:

“heavy” (stateful, resource hungry) well-encapsulated asynchronous singleton services
light-weight stateless functions

Is there really a third class? And does it need a different API? Crazy thought: offer the serializer behind a synchronous ActorRef—assuming that it is actually non-blocking and fast enough for this purpose, since otherwise I’d argue that it must be a proper actor anyway.

patriknw commented 7 years ago

I'm not convinced that it's a good idea to remove extensions, but we can start without them and see how it turns out. Then some things that are performance critical will have to be held by the ActorSystem itself, such as Serialization.

rkuhn commented 7 years ago

Yes, that might be a reasonable approach as well: since we know that certain things (like Serialization) are needed, they are not really extensions but rather essential parts. At least we can now look back on an experience of seven years when making that educated guess ;-)

rkuhn commented 7 years ago

But before we go there we should be sure about the API, I remember there being an open point concerning streaming serializers.

patriknw commented 7 years ago

Streaming serializers could be an additional thing. Not needed for remote messages and persistent events. Could be needed for persistent snapshots. In Artery we have introduced optional/compatible serialization with ByteBuffer and also made the api for that allocation free by not using Try in the signature.

I think breaking changes should go in Akka 3.0 and doesn't have to be done in first Akka Typed, which is happening before 3.0.

drewhk commented 7 years ago

Yeah, streaming serializers are only needed when large entities are involved, but a remoting or persistent events should not be that large. For large message transport/snapshotting we could provide alternate APIs that fit better that usage.

akka / akka-meta

Akka Typed: How much Breakage is Desired? #18