akka / akka-meta

This repository is dedicated to high-level feature discussions and for persisting design decisions.
Apache License 2.0

How to compose akka-typed actors with akka-persistence? #7

Open Horusiath opened 9 years ago

Horusiath commented 9 years ago

Right now we are able to define an actor as a single function behavior, using akka-typed (JVM) or the F# API (.NET). However, this doesn't fit well with the existing persistent actor design, for various reasons, mostly concerning the actor's state. A few issues come to mind:

So my question to Typesafe guys - what are your thoughts about possible integration of akka typed and persistence in the future?

rkuhn commented 9 years ago

Recently I started experimenting with capturing “actor effects”, which is rather similar to defining a process algebra for our Actors, and «persist» was indeed one of the core effects I identified. Modeling it this way makes for a quite natural formulation: «persist» returns an Action that will loop the given event through the Journal and expose it as its output value, meaning that this will need to be composed with follow-up effects that describe the future behavior of this Actor. This scheme has the added benefit of avoiding the ambiguity with persist/persistAsync.

The more interesting question then becomes how to recover? My current design foresees another core effect for this purpose that takes the description of a folding process (initial value and state aggregation function) and returns the final state value from which the Actor can then take the next steps, typically by turning it into a Behavior (which just means describing the next steps using other actor effects).

The drawback of all this is that it will be rather allocation-heavy and the syntax is also not quite as nice as I’d like it to be—monads are not trivial to express in Scala and even more so in Java—therefore consider this as an intermediate snapshot of my thought process rather than a final result.
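The two core effects described above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the design rkuhn was working on: the journal is an in-memory `Vector`, and `Action`, `pure`, `persist` and `recover` are hypothetical names chosen for illustration.

```scala
object PersistEffectSketch {
  // Hypothetical effect type: a description of a step that reads and extends
  // a journal of events E and produces a value A. Not an Akka API.
  final case class Action[E, A](run: Vector[E] => (Vector[E], A)) {
    // Compose with a follow-up effect that depends on this effect's output.
    def flatMap[B](f: A => Action[E, B]): Action[E, B] =
      Action[E, B] { journal =>
        val (next, a) = run(journal)
        f(a).run(next)
      }
    def map[B](f: A => B): Action[E, B] =
      flatMap[B](a => Action.pure[E, B](f(a)))
  }

  object Action {
    def pure[E, A](a: A): Action[E, A] =
      Action[E, A](journal => (journal, a))
    // Loops the event through the (in-memory) journal and exposes it as output.
    def persist[E](event: E): Action[E, E] =
      Action[E, E](journal => (journal :+ event, event))
    // Folds the journaled events into a state, starting from `zero`.
    def recover[E, S](zero: S)(step: (S, E) => S): Action[E, S] =
      Action[E, S](journal => (journal, journal.foldLeft(zero)(step)))
  }

  sealed trait CounterEvent
  case object Incremented extends CounterEvent

  // Persist two events, then recover the count by folding over the journal.
  val program: Action[CounterEvent, Int] =
    for {
      _     <- Action.persist[CounterEvent](Incremented)
      _     <- Action.persist[CounterEvent](Incremented)
      count <- Action.recover[CounterEvent, Int](0)((n, _) => n + 1)
    } yield count
}
```

Running `PersistEffectSketch.program.run(Vector.empty)` returns the final journal together with the recovered count. Every `flatMap` allocates a new `Action`, which illustrates the allocation concern mentioned above.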

notxcain commented 9 years ago

@rkuhn did you evolve your vision on this topic somehow?

rkuhn commented 9 years ago

Unfortunately my time has been more than just a bit limited lately; I am tinkering with this but have not had a break-through yet. If anyone else has ideas or even PoCs please share!

durban commented 7 years ago

@rkuhn We have a proof of concept for this at https://github.com/nokia/akka-typed-persistence.

rkuhn commented 7 years ago

Cool, will take a look over the weekend!

ktoso commented 7 years ago

Very interesting, I will too (I'm very interested from the persistence side of things :-)). Thanks for sharing!

rkuhn commented 7 years ago

@durban After finishing my thought processes on the Actor process DSL (see the example on a branch without an implementation at this point) I took a look at akka-typed-persistence, and it is pretty cool and matches lots of my current work. I have not yet had the time to think it through, but merging the two algebras might actually make sense.

In terms of implementation I prefer to roll my own Free; I don’t like the boilerplate that it brings, which probably also has runtime overhead in terms of excessive allocations, but more importantly we must ensure that .map actually flattens if the result is a process because otherwise it becomes impossible to write infinitely looping processes without a memory leak (at least when using for-comprehensions).

dispalt commented 7 years ago

On the algebra allocation side, it might be interesting to look at the finally tagless style, which helps with allocations. You can look at what @S11001001 has done with it in this article: https://failex.blogspot.com/2016/12/tagless-final-effects-la-ermine-writers.html

durban commented 7 years ago

@rkuhn I'm glad if our work can be of some use. I'll take a look at the Actor process DSL when I'm back from vacation. Regarding Free: it's really just an implementation detail; what you say makes sense, I'm absolutely not opposed to a custom monad. Frankly, Free was just the simplest way for us to get it working, I'm open to better solutions.

@dispalt Thanks, tagless final is definitely on my list of things to look at.

rkuhn commented 7 years ago

@durban I have pushed a commit that adds event sourcing effects to the process algebra, please take a look. The most significant difference to nokia/akka-typed-persistence is that in order to retain full compositionality the state itself must also be compositional. Avoiding type madness and lenses means offering a (strongly typed) state management facility with multiple slots. How the resulting events will be split into persistenceIds or tags is not yet clear; the same goes for how to make the state (or parts of it) persistent in the first place. One possible idea is to add another effect that requests the data for a certain key to be persistent, including specifying persistenceId or tags. The state management facilities are useful without persistence as well.

durban commented 7 years ago

@rkuhn I looked at the process DSL and especially the event sourcing operations. Not all of the things are clear to me yet, but I've tried to write an example: https://gist.github.com/durban/5034b9e3b27a6e08aa2b90f2943218dc This is how I imagine I'd use the API to write a persistent counter which can be incremented/decremented and also takes snapshots. Is this similar to how it's intended to be used?

A few more questions:

rkuhn commented 7 years ago

Yes, that sample is spot on.

When talking about compositionality I mean that the behavior of an Actor can be composed from small behavior snippets that run sequentially or concurrently (but not in parallel). The && and || combinators in the ScalaDSL object are not really suitable for this purpose, having to funnel all inputs through a single ActorRef is just too limiting. This is why with the process DSL every process has its own ActorRef (in addition to the one main ActorRef[ActorCmd[T]] for the whole actor).

Packaging the update logic with the storage key (which does select the slot) is the best I could come up with, the state is a projection that is specific to the actor processes and therefore belongs with the actor and not with the event types. This also allows reading the same events from multiple processes for different purposes (each using their own key for local handling).
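A rough sketch of what "packaging the update logic with the storage key" could look like. All names here (`StateKey`, `StateStore`, `CountKey`, `HistoryKey`) are hypothetical and are not taken from the process DSL branch; the store is a plain immutable map with one slot per key.

```scala
object StateSlotsSketch {
  // Hypothetical key: selects a slot and carries its initial value and update logic.
  abstract class StateKey[E, S](val initial: S) {
    def update(state: S, event: E): S
  }

  // Immutable store with one strongly typed slot per key.
  final case class StateStore(slots: Map[Any, Any] = Map.empty) {
    def get[E, S](key: StateKey[E, S]): S = slots.get(key) match {
      case Some(s) => s.asInstanceOf[S] // safe as long as slots are only written via applyEvent
      case None    => key.initial
    }
    def applyEvent[E, S](key: StateKey[E, S], event: E): StateStore =
      StateStore(slots.updated(key, key.update(get(key), event)))
  }

  sealed trait CounterEvent
  case object Incremented extends CounterEvent
  case object Decremented extends CounterEvent

  // Two keys that read the same events for different purposes.
  object CountKey extends StateKey[CounterEvent, Int](0) {
    def update(state: Int, event: CounterEvent): Int = event match {
      case Incremented => state + 1
      case Decremented => state - 1
    }
  }
  object HistoryKey extends StateKey[CounterEvent, List[CounterEvent]](Nil) {
    def update(state: List[CounterEvent], event: CounterEvent): List[CounterEvent] =
      event :: state
  }

  val events: List[CounterEvent] = List(Incremented, Incremented, Decremented)
  val store: StateStore = events.foldLeft(StateStore()) { (s, e) =>
    s.applyEvent(CountKey, e).applyEvent(HistoryKey, e)
  }
}
```

Here `store.get(CountKey)` yields the running count while `store.get(HistoryKey)` yields the events seen so far (most recent first), each derived from the same event stream through its own key.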

fork does indeed create a FunctionRef which contains a bounded queue; this queue is then enqueued as a message to the actor which runs the interpreter for the Operation ASTs. Using this scheme allows all messaging to be allocation-free (assuming small bounded queues that are preallocated).

durban commented 7 years ago

Thanks, I think I'm starting to understand what compositionality means in this case. I'm also looking at akka/akka#22087, the interpreter there is really interesting.

patriknw commented 7 years ago

Here is our first stab at the persistence api for Akka Typed (without process dsl): https://github.com/akka/akka/pull/23674

notxcain commented 7 years ago

Hi! I do understand that throwing exceptions is kind of natural for Actors. But don't you think that this could be more pure and total if the result were wrapped in an Option-like structure with a more suitable name, representing the cases of successful and impossible folds? Then the underlying actor itself would throw something like IllegalStateException with information about which state and event caused it.

patriknw commented 7 years ago

Validation of the command, and of whether the event to persist makes sense, should be done before persisting, and should typically result in a reply message back to the sender saying that it was invalid. That line is just a precaution in case application code is not implemented in that way, to avoid storing events that will later not be possible to replay.

notxcain commented 7 years ago

@patriknw I understand that, my concern is that it requires user code to throw an exception rather than returning something representing an error

raboof commented 7 years ago

@notxcain hmm, can you give an example of a situation where it would be acceptable for applyEvent to throw (c.q. return an error)? Typically you'd want to do any validation before constructing the effect that persists the event, so that afterwards you can be confident it will not fail.

What should happen in such error cases? I'd say it's still a crash you'd want to leave to the supervisor, right?

notxcain commented 7 years ago

@raboof The need emerges naturally when you start to follow your types. Let's say we have a Door entity with two possible states, Closed and Open, and two DoorEvents corresponding to the state transitions, DoorOpened and DoorClosed. So we need an update function.

def update(door: Door, event: DoorEvent): Door = (door, event) match {
  case (Closed, DoorOpened) => Open
  case (Open, DoorClosed) => Closed
  case (Closed, DoorClosed) => ??? // This is impossible, what to return here?
  case (Open, DoorOpened) => ???
}

It is impossible to provide a result for all possible combinations of function arguments, hence this function is not total. I really want akka-typed to be typed and total; please correct me if I'm wrong and this is not part of the goals. What I propose is to have something like this:

def update(door: Door, event: DoorEvent): Folded[Door] = (door, event) match {
  case (Closed, DoorOpened) => Next(Open)
  case (Open, DoorClosed) => Next(Closed)
  case (Closed, DoorClosed) => Impossible
  case (Open, DoorOpened) => Impossible
}

raboof commented 7 years ago

I wholeheartedly agree we want akka-typed to be typed and total!

With persistence, the state is updated after deciding to persist the event, because on recovery (for example after migrating the actor to another node or after a restart/upgrade) the same update function must be used to recreate the state based on the persisted events. For that reason, in your update function you must consider your events as 'things that definitely happened'; it makes no sense to be able to 'deny' the fact that they happened.

In the doors example above, I'd write:

def update(door: Door, event: DoorEvent): Door = event match {
  case DoorOpened => Open
  case DoorClosed => Closed
}

Or perhaps even:

def update(door: Door, event: DoorEvent): Door = event match {
  case DoorOpened =>
    if (door == Open) log.warn("Door was already open, being resilient")
    Open
  case DoorClosed =>
    if (door == Closed) log.warn("Door was already closed, being resilient")
    Closed
}

It seems to me that the fact that you cannot return an Impossible is a feature here rather than a limitation: it's a way to encode in the type system that you cannot 'deny' that an event happened when updating the state. It pressures you to find an implementation where your update function and event modeling are as flexible as possible.

Returning Try[State] might be a way to make that explicit and 'look more functional', but using exceptions for those should-be-rare situations actually seems suitable to me.

That said we are still iterating on this API, so comments like this and further examples are definitely very welcome - but for now I'm still leaning towards liking a total function that must accept each event and produce an updated state, with only exceptions as 'escape hatch'. After all what would you expect to happen when returning Impossible? That should be considered a failure and crash the actor, right?

dispalt commented 7 years ago

I agree with @raboof. I think you'd need to validate the command against the state before saving the event, and that's when it's appropriate to reject.
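As a minimal sketch of that split, using the Door example from this thread: the command is checked against the current state first, and only an accepted command yields an event to persist. `DoorCommand`, `OpenDoor`, `CloseDoor` and `decide` are hypothetical names introduced for illustration, and the `Left` value merely stands in for a rejection reply to the sender.

```scala
object ValidateBeforePersistSketch {
  sealed trait Door
  case object Open extends Door
  case object Closed extends Door

  // Hypothetical commands: they express intent and may be rejected.
  sealed trait DoorCommand
  case object OpenDoor extends DoorCommand
  case object CloseDoor extends DoorCommand

  // Events: facts that are only created once the command was accepted.
  sealed trait DoorEvent
  case object DoorOpened extends DoorEvent
  case object DoorClosed extends DoorEvent

  // Validate the command against the state BEFORE anything is persisted.
  // Left = rejection to report back to the sender, Right = event to persist.
  def decide(door: Door, command: DoorCommand): Either[String, DoorEvent] =
    (door, command) match {
      case (Closed, OpenDoor)  => Right(DoorOpened)
      case (Open, CloseDoor)   => Right(DoorClosed)
      case (Open, OpenDoor)    => Left("door is already open")
      case (Closed, CloseDoor) => Left("door is already closed")
    }

  // Applying an event never rejects: the event has already happened.
  def update(door: Door, event: DoorEvent): Door = event match {
    case DoorOpened => Open
    case DoorClosed => Closed
  }
}
```

With this shape, `decide` is total over commands and `update` is total over events, so the "impossible" pairs are filtered out at the command stage.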

rkuhn commented 7 years ago

This is an example of the difference between types (which are largely syntactic) and semantics: commands express intent whereas events represent irrefutable facts. The door example is one that is often used by Edwin Brady to demonstrate the power of dependent types, but still it remains impractical in general to capture the full semantics within a type signature. One notable downside of pushing the types in this direction is that they lose the power that they derive from being a classification—a type is only useful if it stands somewhere in the middle between a single value and the unconstrained value domain. Single-inhabited types are so precise that they shift the value-level problem fully into the type domain, which has mostly downsides.

notxcain commented 7 years ago

@raboof Of course the example is oversimplified. In my experience there are a lot of these impossible pairs of values.

Developers at work used to ask what to do in these cases: either throw, which we prefer never to do on this layer, or ignore the event by returning unchanged state, which is also bad. So I introduced this Folded data type. Yes, the underlying actor throws, as it means that something really bad is going on. Logging in the apply function is definitely something to avoid by all means. So what to do then?
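For readers following along, a minimal sketch of how such a Folded type and the throwing fold could be defined. Only the names Folded, Next and Impossible come from this thread; their definitions and the `replay` helper are assumptions made for illustration.

```scala
object FoldedSketch {
  // Assumed shape of the Folded data type: either a next state or "impossible".
  sealed trait Folded[+A]
  final case class Next[A](value: A) extends Folded[A]
  case object Impossible extends Folded[Nothing]

  sealed trait Door
  case object Open extends Door
  case object Closed extends Door

  sealed trait DoorEvent
  case object DoorOpened extends DoorEvent
  case object DoorClosed extends DoorEvent

  // User code stays free of exceptions and reports impossible pairs as values.
  def update(door: Door, event: DoorEvent): Folded[Door] = (door, event) match {
    case (Closed, DoorOpened) => Next(Open)
    case (Open, DoorClosed)   => Next(Closed)
    case (Closed, DoorClosed) => Impossible
    case (Open, DoorOpened)   => Impossible
  }

  // Hypothetical stand-in for the underlying actor: it is the one that throws.
  def replay(initial: Door, events: Seq[DoorEvent]): Door =
    events.foldLeft(initial) { (door, event) =>
      update(door, event) match {
        case Next(next) => next
        case Impossible =>
          throw new IllegalStateException(s"Cannot apply $event in state $door")
      }
    }
}
```

The exception still exists in this variant; it has only moved from the user's update function into the infrastructure that runs the fold.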

@dispalt I don't argue with that.

rkuhn commented 7 years ago

@notxcain I don’t agree that returning unchanged state is wrong, and the example is fine to demonstrate semantics: if the door is already open and the event arrives that it has been opened, then of course the door remains open. An event is a fact, deal with it.

You could say that something must have gone wrong somewhere for this to occur, and that may be right, but do you consider it good implementation quality to react by terminating the process instead of carrying on to the best of your abilities? I think this question deserves different answers for local and distributed systems, where local means tight coupling and thus eager failure and distributed means loose coupling and a preference for robustness (i.e. tolerating aberrant but benign behavior of other parts of the system). Here “let it crash” means to eternally fail this actor, it will never properly recover, which goes against the purpose of crash-only software.

notxcain commented 7 years ago

@rkuhn "An event is a fact, deal with it." This obvious statement has triggered something in my perception of the problem 😉 Given that the event is already persisted, all other parts of the system would also have to deal with it. Now I need some time to reconfigure my point of view.

Also I want to quote @patriknw response

That line is just a precaution in case application code is not implemented in that way, to avoid storing events that will later not be possible to replay.

So should this function (E, S) => S ever throw? Are there events that are impossible to replay?

notxcain commented 7 years ago

Another example is an Invoice aggregate whose state is Option[InvoiceState]. What to return for a pair (None, e: InvoicePaid)? None? It's obviously something strange that should be reported somehow. Reporting is an effect. How to deal with that?

rkuhn commented 7 years ago

If the actor decides that the event log is corrupted and that it cannot continue, then it can and should terminate with a failure. In this case, the failure will be permanent since a valid journal implementation will always replay the same sequence of events, leading to deterministic recovery behavior. Someone would have to manually rewrite history to fix this actor, then.

Failure is signaled by throwing an exception, and it should be irrecoverable as far as the actor is concerned.
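The permanence of such a failure can be illustrated with a small sketch based on the Invoice example from this thread. It assumes the journal is just an in-memory sequence; `InvoiceCreated`, the fields of `InvoiceState` and the `recover` helper are hypothetical.

```scala
import scala.util.Try

object DeterministicRecoverySketch {
  sealed trait InvoiceEvent
  final case class InvoiceCreated(amount: BigDecimal) extends InvoiceEvent // hypothetical event
  case object InvoicePaid extends InvoiceEvent

  // Hypothetical state shape; the thread only names the type.
  final case class InvoiceState(amount: BigDecimal, paid: Boolean)

  // Throws when the event log cannot be interpreted, signaling failure.
  def update(state: Option[InvoiceState], event: InvoiceEvent): Option[InvoiceState] =
    (state, event) match {
      case (None, InvoiceCreated(amount)) => Some(InvoiceState(amount, paid = false))
      case (Some(s), InvoicePaid)         => Some(s.copy(paid = true))
      case (s, e) =>
        throw new IllegalStateException(s"Corrupt event log: $e in state $s")
    }

  // Recovery is a fold over the journal; Try only captures the outcome for inspection.
  def recover(journal: Seq[InvoiceEvent]): Try[Option[InvoiceState]] =
    Try(journal.foldLeft(Option.empty[InvoiceState])(update))

  val corrupted: Seq[InvoiceEvent] = Seq(InvoicePaid)
}
```

Because the same journal is replayed through the same function, every call to `recover(corrupted)` fails in exactly the same way. Nothing changes until the history is rewritten or `update` is replaced by a version that handles the sequence.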

raboof commented 7 years ago

Yes - aside from rewriting history, another solution can be to upgrade your application to a version that does handle the until-then-apparently-unexpected sequence of events.

@notxcain you mentioned before that 'logging in the apply function is definitely something to avoid by all means', but in this case where "something is definitely wrong and a person should look at this and decide how to resolve the situation", I think logging a warning might in fact be warranted. You will need to have some kind of (dev)ops workflow in place that makes sure somebody indeed sees this logging in reasonable time, but that's warmly recommended in any case :).

notxcain commented 7 years ago

@rkuhn

If the actor decides that the event log is corrupted and that it cannot continue, then it can and should terminate with a failure. Failure is signaled by throwing an exception, and it should be irrecoverable as far as the actor is concerned.

And that is exactly what I propose. If Impossible is returned - underlying action should throw and terminate.

rkuhn commented 7 years ago

There wouldn’t be any difference to just throwing instead of returning Impossible, right? Throwing an exception is pure as long as it leads to process termination, which is the case here.