artsy / atomic-store

Atomic event store for Scala/Akka
MIT License

A possible new architecture #20

Open acjay opened 7 years ago

acjay commented 7 years ago

It may be possible to simplify this project further. Today, the EventLog actor replies to its sender for validation, but it seems like this may be unnecessary.

**Naive case** Assuming the validator is static, the atomic store could be instantiated with a validator of a type such as `(CurrentEvent, PastEvents) => ValidationResult`. If this were the case, the EventLog could make its persistence decisions without needing any external communication, because the decision logic would be injected at startup time.
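To make the naive case concrete, here's a minimal stdlib-only sketch of such a static validator. The `CurrentEvent`/`ValidationResult` types are hypothetical stand-ins mirroring the signature above, not types from this repo:

```scala
object ValidatorSketch {
  // Hypothetical event and result types, for illustration only.
  final case class CurrentEvent(name: String, amount: Int)
  sealed trait ValidationResult
  case object Accepted extends ValidationResult
  final case class Rejected(reason: String) extends ValidationResult

  // A static validator injected at startup: a pure function, so the
  // EventLog can decide whether to persist without any messaging.
  val nonNegativeBalance: (CurrentEvent, Seq[CurrentEvent]) => ValidationResult =
    (event, past) => {
      val balance = past.map(_.amount).sum + event.amount
      if (balance >= 0) Accepted else Rejected(s"balance would be $balance")
    }
}
```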

**Plus async** That assumes that validation can be done synchronously. More generally, the type would be more like `(CurrentEvent, PastEvents) => Future[ValidationResult]`. In this case, the stateful stashing behavior of the EventLog would still need to be retained, in order to guarantee atomicity of validation+persistence.
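To illustrate why statefulness is still needed: the essence of the stashing behavior is that each command's validate-then-persist step must finish before the next one starts. A stdlib-only sketch (no Akka), serializing submissions by chaining each one onto the previous `Future`:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object AsyncValidatorSketch {
  // Hypothetical async validator: (event, pastEvents) => Future[Boolean].
  type Validator = (Int, Seq[Int]) => Future[Boolean]

  // Chains each command onto the previous one, playing the role that
  // stashing does in the EventLog: no command is ever validated
  // against a stale view of the log.
  final class SerializedLog(validator: Validator) {
    private var events = Vector.empty[Int]
    private var tail: Future[Any] = Future.successful(())

    def submit(event: Int): Future[Boolean] = synchronized {
      val result = tail.flatMap { _ =>
        validator(event, events).map { ok =>
          if (ok) events :+= event // persist only after validation passes
          ok
        }
      }
      tail = result
      result
    }
  }
}
```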

**Plus auxiliary input data** It may be desirable to have some data taken into account for validation, yet a different representation actually persisted as an event. For example, in canonical Event Sourcing, commands are requests to change state, and events are a record of what happened. Perhaps it would be more general to define the interface as `(Command, PastEvents) => Future[Seq[Event]]`. This generalizes over the concept of auxiliary input data.

**Plus auxiliary output data** The validation process has knowledge of the change in state effected by a command. It may be desirable to propagate some information out about the validation process (e.g. errors) or the resulting changes. So perhaps the most general validator interface would be `(Command, PastEvents) => Future[(Seq[Event], AdditionalData)]`.
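Putting the last two generalizations together, here's a sketch of the `(Command, PastEvents) => Future[(Seq[Event], AdditionalData)]` shape, using hypothetical bank-account types as stand-ins:

```scala
object GeneralValidatorSketch {
  import scala.concurrent.Future

  // Hypothetical command/event split: commands request a change,
  // events record what actually happened.
  sealed trait Command
  final case class Deposit(amount: Int) extends Command
  final case class Withdraw(amount: Int) extends Command
  final case class Event(delta: Int)
  // The AdditionalData here is the resulting balance plus any error.
  final case class Outcome(balance: Int, error: Option[String])

  // The most general interface from the discussion above.
  def validate(command: Command, past: Seq[Event]): Future[(Seq[Event], Outcome)] = {
    val balance = past.map(_.delta).sum
    command match {
      case Deposit(a) =>
        Future.successful((Seq(Event(a)), Outcome(balance + a, None)))
      case Withdraw(a) if a <= balance =>
        Future.successful((Seq(Event(-a)), Outcome(balance - a, None)))
      case Withdraw(a) =>
        // Rejected: no events persisted, error reported as output data.
        Future.successful((Nil, Outcome(balance, Some(s"insufficient funds for $a"))))
    }
  }
}
```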

**Side objective: split out atomicity** But it occurs to me that this behavior, atomic processing of incoming commands, is actually independent of the storage concern. Could this behavior be factored into a mixin or wrapper actor? If so, the actual persistence part could potentially be written basically the same as in the naive case. With care for handling buffering, this could be a very generic way to allow actors to cope with async APIs in a way that also preserves the actor framework's rule of sequential message processing.

It occurs to me that this sounds a lot like a Reactive Stream with a capacity of 1 before backpressuring. I think this may be a red herring though, because the actor would still need to be a Cluster Singleton (for atomicity) and accessible via Remoting, neither of which fit the Akka Streams model.

Requester might be an example to look at for how to make such behavior modular, although it isn't designed to be atomic.

See also Fun.CQRS for an attempt to model the pattern in types.

acjay commented 7 years ago

I've been working on further thoughts about how this would look in implementation. I'm interested in seeing whether it's possible to factor the functionality of Atomic Store into smaller, more generic, cohesive pieces. This could involve a lot less protocol than today's Atomic Store.

Another thing that is codified here is the use of async method API facades (i.e. Future-returning methods) instead of exposing actor refs. I think this is more type-safe, more encapsulated, and less error-prone. But to take advantage of actor supervision, all the pieces are still implemented with actors and set up so that the supervision chain remains intact.

```scala
// Interaction would be through an async interface,
// rather than explicit message passing.
trait Log[EventType] {
  def persistenceId: String
  def read: Future[Seq[EventType]]
  def append(events: Seq[EventType]): Future[Done]
}

object Log {
  def apply[EventType](persistenceId: String)(implicit actorRefFactory: ActorRefFactory): Log[EventType] = ???
}
```
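For illustration, here's a stdlib-only stand-in for this interface (with `Unit` in place of `akka.Done`, so the sketch runs without an Akka dependency). A real implementation would back `read`/`append` with an actor created via the `ActorRefFactory`:

```scala
object InMemoryLogSketch {
  import scala.concurrent.Future

  // Same shape as the Log trait above, minus the Akka types.
  trait Log[EventType] {
    def persistenceId: String
    def read: Future[Seq[EventType]]
    def append(events: Seq[EventType]): Future[Unit]
  }

  // Naive in-memory implementation, for illustration only.
  def inMemory[EventType](id: String): Log[EventType] = new Log[EventType] {
    private var events = Vector.empty[EventType]
    val persistenceId = id
    def read: Future[Seq[EventType]] = synchronized(Future.successful(events))
    def append(es: Seq[EventType]): Future[Unit] = synchronized {
      events ++= es
      Future.successful(())
    }
  }
}
```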

Then the whole machine would be:

```scala
// General purpose utility for allowing actors to process
// messages asynchronously in strict sequence, since the
// actor model only provides this guarantee synchronously.
def atomicFlow[Input, Output](
  bufferSize: Int,
  bufferOverflowStrategy: OverflowStrategy
)(
  f: Input => Future[Output]
)(
  implicit
  mat: ActorMaterializer,
  ec: ExecutionContext
): ActorRef = {
  Source
    .actorRef[(Input, ActorRef)](bufferSize, bufferOverflowStrategy)
    .mapAsync(1) { case (input, replyTo) => f(input).map(output => (output, replyTo)) }
    .to(Sink.foreach { case (output, replyTo) => replyTo ! output })
    .run()
}

class AtomicEventLog[CommandType, EventType, OutputType](
  persistenceId: String,
  validator: (CommandType, Seq[EventType], Seq[EventType] => Future[Done]) => Future[OutputType]
) extends Actor {
  implicit val materializer = ActorMaterializer()
  import context.dispatcher

  val log = Log[EventType](persistenceId)(context)
  val atomicFlowActorRef = atomicFlow(10, OverflowStrategy.dropNew) { command: CommandType =>
    log.read.flatMap(pastEvents => validator(command, pastEvents, log.append))
  }

  // Need to figure out how to apply timeout rules and
  // to ensure `atomicFlow` failures are supervised by
  // this actor.
  // TODO: How to make sure that timeouts cancel work that is
  // mid-stream?

  def receive = {
    case command: CommandType => atomicFlowActorRef ! ((command, sender()))
    case EventsForId(persistenceId) => log.read pipeTo sender()
  }
}

object MyAtomicStore extends AtomicStore[CommandAndConfig]({
  case ((command, config), pastEvents, appendToLog) =>
    async {
      val calculator = Calculator(config, pastEvents)
      val (newEvents, additionalData) = await(calculator.validateCommand(command))
      await(appendToLog(newEvents))
      (newEvents, additionalData)
    }
})

def processCommandForId(command: Command, scope: String) = async {
  val config = await(configForId(scope))
  await(atomicStore.processCommand((command, config), scope))
}
```
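The `async`/`await` blocks above come from scala-async; the same composition can be written with a plain for-comprehension. A runnable sketch, with hypothetical stand-ins for `configForId` and `atomicStore.processCommand`:

```scala
import scala.concurrent.{ExecutionContext, Future}

object ComposeSketch {
  // Hypothetical stand-ins for configForId and the atomic store call.
  def configForId(scope: String): Future[String] =
    Future.successful(s"config-$scope")

  def processCommand(commandAndConfig: (String, String), scope: String): Future[String] =
    Future.successful(s"processed ${commandAndConfig._1} with ${commandAndConfig._2} in $scope")

  // Same shape as processCommandForId above, written as a
  // for-comprehension instead of async/await.
  def processCommandForId(command: String, scope: String)(implicit ec: ExecutionContext): Future[String] =
    for {
      config <- configForId(scope)
      result <- processCommand((command, config), scope)
    } yield result
}
```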