RBMHTechnology / eventuate

Global-scale event sourcing and event collaboration with causal consistency. (This project is in maintenance mode: only critical bugs will be fixed, and there is no further feature development.)
http://rbmhtechnology.github.io/eventuate/
Apache License 2.0

Causal stream merge stage #342

Open krasserm opened 7 years ago

krasserm commented 7 years ago

Provide an Akka Streams stage that merges n input streams (produced by DurableEventSources) while preserving causality, i.e. the merged stream should still be a valid linear extension of the potential causality relation of DurableEvents. Merging can be done with linear complexity by scanning each input stream only once.

Causal stream merge is relevant when merging events from two or more sources that causally depend on each other. For example, if events from log A have been processed into log B, B causally depends on A. A plain merge of A and B would ignore this causal relationship and produce a stream in which effects may appear before their causes. Causal stream merge preserves causality and guarantees that effects only appear after their causes in the merged stream.
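
The difference between a plain merge and a causal merge can be illustrated with a small check on vector timestamps. This is only a sketch under stated assumptions: vector timestamps are modelled as plain dictionaries from a hypothetical log id to a counter, and `happened_before` and `preserves_causality` are illustrative helper names, not part of the Eventuate API.

```python
from typing import Dict, List

VectorTime = Dict[str, int]


def happened_before(a: VectorTime, b: VectorTime) -> bool:
    """True if a is causally before b (a <= b in every entry, a != b)."""
    keys = a.keys() | b.keys()
    return (all(a.get(k, 0) <= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) < b.get(k, 0) for k in keys))


def preserves_causality(stream: List[VectorTime]) -> bool:
    """True if no event appears in the stream before one of its causes."""
    for i, earlier in enumerate(stream):
        for later in stream[i + 1:]:
            if happened_before(later, earlier):
                return False
    return True


a1 = {"A": 1}           # event written to log A
b1 = {"A": 1, "B": 1}   # event in log B, produced by processing a1

plain_merge_result = [b1, a1]    # one possible outcome of a plain merge
causal_merge_result = [a1, b1]   # the order a causal merge must produce

print(preserves_causality(plain_merge_result))
print(preserves_causality(causal_merge_result))
```

The quadratic check is for illustration only; it verifies an already merged stream and is not a merge algorithm itself.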

A use case is processing a stream, merged from A and B, with a single stateful DurableEventProcessor that writes processed events to target log C. Without a causal stream merge, the stream processing setup would be equivalent to two parallel processing pipelines, one from A to C and the other from B to C, each having its own processor instance. This, however, breaks state computation, as state cannot be shared between processor instances.

An application could still decide to ignore the causal relationship of events from logs A and B by using a plain merge. The generated vector timestamps would then simply indicate concurrent processing, which is completely fine for some use cases. However, applications that want to do sequential processing of merged streams with a single processor should use causal stream merge, especially if the processor is stateful and its processing logic relies on causal event delivery.

Causal stream merge works for arbitrary processing networks, including cyclic ones. The processing network may overlap with replication networks as long as the replication connections are not filtered. Filtered replication connections may break the causal merge algorithm.