jamii / streaming-consistency

Demonstrations of (in)consistency in various streaming systems.
https://scattered-thoughts.net/writing/internal-consistency-in-streaming-systems/

Akka? #2

Open leviramsey opened 3 years ago

leviramsey commented 3 years ago

Great set of articles!

I'm a little struck by the exclusion of the Akka ecosystem (partly because I've been involved in migrating a large Spark batch pipeline to real-time streaming with Akka). Akka Streams + Persistence together comprise a low-level library that's more akin (in terms of abstraction level) to differential-dataflow.

I'm game to implement this in Akka if there's interest in inclusion.

jamii commented 3 years ago

I had a quick look and don't see any support for joins or for retraction-aware aggregates. It looks like it's in the same boat as storm/spark/samza etc - you could build a library for time-varying collections on top of it, but then that's the library we should be testing, not the underlying layer.
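For readers unfamiliar with the term, here is a rough sketch of what "retraction-aware" means for an aggregate. It is plain Python, not code from this repo or from any of the systems mentioned, and the names `RetractableSum`, `update` and `result` are hypothetical. It assumes each update carries a `diff` of +1 (insert a row) or -1 (retract a previously inserted row), loosely in the style of differential dataflow.

```python
from collections import defaultdict


class RetractableSum:
    """Per-key sum over a stream of (key, value, diff) updates.

    Hypothetical illustration: diff = +1 inserts a row,
    diff = -1 retracts a row that was inserted earlier.
    """

    def __init__(self):
        self.totals = defaultdict(int)  # key -> current sum
        self.counts = defaultdict(int)  # key -> number of live rows

    def update(self, key, value, diff):
        # A retraction subtracts exactly what the insertion added.
        self.totals[key] += value * diff
        self.counts[key] += diff
        # Once every row for a key has been retracted, drop the key
        # so the output no longer contains a stale group.
        if self.counts[key] == 0:
            del self.totals[key]
            del self.counts[key]

    def result(self):
        return dict(self.totals)


agg = RetractableSum()
agg.update("alice", 10, +1)
agg.update("alice", 5, +1)
agg.update("alice", 10, -1)  # the row with value 10 is retracted
print(agg.result())
```

An append-only running sum has no way to express the third update, so a library built on a plain delivery layer would have to supply this bookkeeping itself.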

leviramsey commented 3 years ago

One would idiomatically implement both in Akka using actors (and event sourcing). It's a somewhat different model from those here, which is part of why it might be interesting.

jamii commented 3 years ago

I agree. My point is just that akka itself doesn't provide most of the features that I'm testing in this repo. If you wanted to write a library on top of akka that provides consistent operations on time-varying collections, then we could test that and talk about its properties. But akka itself would just be a delivery layer, and whether or not your code was consistent would depend on what code you write and not on what facilities akka provides. By contrast, code written for ksqldb or flink tables is always eventually consistent, and code for differential dataflow is always internally and eventually consistent. "Is akka consistent?" is not a question we can usefully answer because it's a lower-level layer.