gazette / core

Build platforms that flexibly mix SQL, batch, and stream processing paradigms
https://gazette.dev
MIT License
718 stars 52 forks source link

consumers: end-to-end exactly once semantics by default #210

Closed jgraettinger closed 5 years ago

jgraettinger commented 5 years ago

Scope is to transition consumers from at-least-once to exactly-once, by default. This builds off of the approach and work undertaken with liveramp/factable

Basic implementation sketch:

Where today only read-through offsets are check-pointed with consumer transactions, for exactly-once semantics a shard checkpoint will include:

I'm also targeting improvements which allow shards to opt-out of recovery logs altogether, utilizing an external transactional store instead. I think it's possible for transactions to remain fully pipelined in this case also, which would be awesome for performance.

adayNU commented 5 years ago

I'm also targeting improvements which allow shards to opt-out of recovery logs altogether, utilizing an external transactional store instead.

👏 👏 👏

jgraettinger commented 5 years ago

Done!