ylorph / RandomThoughts

Some Random Thoughts

Expectations for a Store: Should explicitly mention subs not missing or reordering events #6

Open bartelink opened 1 year ago

bartelink commented 1 year ago

in https://github.com/ylorph/RandomThoughts/blob/master/2019.08.09_expectations_for_an_event_store.md

There is:

ability to have subscription

But there is nothing that calls out that it should guarantee never to miss an event if you have a live subscription and there are multiple writers.

Until today I thought this should go without saying (and it's great that this doc is pithy and not full of legalese), but it has come to my attention that, in some cases, delivering that guarantee 100% and/or documenting the likelihood of its absence is in some way debatable.

I believe CosmosDB ChangeFeed, ~DynamoDB Streams,~ and MessageDb category subscriptions guarantee this and document it as such (some digging may be required; I don't have citations). I believe ESDB should guarantee this, but there are far better qualified people than me to make the claim. For others, it gets more confusing; if this list mentioned it explicitly as being significant, it would elevate the need for stores to consider and/or answer the question.


Clarification re DynamoDB streams:

ylorph commented 1 year ago

Yes, detecting missing or out-of-order events can be done if you have a monotonically increasing number on the streams you read. It's a matter of keeping track of the last processed event number on the consumer side.

Some message brokers out there claim to guarantee that without that trick. They only guarantee it up to just before the event enters the consumer, though, so it's a kind of hand-wavy claim (because the consumer might crash), especially as most people don't dig that far into those explanations.

So detection is the best you can do, and for that the easiest solution is for each event in a stream to have that increasing number. To make it work, both the server and the consumer need to play along and follow some rules.
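
To make the consumer-side bookkeeping concrete, here is a minimal Python sketch of such a tracker. It is an illustration only: `StreamTracker`, `Delivery` and the stream name are hypothetical, it assumes the first event of every stream has index 0, and it keeps the checkpoint in memory, whereas a real consumer would need to persist it together with the effects of handling the event (the crash case mentioned above).

```python
from dataclasses import dataclass, field
from enum import Enum


class Delivery(Enum):
    NEXT = "next"            # exactly last + 1: safe to process
    DUPLICATE = "duplicate"  # at or below last: already handled, skip
    GAP = "gap"              # above last + 1: something was missed or reordered


@dataclass
class StreamTracker:
    # Hypothetical consumer-side checkpoint: stream name -> last processed index.
    # Assumes the first event in every stream has index 0.
    last_processed: dict = field(default_factory=dict)

    def observe(self, stream: str, index: int) -> Delivery:
        last = self.last_processed.get(stream, -1)
        if index <= last:
            return Delivery.DUPLICATE
        if index == last + 1:
            # Only an in-sequence event advances the checkpoint.
            self.last_processed[stream] = index
            return Delivery.NEXT
        return Delivery.GAP


tracker = StreamTracker()
results = [tracker.observe("cart-1", i) for i in (0, 1, 1, 3, 2, 3)]
```

In this run the second `1` is classified as a duplicate and the first `3` as a gap; once `2` arrives, the redelivered `3` is accepted as the next event.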

bartelink commented 1 year ago

I agree that every event in a stream needs a monotonically increasing number relative to the other items in the stream

Once you have that, you can describe the guarantees, i.e.:

  1. no gaps, i.e. after event N, I might see many events <=N. Eventually I will see N+1. I will NEVER be presented with event N+2 before I have seen N+1
  2. items getting reordered on some bus can cause such gaps (e.g. event per item fed through DDB streams)
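
The guarantee above can also be phrased as a check over a recorded delivery trace for a single stream, which is the kind of assertion a test harness for a store might make. The following is a sketch under assumptions: `first_violation` is a hypothetical helper, and stream indexes are assumed to start at `start` (0 by default).

```python
from typing import Iterable, Optional


def first_violation(indexes: Iterable[int], start: int = 0) -> Optional[int]:
    """Return the position in the delivery trace of the first event that
    skips ahead of the next expected index, or None if the trace is clean."""
    expected = start  # next index not yet seen; assumes streams start at `start`
    for position, index in enumerate(indexes):
        if index > expected:
            return position  # jumped past N+1: a gap or a reordering
        if index == expected:
            expected += 1    # saw N+1, so advance
        # index < expected is a redelivery, which the guarantee allows
    return None
```

Redeliveries such as `[0, 1, 1, 0, 2, 3]` pass, while `[0, 1, 3, 2]` is flagged at the point where `3` arrives before `2`.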

The point is more about broken impls

If you are relying on a known broken impl, you can make provision:

If OTOH you have a guarantee (based on reasoning about how the impl works, or something like a Jepsen test), you can:

  1. just write your system without triple checking things, writing lots of confusing and long-winded just in case code and falling into programming by coincidence
  2. spend less time analyzing incidents because you can more quickly rule out missing or out of order deliveries - ultimately you might still go and triple check things in the end, but it will go to the back of a long list of things - you trust that it should just work

To put it another way: for stores and projection loops backed by MessageDB, EventStoreDB or CosmosDB, I would never write gap checking logic, or let anyone else do it.

I thought I had the same guarantee for Equinox.DynamoStore, but @epNickColeman discovered there was a weakness. That has been fixed, so I would hold the same position of not writing gap check logic there either.

Each of those stores has a monotonically increasing event index at stream level (and its presence, and there not being gaps in it, should be a universal requirement; it's not a lot to ask).

For other stores with public code that I'm aware of which don't guarantee both delivery and ordering (there are multiple SQL-backed ones and one DynamoDB-backed one that I'm aware of; I'm not going to list them, as the goal here is that others can discuss it later wrt an agreed baseline expectation definition), I would expect/need: a) that they can furnish a stream level position so I can check for a gap/out of order delivery (that's pretty normal) b) that there could be a potential gap or out of order delivery

Does the 'spec' demand no gaps in the event sequence numbers at stream level? If not, I think it should, as a) gap detection and/or out of order delivery checks require a guarantee of no gaps in the source data per the above b) I don't think there are many that don't provide it, and it should always be realizable