Hi @berkes, thanks for the message. The optimistic lock is meant only to ensure that the same aggId-sequence value does not get duplicated in the DB. The CQRS framework should guarantee that the attempt is made with the next available sequence number.
I may be missing the potential issue though (a not uncommon issue). What is the failure mode that you're worried about here? And what are your thoughts on the fix?
I came across this when implementing PersistedEventRepository for SQLite and an in-memory store.
I haven't encountered any issues in practice because of this; it's purely theoretical.
One case where the issue might be problematic in practice is when there are holes in the sequences. I guess that's simply solved by "ensuring there are no holes in the sequences", though 😉.
When we have events 1, 2, 99 and 100, we don't guarantee that the next event must be 100+1, only that its sequence number did not occur yet; so after 100, we could have events 3, 4, etc.
In other CQRS and ES systems that I worked on, the optimistic locking ensured that "the next event ID as provided by the requester-to-store must match the next event ID as the storage thinks it should be". E.g. by opening a transaction, incrementing a counter, checking the provided event ID against that incremented counter, and, if it matches, inserting the event and committing the transaction.
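To make that concrete, here is a minimal in-memory sketch of that pattern. This is not the cqrs-es API; the store, the error type and the `append` signature are made up for illustration. The mutex stands in for the database transaction: reading the current highest sequence and writing the new event happen atomically, and anything other than "highest + 1" is rejected.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Minimal in-memory event store illustrating the "strict next sequence" check.
/// `append` only succeeds when the supplied sequence is exactly one greater
/// than the highest sequence already stored for that aggregate.
struct InMemoryEventStore {
    // aggregate_id -> list of (sequence, payload)
    events: Mutex<HashMap<String, Vec<(usize, String)>>>,
}

#[derive(Debug)]
struct OptimisticLockError {
    expected: usize,
    got: usize,
}

impl InMemoryEventStore {
    fn new() -> Self {
        Self { events: Mutex::new(HashMap::new()) }
    }

    fn append(&self, aggregate_id: &str, sequence: usize, payload: String)
        -> Result<(), OptimisticLockError>
    {
        // The lock plays the role of the DB transaction: read and write are atomic.
        let mut events = self.events.lock().unwrap();
        let stream = events.entry(aggregate_id.to_string()).or_default();
        let expected = stream.last().map(|(seq, _)| seq + 1).unwrap_or(1);
        if sequence != expected {
            return Err(OptimisticLockError { expected, got: sequence });
        }
        stream.push((sequence, payload));
        Ok(())
    }
}

fn main() {
    let store = InMemoryEventStore::new();
    store.append("agg-1", 1, "created".into()).unwrap();
    store.append("agg-1", 2, "renamed".into()).unwrap();
    // Sequence 4 is rejected: the store expected 3 next.
    match store.append("agg-1", 4, "out of order".into()) {
        Err(e) => println!("rejected: expected {}, got {}", e.expected, e.got),
        Ok(_) => unreachable!("sequence 4 should have been rejected"),
    }
}
```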
So my title "strange behaviour" is actually wrong: I should have called it something like "this is not how I expected it to work" 😄
I'm happy to close this thread if this is "by design", which I can understand. It was just not what I was expecting.
Ah, I see. This should never happen, but if it does it becomes a problem. Of course, in production I've found that "should never" many times means "usually will", due to a bug or some other reason (e.g. an engineer finds that a quick way to fix a state problem in production is to inject a certain event, but fumbles getting the sequence correct).
It seems like we might need a couple of things here:
TBH I haven't seen this transaction pattern before; I'd love your thoughts on the best way to do this.
While investigating how the optimistic locking is enforced, I found that apparently it isn't enforced.
I'm not sure if that's by design, an oversight or a todo.
Looking at the test https://github.com/serverlesstechnology/postgres-es/blob/main/src/event_repository.rs#L384C13-L384C33 we see that it is tested for. However, that test gives a false positive, as it fails on a duplicate index (aggregate_id=x, sequence=2) and not on the fact that 2 comes after 3. Reading through the implementation, I see nothing that checks that sequences must be higher than what is already in the database.
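To illustrate the distinction, here is a small self-contained sketch (not the actual test or schema) of what a UNIQUE (aggregate_id, sequence) constraint alone can and cannot catch:

```rust
use std::collections::HashSet;

/// Models what a UNIQUE (aggregate_id, sequence) constraint alone enforces:
/// duplicates are rejected, but nothing requires new sequences to be higher
/// than what is already stored.
fn duplicate_only_insert(
    seen: &mut HashSet<(String, usize)>,
    aggregate_id: &str,
    sequence: usize,
) -> bool {
    seen.insert((aggregate_id.to_string(), sequence))
}

fn main() {
    let mut seen = HashSet::new();
    for seq in [1, 2, 99, 100] {
        assert!(duplicate_only_insert(&mut seen, "agg-1", seq));
    }
    // The case the linked test covers: a duplicate sequence is rejected.
    assert!(!duplicate_only_insert(&mut seen, "agg-1", 2));
    // The unchecked case: sequence 3 is accepted even though 100 already exists.
    assert!(duplicate_only_insert(&mut seen, "agg-1", 3));
}
```

The linked test only exercises the first failing case (a duplicate sequence); the second case, an out-of-order but unused sequence, sails through.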
Is the optimistic locking merely meant to protect against a sequence number (in combination with an aggregate_id) occurring twice? Or should it protect against sequence numbers being lower than what has already been received?
If wanted, I can whip up a patch that introduces such a protection, but the actual implementation then needs some thought (to avoid race conditions between read-current-sequence and write-new-sequence). Please let me know if this is desired.
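For what it's worth, one way to avoid that read-then-write race is to let the database evaluate the "must be exactly highest + 1" rule inside the insert itself. Below is a rough sketch, assuming an sqlx-based repository (with the `postgres` and `json` features) and an `events` table with `aggregate_type`, `aggregate_id`, `sequence` and `payload` columns; the table layout, the function and the error mapping are assumptions for illustration, not the current postgres-es code.

```rust
use sqlx::postgres::PgPool;

/// Hypothetical guard: inserts the event only when `sequence` is exactly one
/// greater than the highest sequence currently stored for the aggregate.
/// Table and column names are assumed, not taken from the actual schema.
async fn append_with_sequence_guard(
    pool: &PgPool,
    aggregate_type: &str,
    aggregate_id: &str,
    sequence: i64,
    payload: serde_json::Value,
) -> Result<(), sqlx::Error> {
    let result = sqlx::query(
        r#"
        INSERT INTO events (aggregate_type, aggregate_id, sequence, payload)
        SELECT $1, $2, $3, $4
        WHERE (
            SELECT COALESCE(MAX(sequence), 0)
            FROM events
            WHERE aggregate_type = $1 AND aggregate_id = $2
        ) = $3 - 1
        "#,
    )
    .bind(aggregate_type)
    .bind(aggregate_id)
    .bind(sequence)
    .bind(payload)
    .execute(pool)
    .await?;

    if result.rows_affected() == 0 {
        // No row inserted: the supplied sequence was not the next one.
        // A real implementation would map this to an optimistic-lock error
        // instead of reusing RowNotFound as a placeholder.
        return Err(sqlx::Error::RowNotFound);
    }
    Ok(())
}
```

Under the default READ COMMITTED isolation two concurrent writers could still both pass the MAX check for the same sequence, but the existing unique index on (aggregate_id, sequence) then rejects the second insert, so the two guards would complement each other rather than replace one another.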