ylorph / The-Inevitable-Event-Centric-Book

Some day, someone might write an authoritative book about this aspect. Let's call that inevitable book Event Centric as a placeholder title (this is a quote...)

RFC: Invariants across multiple streams / consistency across multiple streams #55

Open MerrionComputing opened 3 years ago

MerrionComputing commented 3 years ago

Are there any resources you would recommend on dealing with invariants that involve multiple streams?

Thoughts:-

MerrionComputing commented 3 years ago

You can lock all the streams involved whilst you check the invariant

This is fragile because if that process falls over it can leave locks in place... so we need to have some way of detecting orphan locks and cleaning them up.
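One common way to detect orphan locks is to give every lock a lease that expires. The following is only a minimal in-memory sketch of that idea: `LeaseLocks`, its `ttl_seconds` parameter and the injected `clock` are hypothetical names, and a real system would keep the lock table in shared storage.

```python
import time


class LeaseLocks:
    """In-memory stand-in for a lock table; every lock carries an expiry time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be simulated
        self.locks = {}  # stream_id -> (owner, expires_at)

    def acquire_all(self, owner, stream_ids):
        """Take every lock or none. Expired locks count as free (orphan cleanup)."""
        now = self.clock()
        for stream_id in stream_ids:
            held = self.locks.get(stream_id)
            if held and held[0] != owner and held[1] > now:
                return False  # a live lock is held by someone else
        for stream_id in stream_ids:
            self.locks[stream_id] = (owner, now + self.ttl)
        return True

    def release_all(self, owner, stream_ids):
        """Release only the locks this owner still holds."""
        for stream_id in stream_ids:
            held = self.locks.get(stream_id)
            if held and held[0] == owner:
                del self.locks[stream_id]
```

A lease only bounds how long an orphan lock can survive. A holder that is slow rather than dead can still write after its lease has expired, so the write itself would also need a guard of some kind.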

MerrionComputing commented 3 years ago

You can use a specific stream that represents the "transaction" and use saga-like operations to reliably undo any partial updates

This is a lot of extra work, and there is a further problem if the intermediate state has already been acted upon: can you reverse those actions when you undo the state?
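As a rough sketch of the saga-like approach, assuming each step comes with a compensating action: `run_saga`, `SagaFailed` and the event names below are made up for illustration, and the plain list `transaction_log` stands in for the "transaction" stream.

```python
class SagaFailed(Exception):
    pass


def run_saga(transaction_log, steps):
    """Run (name, action, compensate) steps in order.

    If a step fails, undo the completed steps in reverse order and record
    each outcome in the transaction log.
    """
    done = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            transaction_log.append(("failed", name))
            for done_name, undo in reversed(done):
                undo()
                transaction_log.append(("compensated", done_name))
            raise SagaFailed(name) from exc
        transaction_log.append(("completed", name))
        done.append((name, compensate))
    transaction_log.append(("committed", None))


# Example: the credit step fails, so the debit is reversed.
streams = {"account-a": [], "account-b": []}
log = []


def debit():
    streams["account-a"].append("Debited")


def undo_debit():
    streams["account-a"].append("DebitReversed")


def credit():
    raise RuntimeError("account-b is closed")


try:
    run_saga(log, [("debit", debit, undo_debit), ("credit", credit, lambda: None)])
except SagaFailed:
    pass
```

Note that the compensation appends a new event rather than deleting one, so anything that reacted to `Debited` in the meantime has still seen it. That is exactly the intermediate-state problem described above.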

MerrionComputing commented 3 years ago

You can just fall back on the transaction implementation of an actual RDBMS

Database transactions are the mortal enemy of distributed systems. If you use them you stand to lose a major benefit of event sourced systems.
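For comparison, this is roughly what the RDBMS fall-back looks like, sketched with Python's built-in `sqlite3` module. The `events` table layout and the `append_atomically` helper are assumptions made for the example, not any particular library's schema.

```python
import sqlite3


def connect():
    db = sqlite3.connect(":memory:")
    # The primary key doubles as the optimistic concurrency check:
    # two writers cannot both record the same version of a stream.
    db.execute(
        "CREATE TABLE events ("
        " stream_id TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " payload TEXT NOT NULL,"
        " PRIMARY KEY (stream_id, version))"
    )
    return db


def append_atomically(db, new_events):
    """Record (stream_id, version, payload) rows for several streams, all or nothing."""
    with db:  # commits on success, rolls back if any insert fails
        db.executemany("INSERT INTO events VALUES (?, ?, ?)", new_events)
```

If any one of the rows conflicts, none of them is recorded, so a check made against the current versions of both streams still holds at the moment of writing.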

johnbywater commented 3 years ago

I don't yet know how to express the following in general terms, but for example a common "invariant" such as "the name of an aggregate must be unique in the domain model" can be implemented by a deterministic mapping (e.g. version 5 UUID) of the name to the identity of a "name index aggregate" that is of the same form as the aggregate (e.g. UUID), and then that index aggregate can be checked to see if the name is in use or not. This can be combined with atomic recording of the index aggregate and the named aggregate, so that aggregates don't ever get recorded as having a name that isn't unique.

I suppose this technique could be extended to support other kinds of invariants?
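A minimal sketch of that idea, assuming a store that can record several new streams atomically. `Store`, `register`, the namespace value and the name normalisation are all hypothetical choices for the example; only `uuid.uuid5` and `uuid.uuid4` are real standard-library calls.

```python
import uuid

# Hypothetical namespace for name-index aggregates.
NAME_INDEX_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "/name-index")


class StreamAlreadyExists(Exception):
    pass


def name_index_id(name):
    """Deterministically map a name to the ID of its index aggregate."""
    # Normalising case and whitespace is an assumption, not part of the technique.
    return uuid.uuid5(NAME_INDEX_NAMESPACE, name.strip().lower())


class Store:
    """In-memory stand-in for a store with atomic multi-stream recording."""

    def __init__(self):
        self.streams = {}

    def record_new_streams(self, new_streams):
        """Create all the given streams, or none if any of them already exists."""
        for stream_id in new_streams:
            if stream_id in self.streams:
                raise StreamAlreadyExists(stream_id)
        self.streams.update(new_streams)


def register(store, name):
    aggregate_id = uuid.uuid4()
    store.record_new_streams({
        name_index_id(name): [("NameReserved", aggregate_id)],
        aggregate_id: [("Created", name)],
    })
    return aggregate_id
```

Because the index ID is derived from the name, a second registration of the same name collides on the index stream, and the named aggregate is never recorded on its own.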

Narvalex commented 3 years ago

> I don't yet know how to express the following in general terms, but for example a common "invariant" such as "the name of an aggregate must be unique in the domain model" can be implemented by a deterministic mapping (e.g. version 5 UUID) of the name to the identity of a "name index aggregate" that is of the same form as the aggregate (e.g. UUID), and then that index aggregate can be checked to see if the name is in use or not. This can be combined with atomic recording of the index aggregate and the named aggregate, so that aggregates don't ever get recorded as having a name that isn't unique.
>
> I suppose this technique could be extended to support other kinds of invariants?

EventStoreDB does not support atomic recording of more than ONE stream. How would you deal with that?

johnbywater commented 3 years ago

> EventStoreDB does not support atomic recording of more than ONE stream. How would you deal with that?

I suppose you would have to check before and then stop, and check after and then undo. Or use another database that does support this? Cassandra, with its LWTs, also doesn't. I'm not sure if AxonDB does or doesn't support this. Same for DynamoDB, but the impression I have is that it does. It would be nice to have a table showing what is possible with each DB.

Finding out whether EventStoreDB does or doesn't support this was something I asked about for a long time, without getting a clear answer. But I think you are right that it doesn't support this.

I was told by the director that there has been some discussion at EventStoreDB about supporting this, and I'm not sure whether the current plan is to implement it or not. I was told the team was looking closely into this and had identified only one use case in which it would be strictly needed, but I don't know what that use case is. I think in many cases, for many line-of-business applications, the volume and velocity of domain events can be supported perfectly well with an RDBMS. Greg Young has made this point in several talks on YouTube.

I suppose that's why my focus with event sourcing tends to be on approaches that keep the application code relatively straightforward: approaches that extend the consistency boundary to include new domain events from more than one aggregate, and other things too. That opens up possibilities that are excluded when operating within the relatively severe constraints that are perhaps (or perhaps not) necessary at unusually large volumes and velocities of domain events. EventStoreDB seems to be dedicated to these unusual situations of high volume and velocity, where you have to think very hard about how to make things work.

So I have several times contributed solutions like the above in discussions like this, partly to suggest that event sourcing in itself isn't merely what you can do with EventStoreDB. The technical difficulties you are forced into when trying to accomplish things EventStoreDB doesn't support can be avoided by dropping the assumption that you must accept its constraints in order to do event sourcing at all. At the same time, I acknowledge that these constraints do serve certain situations, and in those cases you do need to get involved in those technical difficulties.

So basically, I'm just broadening the discussion, to include "small data" and perhaps "medium data", and in doing so I hope neither you (nor anybody else) minds too much. :-)

edblackburn commented 2 years ago

You've got your boundary wrong if you need to enforce your invariant across multiple streams. If you can afford the latency, you can build a new stream from a projection that combines others. But if you're trying to enforce invariants, you need strong consistency, which means you need all your data in the same stream.
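To illustrate why a single stream gives strong consistency, here is a small sketch of an expected-version append. The `Stream` class, `reserve_seat` and the seat example are invented for illustration; real stores expose the same idea through their own APIs.

```python
class WrongExpectedVersion(Exception):
    pass


class Stream:
    """In-memory stand-in for one stream with optimistic concurrency control."""

    def __init__(self):
        self.events = []

    def append(self, expected_version, new_events):
        # Reject the write if the stream changed since the caller read it.
        if len(self.events) != expected_version:
            raise WrongExpectedVersion(
                f"expected {expected_version}, found {len(self.events)}"
            )
        self.events.extend(new_events)


def reserve_seat(stream, seat):
    """Invariant: a seat is reserved at most once. All the data is in one stream."""
    version = len(stream.events)
    taken = {value for kind, value in stream.events if kind == "SeatReserved"}
    if seat in taken:
        raise ValueError(f"seat {seat} is already reserved")
    stream.append(version, [("SeatReserved", seat)])
```

The check and the write are tied together by the version number: a decision based on a stale read is rejected at append time, so it can never be recorded.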

Narvalex commented 2 years ago

> You've got your boundary wrong if you need to enforce your invariant across multiple streams. If you can afford the latency, you can build a new stream from a projection that combines others. But if you're trying to enforce invariants, you need strong consistency, which means you need all your data in the same stream.

If you could model all your transactional needs in one stream, then sagas/process managers and multi-stream transactions wouldn't exist in event-sourced systems. But they exist for a reason.

edblackburn commented 2 years ago

Exactly my point. Your boundary is wrong if you can't fit all the variables you need into a single transaction. If you can't increase the size of the boundary, you'll have to look at other patterns that permit eventual consistency. Still, eventual consistency means you're liable to race conditions, so you are not enforcing the invariant; you're giving it your best effort.

Narvalex commented 2 years ago

With proper locks in place, there is no need to accept the race conditions that can lead to an eventually inconsistent state. Best effort can cause big headaches when it is applied in critical areas. Do not fear locks, as long as they are applied carefully and only to the entities/streams that matter. The peace of mind and safety they add are worth more than a small performance gain.