prooph / event-sourcing

Provides basic functionality for event sourced aggregates.
http://getprooph.org
BSD 3-Clause "New" or "Revised" License
264 stars 42 forks source link

How to check if an aggregate ID exists? #79

Closed enumag closed 6 years ago

enumag commented 6 years ago

Very often when handling a command I need to check if an aggregate with the given ID exists. I'm mostly using AggregateRepository::getAggregateRoot() for that purpose but in my opinion it is not a good method for this case. I don't need the aggregate itself or any of its events after all so getAggregateRoot is actually doing a lot of useless work.

I was unable to find any more suitable method for this. Can you tell me how do you solve this in your applications? Do you think introducing a new method to AggregateRepository would be a good idea? Any tip how to implement it? EventStore doesn't seem to have a good method for this either (hasStream doesn't help because I don't use one-stream-per-aggregate strategy).

enumag commented 6 years ago

Note: Using projections for this seemed like a good idea at first but it causes problems with our import where we need to run thousands of commands as quickly as possible. If every command has to wait until projections are updated to pass an existence check then the import gets really slow. Therefore I'd like to avoid using projections for these checks.

prolic commented 6 years ago

why not use an array of ids inmemory for import? or store them in memcached, redis, .. ?

enumag commented 6 years ago

@prolic While I could probably do that it feels more like a workaround than a proper solution.

alanbem commented 6 years ago

Well, imho aggregate does not exist until there is at least one event generated by it. So, maybe count events for aggregate or something (sorry, I am not prooph professional, just subscribed to it out of curiosity :) )

Does proophs' event storage allow to count events? It would be simple way to add exists($id) on aggregates repostories.

enumag commented 6 years ago

@alanbem No, they don't.

prolic commented 6 years ago

@enumag Do you have other use cases? If some import script that is running only once is the only use case, then we probably don't need that addition.

/cc @codeliner

enumag commented 6 years ago

@prolic Well I imagine it might also be a problem for some sagas that need to dispatch a lot of commands in sequence. In my case some monthly or yearly audits will certainly be a thing later on. Also it would remove the current problem of the GraphQL API where you can't use an ID of newly created aggregate in other mutations immediately because it's not yet propagated. GraphQL supports sending multiple mutations in one HTTP request but it wouldn't work with projection-based ID checks.

Right now I feel tempted to add a service that will simply use the same PDO connection as Prooph to check it directly in the database. It would be better to have a support for that in prooph though.

How do you handle ID existence in your applications?

prolic commented 6 years ago

Actually I never had a need to check for existence of an ID. I only have 2 cases: 1) I want to append some events to an aggregate, hence it exists or I send a 404 if not. 2) I want to create a new aggregate (due to uuid-usage, I don't need to check it)

enumag commented 6 years ago

What about cases when you need to add an aggregate that is somehow related to another aggregate and therefore holds its ID? I have such cases literally everywhere... Contract holds references to the contractors and the apartment the contract is for and also to accounting organization which holds some settings that are later used for accounting. I need to check all these ids to create the contract aggregate. Then there are lot of fees which need to be connected to a financial account but are sufficiently complex to be a separate aggregate.

prolic commented 6 years ago

You have those relationships, hence the aggregate does exists.

On Tue, Jun 19, 2018, 04:22 Jáchym Toušek notifications@github.com wrote:

What about cases when you need to add an aggregate that is somehow related to another aggregate and therefore holds its ID? I have such cases literally everywhere... Contract holds references to the contractors and the apartment the contract is for and also to accounting organization which holds some settings that are later used for accounting. Then there are lot of fees which need to be connected to a financial account but are sufficiently complex to be a separate aggregate.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prooph/event-sourcing/issues/79#issuecomment-398183196, or mute the thread https://github.com/notifications/unsubscribe-auth/AAYEvK2SCQhqSnDYday5biWRw6XJq3Qcks5t-AwbgaJpZM4UrZwi .

enumag commented 6 years ago

@prolic What? I don't think you understand... For example I get a request to create a new Contract aggregate. The request contains IDs of the contractors and ID of the apartment (for simplification, there are several more in reality). So before I can create the new Contract aggregate I need to check that there is a Contact aggregate in the database for both of the given contractor IDs and that there is a Apartment aggregate with the given apartment ID. I can't allow a contract with nonexistent IDs to be created, can I?

prolic commented 6 years ago

In this case you have to load the complete aggregate anyway, because it could have state = DELETED or any other case that forbids creation of a contract.

enumag commented 6 years ago

While that certainly applies in some cases, it doesn't actually matter to us most of the time. States that forbid new relations are quite rare in our domain.

prolic commented 6 years ago

ping @codeliner your thoughts?

enumag commented 6 years ago

By the way this leeds me to an off topic question. In proophesor aggregates never had any getters. Is it a good practice to not have getters on aggregates? If that's the case then even if load an aggregate I would not be able to check its state.

prolic commented 6 years ago

Have accessors only for what you want to expose, usually that's not everything.

Sascha-Oliver Prolic

2018-06-19 4:48 GMT+08:00 Jáchym Toušek notifications@github.com:

By the way this leeds me to an off topic question. In proophesor aggregates never had any getters. Is it a good practice to not have getters on aggregates? If that's the case then even if load an aggregate I would not be able to check its state.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/prooph/event-sourcing/issues/79#issuecomment-398190559, or mute the thread https://github.com/notifications/unsubscribe-auth/AAYEvIY9xsw98JHWvBzxVaOp1DRymlm7ks5t-BIWgaJpZM4UrZwi .

codeliner commented 6 years ago

sorry missed that discussion ....

Is it a good practice to not have getters on aggregates?

Yes! I know it is hard to achieve. The illusion of consistency pushes us in the direction of exposing aggregate state, because the aggregate is consistent and the read model is not. But when exposing aggregate state you couple your aggregate with services, command handlers, ... And sooner or later you're no longer able to refactor or remove the aggregate, because its getters are used everywhere in the codebase.

You have those relationships, hence the aggregate does exists.

Exactly! If you really want a system that processes thousands of commands in a couple of minutes you have to design your system in a "reactive way". See Reactive Manifesto Especially the Resilient section is important here:

I can't allow a contract with nonexistent IDs to be created, can I?

Why not? What happens in that case? Can the business deal with it? A technical exception rejecting the command is not useful. Why are wrong IDs referenced in the first place?

A resilient system accepts the command CreateContract and processes it. Then if you want to make sure that Contractor and Apartment are known in the system you either check that during read model projection or with a process manager. In both cases you access the read model to answer that question. Let's say Contractor cannot be found. First you can try again in a few minutes, maybe the read model is not up to date. If Contractor is still not in the read model you should send the Contract aggregate a command to mark the contract as invalid. You should also inform support and/or business about the issue so that it can be resolved manually.

I know this is a lot of stuff to code, but that's the way a reactive, elastic, resilient, message-driven system works. Such a system can be extremely robust and handle almost any issues without human intervention. The time you spent building it it saved in the long run when this system runs in production without or with very few issues.

In case you don't have the time for the solution described above:

Use EventStore::load() in your aggregate repository and limit the result to just one event of the aggregate (using MetadataMatcher). If one event is found your aggregate exists.

We can think about adding a EventStore::countEvents() method like suggested by @alanbem but that would be a BC break so only an idea for event-store v8.

enumag commented 6 years ago

@codeliner Thanks for the answer. You're mostly repeating what I previously read elsewhere. I definitely don't want to deny something I read from multiple sources written by people more experienced than I am. The thing is that whatever source I read so far none addressed my concerns about such approach. Can you answer them?

Yes! I know it is hard to achieve. The illusion of consistency pushes us in the direction of exposing aggregate state, because the aggregate is consistent and the read model is not. But when exposing aggregate state you couple your aggregate with services, command handlers, ... And sooner or later you're no longer able to refactor or remove the aggregate, because its getters are used everywhere in the codebase.

That's my concern exactly. Any tips how to solve it? You can't really sacrifice consistency, right?

Why not? What happens in that case? Can the business deal with it? A technical exception rejecting the command is not useful. Why are wrong IDs referenced in the first place?

Well the IDs are user input so I can't simply trust that they exist. And when it doesn't I need to tell that to the user and not just pretend that everything is ok. The user would be unaware that he did something wrong. So that exception is not only useful but necessary.

The system is an API so technical exception is just fine. The developer who works with the API will know what to do with it. In case it is a browser client it should translate the exception into something readable by the user (if the error is something he can fix).

A resilient system accepts the command CreateContract and processes it. Then if you want to make sure that Contractor and Apartment are known in the system you either check that during read model projection or with a process manager. In both cases you access the read model to answer that question. Let's say Contractor cannot be found. First you can try again in a few minutes, maybe the read model is not up to date. If Contractor is still not in the read model you should send the Contract aggregate a command to mark the contract as invalid. You should also inform support and/or business about the issue so that it can be resolved manually.

Disregarding the fact that it's a lot of code to write, the main reason why I dislike it is that the system can't respond to an invalid command with an error. Instead it handles it and only founds out later that the command was wrong. So firstly the user won't know something went wrong and secondly if someone was to send invalid commands one after another (intentionally or by accident) the system would have to save them instead of denying them which would quickly make the EventStore full of absolutely useless data.

fritz-gerneth commented 6 years ago

the main reason why I dislike it is that the system can't respond to an invalid command with an error. Commands do not respond at all.

Instead it handles it and only founds out later that the command was wrong. Which means the API user sending this command did not check any preconditions correctly.

So firstly the user won't know something went wrong You can always use a different communication channel or implement a general "error handling" process manager.

(intentionally or by accident) Intionally - no need to waste time with replying accident - Why did the UI allow this then?

As commands are a fire and forget message it is particulary important that and API user (e.g. UI) checks all preconditions as good as possible. If there is a command failing due to business rule constraits, that is usually a bug in the UI (or API user in general). It pushes a lot of responsibility towards the API user yet usually this is enough for almost most of the cases. The only thing you'll then need to be worried about (expect for technical issues) is true concurrency issues. Depending on your domain these are either so rare to accept these or you can detect and handle these specifically (e.g. send out an email .. ). If you are insisting on waiting for a positive reply from the command (e.g. has been found) you can that too, by checking the side-channel mentioned before (e.g. insert handled commands in a table). But this pretty much defeats the purpose of CQRS and you might just want to go with event sourcing only.

On the other hand, loading an AR in the command handler is not overly expensive (in particular if you are using snapshotters). We have a few commands that indeed require a similar behavior too so we went for this route:

Yet the two important properties remain as they are: we don;t rely on the AR state or return values from the command handler. (Which is pretty much the cheap solution to what @codeliner suggested)

enumag commented 6 years ago

@fritz-gerneth I was hoping for a response from @codeliner but ok, I'll react to your post.

So you ARE actually checking that the IDs are valid by loading the aggregates, right? You just don't return the errors through direct response but through a different channel. That is an important detail for me to note and I do see some value in that but doesn't really matter in the context of my concerns.

Anyway this confuses me even more. @codeliner said to only check if the ID is valid in process manager or projection - either way it is AFTER the event was stored to EventStore. You however are checking them in command handler, meaning BEFORE the event is stored in EventStore. Which approach is correct then?


@codeliner Could you respond to my post above please?

fritz-gerneth commented 6 years ago

Those are two things and should have made the limitations I imply more clearly:

enumag commented 6 years ago

Oh ok. Then I can ask you about the main problems.

Firstly what exactly should I do in the process manager if the ARs ID does not exist? How should I handle projections? I mean only events that were already validated by the process manager should be projected, right?

Secondly if I don't check the existence of the related ARs then I have no guarantee that those ARs will be created later. That makes such events pretty much invalid and even storing such events seems useless. ES could easily become full of invalid unneeded data. How should I deal with that?

Thirdly if ES contains such invalid data, can I even consider it to be the "source of truth"?


I think I need a better insight to your workflow regarding storing events, checking for related IDs and projecting those events.

fritz-gerneth commented 6 years ago

@codeliner 's point was: make error handling an explicit part of the business process. Ask yourself how this would be handled in a offline world:

1) someone is submitting a (paper-based) form to a contractor with all his personal information attached. 2) the contractor later on finds out the information provided is invalid (e.g. references an non-existent person, not attached) 3) The contractor calls the person to clarify the incorrect information etc... 4) If the contractor cannot reach the person they cancel the contract

As you can see there is a small process involved on how to handle incorrect data. Bring this into an application you can have similar. Don't treat those things as application errors. Those are business process errors and should be handled as such. In this scenario the error is resolved manually by the contractor.

Another scenario on this:

1) someone books a seat at a conference, that seat is reserved 2) later on it turns out the provided data is incorrect 3) the seat reservation is canceled automatically (e.g. by a regular check / process manager)

Again, initially the seat is booked without validation of the relationship. Thinking in strong-enforced relations between ARs is contra-productive. Explicit relationships between ARs in general.

Secondly if I don't check the existence of the related ARs then I have no guarantee that those ARs will be created later. That makes such events pretty much invalid and even storing such events seems useless.

There is a reason why it is called 'eventual consistent'. If you are looking for strong consistency either the architecture is the wrong one (sometimes you just need strong consistency) or your aggregate roots are mis-aligned (the more often scenario). Nevertheless, those are not useless events at all. By making error handling part of the business process those events have the same value as any other.

Secondly if I don't check the existence of the related ARs then I have no guarantee that those ARs will be created later.

If they have a strong dependency on those ARs your AR boundaries are wrong. Forget about relational database models and tables, about Objects in OOP, .. An aggregate root is much more, most of all a consistency boundary for a particular problem.

For the contract example above - if the personal information indeed must be consistent at the time of the form submission - don't reference it but make it an explicit part of the contract itself. You can find this in real life too:

Thirdly if ES contains such invalid data, can I even consider it to be the "source of truth"? Yes. Invalid references are not invalid data as explained.

Summary: relations between ARs = weak relations. if you need strong relations your AR boundaries are wrong. both have it benefits and downsides. what you need depends on the model.

codeliner commented 6 years ago

@enumag I did not answer because I have nothing to add to what @fritz-gerneth said. His answers are very detailed. Listen to him and try to follow the suggestions e.g. the comparison to a paper-based process.

enumag commented 6 years ago

@fritz-gerneth It all comes down to our ARs and bounded contexts being most likely wrong despite our efforts and refactorings. That's not really surprising considering that for most of the team it's a first time building a system like this. Changing error handling to what you described doesn't seem beneficial while the ARs are incorrect either.

The thing is that I have no idea how to do it better. We've read quite a few articles on the topic but it seems to be more experience-based problem. The old system with relational model has a lot of relations everywhere so it's difficult to tell where the separations should be made. And I honestly can't tell whether a given relation is "strong" or "weak" because there is no such distinction in normal relational database - just foreign keys everywhere.

Would you guys be willing to help me find the bounded contexts and aggregates if I describe the core of our model? Of course it's off topic in this issue - should I ask in proophsoftware/es-emergency-call for instance?

codeliner commented 6 years ago

Would you guys be willing to help me find the bounded contexts and aggregates if I describe the core of our model? Of course it's off topic in this issue - should I ask in proophsoftware/es-emergency-call for instance?

Yes, let's try that. But it might take some time. Anyway, it could be nice reference for other teams struggling with the same problem. And yes please ask in the emergency-call repo.

enumag commented 6 years ago

Alright then, thank you. I'm closing this issue and will open a new one in the emergency repo. It will take some time to write it down and I also have a vacation coming up so it might take a week or two for me to open the issue.

codeliner commented 6 years ago

If possible, attach a draw.io visualization. Looking forward to the issue