Closed berkes closed 4 years ago
Should I keep a separate projection where the Aggregate can check for duplicates?
The way you currently have this modelled, that sounds like the way to go; this is often referred to as a “command side projection”. You will, however, have a race condition: if the command side projector isn't fully up to date when a command comes in, you can get duplicates.
This can be handled in a few ways. One is what you've described with your projector: the projector can handle removing duplicates. Another option is to have a reactor keep an eye out for duplicates and emit compensating events, `PlaceDuplicateDetected` for instance. Other projectors can then handle that event as they see fit, removing the duplicate place for instance.
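As a rough illustration of the reactor option, here is a minimal, framework-free sketch. The `Event` struct, the `DuplicatePlaceReactor` class and its `process` method are assumptions made up for this example; they are not the event_sourcery API, and a real reactor would persist its state rather than keep it in memory.

```ruby
# Hypothetical event shape for illustration only (not the event_sourcery API).
Event = Struct.new(:type, :aggregate_id, :body, keyword_init: true)

# Watches PlaceAdded events and returns a compensating
# PlaceDuplicateDetected event when a place_id was already seen.
class DuplicatePlaceReactor
  def initialize
    @first_seen = {} # place_id => aggregate_id of the first PlaceAdded
  end

  # Returns the list of events to emit in reaction to `event`.
  def process(event)
    return [] unless event.type == "PlaceAdded"

    place_id = event.body.fetch("place_id")
    original = @first_seen[place_id]
    if original.nil?
      @first_seen[place_id] = event.aggregate_id
      return []
    end

    [Event.new(
      type: "PlaceDuplicateDetected",
      aggregate_id: event.aggregate_id,
      body: { "place_id" => place_id, "original_aggregate_id" => original }
    )]
  end
end

reactor = DuplicatePlaceReactor.new
body = { "place_id" => "MXQ4+M5;historic:monument;statueofliberty" }
reactor.process(Event.new(type: "PlaceAdded", aggregate_id: "a1", body: body)) # emits nothing
reactor.process(Event.new(type: "PlaceAdded", aggregate_id: "a2", body: body)) # emits one compensating event
```

Downstream projectors would then subscribe to `PlaceDuplicateDetected` and decide for themselves whether to delete, merge or flag the duplicate row.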
Taking a step back for a second, how does `Place` work? You haven't mentioned `aggregate_id` at all. What does one `Place` represent? Is a single `Place` meant to contain multiple `PlaceAdded` events? If a `Place` is only meant to have a single `PlaceAdded` event then you might be able to use the `aggregate_id` to handle this.
If `place_id_builder.id` could be generated by the client before the command is sent, you could have the aggregate ensure that it's only got a single `PlaceAdded` event. This removes the need for the command side projection entirely and therefore removes the race condition. You could potentially use a UUID v5 to turn `place_id_builder.id` into a valid UUID.
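To make the "aggregate ensures a single `PlaceAdded`" idea concrete, here is a minimal sketch in plain Ruby. `PlaceAggregate`, `PlaceAlreadyAdded` and the hash-based event shape are hypothetical names for illustration and are not taken from event_sourcery. The sketch also assumes the event store rejects concurrent writes to the same aggregate (for example via an expected-version check); without that, two simultaneous commands could still both succeed.

```ruby
# Hypothetical, framework-free sketch; all names are assumptions.
class PlaceAlreadyAdded < StandardError; end

class PlaceAggregate
  attr_reader :id, :changes

  # `history` is the list of past events for this aggregate id.
  def initialize(id, history = [])
    @id = id
    @added = false
    @changes = []
    history.each { |event| apply(event) }
  end

  # Command method: refuses to add the same place twice.
  def add(attributes)
    raise PlaceAlreadyAdded, "place #{id} already exists" if @added

    event = { type: "PlaceAdded", aggregate_id: id, body: attributes }
    apply(event)
    @changes << event
    event
  end

  private

  # Rebuilds state from an event; only tracks whether the place exists.
  def apply(event)
    @added = true if event[:type] == "PlaceAdded"
  end
end

place = PlaceAggregate.new("some-deterministic-uuid")
place.add(name: "Statue of Liberty")
begin
  place.add(name: "Statue of Liberty")
rescue PlaceAlreadyAdded => e
  # The command handler can turn this into a synchronous validation error.
  puts e.message
end
```

Because the check happens while handling the command, the caller can be told about the duplicate immediately, which the asynchronous projector approach cannot do.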
Thanks for your reply!
The UUIDv5 was unknown to me and it looks like the perfect solution here.
I could not use UUIDv4 (`SecureRandom.uuid`) since that is random and unrelated to the set of attributes that make a Place unique^1. But encoding the string in a UUIDv5 (`Digest::UUID.uuid_v5`) seems to work.
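For readers without ActiveSupport (which is where `Digest::UUID.uuid_v5` comes from), the following is a small stdlib-only sketch of what a name-based UUID v5 does: SHA-1 over a namespace UUID plus the name, with the version and variant bits set. The `uuid_v5` helper and the choice of `PLACE_NAMESPACE` are assumptions for this example; any fixed namespace UUID works as long as it never changes.

```ruby
require "digest/sha1"

# Assumption: the RFC 4122 URL namespace is used as the fixed namespace here.
PLACE_NAMESPACE = "6ba7b811-9dad-11d1-80b4-00c04fd430c8"

# Derives a deterministic UUID v5 from a namespace UUID and a name string.
def uuid_v5(namespace, name)
  ns_bytes = [namespace.delete("-")].pack("H*")       # 16 raw namespace bytes
  bytes = Digest::SHA1.digest(ns_bytes + name.b).bytes.first(16)
  bytes[6] = (bytes[6] & 0x0f) | 0x50                 # set version 5
  bytes[8] = (bytes[8] & 0x3f) | 0x80                 # set RFC 4122 variant
  hex = bytes.pack("C*").unpack1("H*")
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join("-")
end

monument = uuid_v5(PLACE_NAMESPACE, "MXQ4+M5;historic:monument;statueofliberty")
museum   = uuid_v5(PLACE_NAMESPACE, "MXQ4+M5;shop:museum;statueofliberty")

# The same input always yields the same id; different inputs yield different ids.
puts monument == uuid_v5(PLACE_NAMESPACE, "MXQ4+M5;historic:monument;statueofliberty")
puts monument == museum
```

The resulting value can then serve as the `aggregate_id`, so two commands describing the same place end up targeting the same aggregate.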
But, to summarize your feedback for future readers, I understand the "validate uniqueness" has several solutions, based on your (domain) needs.
[…] `PlaceCreated` or `PlaceDuplicateDetected` event.
2.2. The projector determines at projection time whether it is a duplicate and then either issues a `PlaceDuplicateDetected` event or proceeds to insert the data into a projection. (This is what I have at the moment.)
[…] `uuid_v5` method as described above.
Each has its use cases and its pros and cons. It really depends on the case at hand which one is more apt for your specific scenario, though. E.g. registering a new user, where you must give feedback on whether or not the user-provided email already exists, would probably be handled best with case 3. But a subscription to a newsletter, where a duplicate email address is of far less importance, would probably be best suited to case 2, in which the reactors or processors handle the duplicates asynchronously.
^1: A place's uniqueness is a combination of its normalised name, the area (lat/lon + radius) and its category. E.g. `MXQ4+M5;historic:monument;statueofliberty != MXQ4+M5;shop:museum;statueofliberty`: a complex "domain" problem, really. :)
I hope it is OK to answer questions about usage and general patterns here.
I have events that process GeoJSON `places`. It needs to avoid creating "duplicates" based on attributes. The exact details are domain-specific, so a simplified example is given below.
Here, the `place_id_builder.id` generates a String (not a UUID!) for a place based on heuristics such as the name, the kind of place, and the location (lat/lon). For the sake of the example, one can imagine a `UserRegistered` that must check for already existing `email` attributes, for example.
I don't see any API to find all `PlaceAdded` events where `body.place_id = place_id`. I'm not certain that searching previous events for duplicate parameters is the right pattern at all.
Currently, I've implemented this in the projector, where it simply avoids inserting a record into the query database on duplicates.
The "places" projector now does:
Note that this can probably be implemented more cleanly with a `catch` on Postgres unique constraint errors.
Also, this seems clumsy if, in case of duplication, you want to emit another event, e.g. a `DuplicatePlaceIgnored`.
This works. But the asynchronous nature prohibits me from sending the client back an HTTP or some other error. In my specific example, I'm fine with that, but in the example where an email must be unique, one would probably want to convey this to the user with a proper error/validation message.
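The unique-constraint idea mentioned above could look roughly like the sketch below. It is not the projector from this issue: `PlacesTable` is a hypothetical in-memory stand-in for a Postgres table with a unique index on `place_id`, and `UniqueViolation` stands in for the database error (with Sequel this would be `Sequel::UniqueConstraintViolation`).

```ruby
# Stand-in for a database unique-constraint error.
class UniqueViolation < StandardError; end

# Hypothetical in-memory table with a unique index on place_id.
class PlacesTable
  def initialize
    @rows = {}
  end

  def insert(row)
    key = row.fetch(:place_id)
    raise UniqueViolation, "duplicate place_id #{key}" if @rows.key?(key)

    @rows[key] = row
  end

  def count
    @rows.size
  end
end

class PlacesProjector
  attr_reader :ignored

  def initialize(table)
    @table = table
    @ignored = []
  end

  # Inserts the place and lets the unique index detect duplicates.
  def project_place_added(event)
    @table.insert(place_id: event[:body][:place_id], name: event[:body][:name])
  rescue UniqueViolation
    # Duplicate: skip the insert and remember the id, so something like a
    # DuplicatePlaceIgnored event could be emitted by another component.
    @ignored << event[:body][:place_id]
  end
end

table = PlacesTable.new
projector = PlacesProjector.new(table)
event = { type: "PlaceAdded", body: { place_id: "MXQ4+M5;historic:monument;statueofliberty", name: "Statue of Liberty" } }
projector.project_place_added(event)
projector.project_place_added(event) # second insert is ignored
```

Letting the database enforce uniqueness avoids a separate "does it exist?" query, but it does not change the asynchronous nature of the projector: the client still gets no immediate error.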
How is this typically achieved? And how can `event_sourcery` help here? Should I keep a separate projection where the Aggregate can check for duplicates? Should aggregates know about projections at all? Should I search through past events instead? And if so, how do I achieve this with the setup of event-sourcery where event-bodies are un-indexed JSON "blobs"?
Stack Overflow has an interesting answer on a similar case as well, where the command is the one doing the checks.
https://stackoverflow.com/a/43613564/73673