MateuszNaKodach / SelfImprovement

This project has some sample code for my personal learning purposes. Things I've learned are collected as issues here: https://github.com/nowakprojects/SelfImprovement/issues

Research: Put ValueObjects inside Domain Events (for Event Sourcing) or not? #5324

Open · MateuszNaKodach opened this issue 1 year ago


MateuszNaKodach commented 1 year ago

Don't use value objects in your domain events. The last point might require some elaboration. The Value Object pattern in DDD doesn't only require those objects to be immutable and implement equality by value. The main attribute of a value object is that it must be correct: you can try instantiating a value object with invalid arguments, but it will deny them. This characteristic alone forbids value objects from being used in domain events, as events must be unconditionally deserializable. No matter what logic your current domain model has, events from the past are equally valid today. By bringing value objects into domain events you make them prone to failure when their validity rules change, which might prevent them from being deserialized. As a result, your aggregates won't be able to restore their state from previously persisted events and nothing will work.

https://eventuous.dev/docs/domain/domain-events
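
A minimal TypeScript sketch of the failure mode Eventuous describes (all names and rules below are hypothetical, not code from the docs): a value object that validates on construction can reject data that was perfectly valid when the event was written.

```typescript
// Hypothetical value object that enforces today's validity rules on construction.
class CurrencyCode {
  private static readonly SUPPORTED = ["EUR", "USD"]; // say "PLN" was dropped in a later release

  constructor(readonly value: string) {
    if (!CurrencyCode.SUPPORTED.includes(value)) {
      throw new Error(`Unsupported currency: ${value}`);
    }
  }
}

// A domain event that embeds the value object.
class PaymentMade {
  constructor(readonly amount: number, readonly currency: CurrencyCode) {}
}

// Rehydrating an old event that was valid when stored...
function deserialize(json: string): PaymentMade {
  const raw = JSON.parse(json);
  // ...now throws, so the aggregate can never be rebuilt from its own history.
  return new PaymentMade(raw.amount, new CurrencyCode(raw.currency));
}

try {
  deserialize('{"amount": 100, "currency": "PLN"}');
} catch (e) {
  console.log((e as Error).message); // "Unsupported currency: PLN"
}
```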

MateuszNaKodach commented 1 year ago

(screenshot of the Google Groups thread linked below)

https://groups.google.com/g/dddcqrs/c/LNNIDqT7Kvc/m/Ifx6XKm-tKcJ

Greg Young said - don't do that.

MateuszNaKodach commented 1 year ago
I am skeptical that the code and process you describe above is generating enough business value to pay for itself. Working functionality is an asset, but lines of code are a liability. It seems like we spend our careers writing the same code again and again, doing push-ups shuffling data back and forth between one kind of object and another, between one tier and another. So I strongly resist any development mechanism that requires doing much of that by hand. Not so much on philosophical grounds (if creating this kind of duplication creates value, let's do it!), but because I feel that doing it leaves me vulnerable to being steamrolled by a competitor who finds a way not to.

A value object, that’s just a data type, that’s just a way of grouping fields semantically. There are issues including the ones brought up in that linked thread. If we choose to use value objects as part of your events, the exact shape of those value objects (the names and types of the fields) the come part of the “schema” of the event history, and therefore need to be immutable/versioned/casted just like events themselves. But is that worse than exploding value objects into yet another different set of classes to represent the same things, in the event schema?

In the example code snippet above, we just have to decide whether a Money is part of the event schema or not. If it is, treat it as such, and if we change its definition in the future we will need to write a way to cast it and so on. If it's not, then we need to do all the shuffling above every time we use one. Something like Money clearly falls on the "just use it" side of things, for me anyway. Your mileage may vary of course, depending on the likelihood of the shape of a Money changing in the future for the problem domain. If it's likely we will need to change that definition, then we're signed up to create a MoneyV2 or whatever, ouch.

I really like Greg’s notion of defining the event schema in some cross language form, then generating the definitions we need for whatever languages we are going to work with those events. I’ve done that myself spanning JavaScript and Java, and some other groupings that I’ve forgotten the details of from past projects. When defining such a schema, the value objects we plan to use as part of events belong in that same schema mechanism, while value objects that we're using only outside of the events don’t need to be there.

https://groups.google.com/g/dddcqrs/c/0Y_tGiIH7gI/m/mJ2qxakREwAJ
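
A sketch of what "treat Money as part of the event schema" means in practice (the shapes and names here are hypothetical): once the persisted shape of Money changes, old payloads need a cast forward, exactly as versioned events do.

```typescript
// Hypothetical: Money's shape became part of the event schema,
// so changing it requires a cast, just like versioning an event.

// V1 shape, as it appears in already-persisted events.
type MoneyV1 = { amount: number; currency: string };

// V2 shape: amount split into major/minor units for exactness.
type MoneyV2 = { units: number; cents: number; currency: string };

// The cast that every consumer of old events now needs.
function castMoney(old: MoneyV1): MoneyV2 {
  const units = Math.trunc(old.amount);
  return {
    units,
    cents: Math.round((old.amount - units) * 100),
    currency: old.currency,
  };
}

console.log(castMoney({ amount: 12.34, currency: "EUR" }));
// { units: 12, cents: 34, currency: "EUR" }
```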

MateuszNaKodach commented 1 year ago
One of the things that you are doing with events is organizing a bunch of information (in memory) that you are going to copy to somewhere else (persistence).
One of the things that Value Objects do -- because they are "objects" -- is encapsulate a data structure.
So ideally nobody other than the event itself should have to care about what the internal representation of the event's data structure is.
Ideally, changing the implementation details of that data structure should not impact any code outside of the module that is the event itself.
But for the event to be useful, as a source of information, you need some way to get the information back out of it.
And close to the boundary, that's going to need to be some general purpose representation of information, unless you are building bespoke magic all the way down.

ex: we can ask the event for a copy of its information, and get a byte buffer that contains a UTF-8 encoded JSON document, or whatever.
Riddle: how do we get the integer, which is the general purpose representation of a UserId in your model, into that UTF-8 encoded JSON document?
Related: when we are rehydrating the event from a general purpose representation, how do we get all the information back into its in memory representation? 

As far as I can tell, our in memory representations are transient; unload the process using in-memory-representation-v1, load the process using in-memory-representation-v2, and everything should be exactly the same as before.
In that context, you're really just looking to reduce the impact of changes to that in memory representation.
For example, if we later decide that the in memory representation of user Id should be a long, or a string, instead of an integer, how much code do we have to change?

So what are the right "what we wants" for your context?
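
A sketch of the encapsulation idea above (names hypothetical): the event hands out its information only in a general-purpose form, so changing the in-memory type of the user id touches only the event module.

```typescript
// Hypothetical: the event owns its in-memory representation and only
// exposes a general-purpose form (a UTF-8 encoded JSON byte buffer).

class UserRegistered {
  // In-memory representation: today a number; tomorrow maybe a string.
  constructor(private readonly userId: number) {}

  // "Ask the event for a copy of its information."
  toBytes(): Uint8Array {
    return new TextEncoder().encode(JSON.stringify({ userId: this.userId }));
  }

  // The rehydration riddle: only the event knows how to map the
  // general-purpose form back into its in-memory representation.
  static fromBytes(bytes: Uint8Array): UserRegistered {
    const raw = JSON.parse(new TextDecoder().decode(bytes));
    return new UserRegistered(raw.userId);
  }
}

// If userId becomes a string, only toBytes/fromBytes change; callers don't.
const bytes = new UserRegistered(42).toBytes();
const restored = UserRegistered.fromBytes(bytes);
```
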
MateuszNaKodach commented 1 year ago

https://discord.com/channels/514783899440775168/762671777372962837/1004530353463103508

Question: For those of you who've been doing this a while, do you prefer domain + DTO/document types for events, or just a single serialisable event type? I'm not after the generic Medium-post answer on the pros/cons of each, but whether you've had the supposed benefits/detriments actually materialise.

Always convert to DTOs. Will save you from problems as you need to make changes on your events. And it removes the need for fancy serializers.

I just wrote some conversions (or GitHub Copilot did ❤️) and I'm doing a quick sanity check. I hardly ever see examples doing so, and I don't think I've ever seen good examples of testing conversions (with FsCheck it's pretty painless, especially with an F# DU).

Yes, with FsCheck you can test a single roundtrip property: `let shouldRoundtrip x = (x |> toDTO |> serialize |> deserialize |> ofDTO) = x`

Personally, I don't mind using serialisable DTOs for events; I find the need for custom serialisation as a default tedious. I think it's a matter of personal preferences and the tradeoffs we prefer to take.

That's because you're probably not modeling the events with e.g. ValueObjects as opposed to primitives of the programming language / base library. The need becomes more apparent that way. But I suspect it's also bound to the programming language - e.g. I've often seen it used with F# but not so much with C#.

Indeed, there are various factors feeding into that. Of course, I sometimes go with explicit (de)serialisation, but typically I'm doing that when I update my event model. Of course, there's a risk that someone accidentally changes a namespace or type definition and the whole Tower of Babel falls, but as you said, that depends on the tooling. Plus, no matter which approach someone chooses, I recommend using contract tests to catch such drift.

Yes it limits you on the types you can use, for instance sets, primitive wrapper types, unions etc.

Indeed. For instance, in TypeScript probably the explicit serialise/deserialise may be also a preferred way because of the limitations of the JS serialisation (e.g. BigInts, Dates, etc.)

@yreynhout I normally consider using VOs in contract objects (commands and events) an anti-pattern, due to serialisation issues as well as the "things change" issue with constructor constraints. Do you have a different experience with that? I've had a couple of heated discussions about this, but in general people agree.
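
The FsCheck roundtrip property mentioned above, sketched in TypeScript with the fast-check property-testing library (the event, DTO, and mapper names are hypothetical):

```typescript
import * as fc from "fast-check";

// Hypothetical domain event and its flat, serializable DTO.
type PaymentMade = { amount: bigint; occurredAt: Date };
type PaymentMadeDto = { amount: string; occurredAt: string };

const toDTO = (e: PaymentMade): PaymentMadeDto => ({
  amount: e.amount.toString(),
  occurredAt: e.occurredAt.toISOString(),
});

const ofDTO = (d: PaymentMadeDto): PaymentMade => ({
  amount: BigInt(d.amount),
  occurredAt: new Date(d.occurredAt),
});

// Roundtrip property: domain -> DTO -> JSON -> DTO -> domain is lossless.
fc.assert(
  fc.property(fc.bigInt(), fc.integer(), (amount, epochMs) => {
    const event: PaymentMade = { amount, occurredAt: new Date(epochMs) };
    const result = ofDTO(JSON.parse(JSON.stringify(toDTO(event))));
    return (
      result.amount === event.amount &&
      result.occurredAt.getTime() === event.occurredAt.getTime()
    );
  })
);
```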

MateuszNaKodach commented 1 year ago

Follow up:

I may not be explaining myself well enough then. What I mean is that there are 2 representations of an event, one in the domain model and one in what I call the persistence model. The domain model representation of an event does not concern itself with versioning (no suffixes, nor does it care about previous versions as long as things are backwards compatible), actively uses any value objects to cut down on the translation tax from primitives to value objects during replay (it just shifted some place else), does not concern itself with being encoding / serialization friendly nor what the encoding even is. The persistence model representation of an event does concern itself with versioning, only uses data types that make sense for the target encoding, and is encoding / serialization friendly. Upon reading we map from the persistence to the domain model representation, upon appending we map from the domain to the persistence model representation. Translation is thus confined to the place where I/O happens. I realize this is not everybody's cup of tea and some may even wonder what the ROI is (which usually only manifests itself once we run into versioning issues). This should not be confused with public and private events mapping for integration / insulation purposes - effectively it's two representations of a private event. 
Yeah, in C# it might not be so visible, but for instance in the ESDB NodeJS client you need to use primitives like string instead of Date or bigint, because the default JSON.parse doesn't handle them properly.
So having stringly typed events in the domain is not great.
Not to mention more advanced typing.
So the choice is to
- live with that,
- write something generic, 
- or do explicit mapping
Yup, I still have to fight people that think teaching reflection based serializers how to deal with custom types is not the same trouble we had with ORMs (albeit of a smaller scale). 
In C# it might not be as visible, as the popular serialisers are rich in the sense that they're not only doing (de)serialisation but also mapping.
I'm fine with using primitives in events or with using such richer serialisers, as long as we know the tradeoff we're making by selecting them.
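
A quick illustration of the Node.js limitation mentioned above: JSON.stringify throws on bigint, and JSON.parse leaves ISO dates as plain strings, which is what pushes TypeScript users toward explicit mapping.

```typescript
// Why default JSON handling pushes TypeScript users toward explicit mapping.
try {
  JSON.stringify({ amount: 10n });
} catch (e) {
  console.log((e as Error).message); // "Do not know how to serialize a BigInt"
}

// JSON.parse has no idea a string is a date; it stays stringly typed.
const parsed = JSON.parse('{"occurredAt":"2023-05-13T01:55:51.000Z"}');
console.log(parsed.occurredAt instanceof Date); // false
console.log(typeof parsed.occurredAt);          // "string"
```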

Alexey: I have never seen an implementation like that, would very much like to look at the code

Bartelink: The bottom line is that the persisted events will outlive any individual bit of code in any language. The persistent form of an event represents a contract. Contracts that rely on implicit magic are a pain for someone unfamiliar with the system, whether they're coming in fresh or there's troubleshooting to be done. If it makes sense for you to skip a layer of mapping and borrow the definition, and/or put in some magic to make the contract form palatable in a Domain Model, fine - but not at the cost of making the event contract messy and/or having strange dependencies. That's not dissimilar to how one might whip up something that passes the domain model types out to a view model with some implicit transformation making them palatable (an AutoMapping, if you will) - that's fine for a while, but the Event Contract and the View contract are things that people need to be able to communicate/understand, whereas the Domain Model is in the middle and a moving target.

(An F# example being that tuples and SCDUs (single-case discriminated unions) have no place in event contracts, whereas UMX-tagged primitives are far more debatable.)

MateuszNaKodach commented 1 year ago

thinkbeforecoding — 08/05/2022 11:45 AM I typically recommend what @yreynhout said. There is another possibility: write serialization/deserialization functions directly using a reader/writer. It removes the instantiation of intermediate DTO objects. It's also safer in languages where all fields are required (as you add a field to your domain event, the compiler complains), and it is easier in functional languages where you can use combinators.

Oskar Dudycz — 08/05/2022 11:47 AM "Write serialization/deserialization functions directly using reader/writer." That's one of the options, but it has a significant issue: it's stringly typed and tedious, so prone to copy/paste bugs.

thinkbeforecoding — 08/05/2022 11:47 AM So instead of event -convert-> DTO -serialize-> bytes, you have event -hand serialize-> bytes. @Oskar Dudycz not with strongly typed functional combinators.

Oskar Dudycz — 08/05/2022 11:48 AM Example of what I mean: https://github.com/oskardudycz/EventSourcing.NetCore/blob/main/Sample/EventsVersioning/EventsVersioning.Tests/Transformations/MultipleTransformationsWithDifferentEventTypes.cs#L34
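
A rough TypeScript analogue of the reader/writer idea (the event shape and function names are hypothetical; F# combinators would make this terser and safer):

```typescript
// Hypothetical hand-written (de)serialization without an intermediate
// DTO object: each event type gets a writer and a reader function.

type OrderPlaced = { orderId: string; placedAt: Date };

const writeOrderPlaced = (e: OrderPlaced): Uint8Array =>
  new TextEncoder().encode(
    JSON.stringify({ orderId: e.orderId, placedAt: e.placedAt.toISOString() })
  );

const readOrderPlaced = (bytes: Uint8Array): OrderPlaced => {
  const raw = JSON.parse(new TextDecoder().decode(bytes));
  // Field access here is "stringly typed": renaming a field is exactly
  // the copy/paste hazard Oskar points out above.
  return { orderId: raw.orderId, placedAt: new Date(raw.placedAt) };
};
```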

MateuszNaKodach commented 1 year ago

yreynhout — 08/05/2022 6:25 PM I've done this in the past, where I'd read a namespace or version as one of the first few properties and internally switched the reader behavior; writer behavior was just mutated, since I didn't want the ability to write older versions.

Savvas Kleanthous — 08/05/2022 8:26 PM I've done that too and I like this approach, although I use metadata to store version and namespace/type (amongst other info). I can make decisions on how to process the event while treating the event as a simple blob. I find that really useful because it allows simple deserialization for some contracts, and I can override when things become more complicated to have a custom deserializer.

Alexey Zimarev — 08/06/2022 12:51 PM Don't see how it will work, say, with Protobuf when the model class is generated by protoc.

thinkbeforecoding — 08/06/2022 1:52 PM I've done this with Protobuf; I then check that the hand-made serialization round-trips with the model generated by protoc, using property-based testing. My impl is 5x faster 😁
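
A sketch of the version-switching reader yreynhout and Savvas describe above (all shapes hypothetical): the stored blob carries a version in metadata, the reader dispatches on it, and the writer only ever emits the latest version.

```typescript
// Hypothetical: version lives in metadata, the reader switches on it,
// and the writer only ever writes the newest schema.

type StoredEvent = { metadata: { type: string; version: number }; data: string };
type OrderPlaced = { orderId: string; currency: string };

function readOrderPlaced(stored: StoredEvent): OrderPlaced {
  const raw = JSON.parse(stored.data);
  switch (stored.metadata.version) {
    case 1: // v1 had no currency field; default it on read
      return { orderId: raw.orderId, currency: "EUR" };
    case 2:
      return { orderId: raw.orderId, currency: raw.currency };
    default:
      throw new Error(`Unknown version: ${stored.metadata.version}`);
  }
}

// Writer behavior is "just mutated": only v2 is ever written.
const writeOrderPlaced = (e: OrderPlaced): StoredEvent => ({
  metadata: { type: "OrderPlaced", version: 2 },
  data: JSON.stringify(e),
});
```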

MateuszNaKodach commented 1 year ago

In commands, no - Vernon doesn't use them: https://github.com/VaughnVernon/IDDD_Samples/blob/master/iddd_identityaccess/src/main/java/com/saasovation/identityaccess/application/IdentityApplicationService.java

https://stackoverflow.com/questions/56059592/use-value-object-in-command-and-event

MateuszNaKodach commented 1 year ago

Overall, this post actually speaks to me:

Why we Avoid Putting Value Objects in Events https://buildplease.com/pages/vos-in-events/

From a technical-only perspective, yes, I see what you're getting at. But a Value Object that is always valid when serialized into primitive objects is the constraint here. From a Domain Modeling perspective, if the purpose of a Value Object is to protect invariants (with guard clauses, etc.), then by putting them in your Event you are saying that the Event also protects invariants, which is not the responsibility of an Event. The Event represents a fact in history that occurred (and whose invariants had already been checked as valid). I don't see anything wrong with grouping data within an Event, i.e. having classes that change over time and dealing with that in your serialization strategy. But what you're communicating when making an Event dependent on a true Value Object is different - you are saying that the Event needs to be refactored along with the Value Object, which violates the immutability of the Event. If you're sticking to the rule of immutable Events, then for every refactoring of said Value Object you would need an Event_v2, Event_v3, etc.
MateuszNaKodach commented 1 year ago

Leaking Value Objects from your Domain https://codeopinion.com/leaking-value-objects-from-your-domain/

He treats them as part of the contract; as for me, inside a boundary they can be fine.

MateuszNaKodach commented 1 year ago

CodeOpinion: https://codeopinion.com/leaking-value-objects-from-your-domain/

And my comments under the video. An interesting use for notifications. Also, serialization gets more complicated if I'm using some out-of-process bus.

MateuszNaKodach commented 1 year ago

What kind of serialization is meant here, given that this is exactly why Alexey is against VOs in events?

MateuszNaKodach commented 1 year ago

From domain events to infrastructure - thinking out loud about possible approaches I don’t hate

https://blog.codingmilitia.com/2023/05/16/from-domain-events-to-infrastructure-thinking-out-loud-about-possible-approaches-i-dont-hate/

MateuszNaKodach commented 1 year ago

Poor data type choices for information carrying — The domain event should carry information via [primitive data types](https://condor.depaul.edu/sjost/nwdp/notes/cs1/CSDatatypes.htm) only. If the event had [value objects](https://martinfowler.com/bliki/ValueObject.html) that reside in the originating domain as part of its payload, those objects may very well not exist in the consuming domain. This makes the process of deserializing the event problematic, and promotes domain logic leakage (Avoid this!!!)

https://www.ledjonbehluli.com/posts/domain_to_integration_event/
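
A sketch of the rule quoted above (names hypothetical): the integration event crossing the context boundary carries primitives only, with the domain's value objects unwrapped at the edge.

```typescript
// Hypothetical: value objects stay inside the domain; the integration
// event published to other contexts carries primitives only.

class Money {
  constructor(readonly amount: number, readonly currency: string) {
    if (amount < 0) throw new Error("Amount must be non-negative");
  }
}

// Domain event: free to use value objects internally.
type PaymentMade = { paymentId: string; price: Money };

// Integration event: primitives only, so any consumer can deserialize it
// without knowing (or depending on) the originating domain's types.
type PaymentMadeIntegrationEvent = {
  paymentId: string;
  priceAmount: number;
  priceCurrency: string;
};

const toIntegrationEvent = (e: PaymentMade): PaymentMadeIntegrationEvent => ({
  paymentId: e.paymentId,
  priceAmount: e.price.amount,
  priceCurrency: e.price.currency,
});
```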