Aggregate Root defaults correlation_id to uuid

orien commented 7 years ago

I propose that if a correlation_id is not provided to an event applied on an aggregate the new event's uuid will be used in its place. I think this forms a sensible default. Individual systems remain free to provide a correlation_id based on web request/session id or the like.

andrewgr commented 7 years ago

Hi @orien. Using event uuid as a default value for correlation_id introduces coupling between those two properties. On the one hand, I'm not sure it is a problem. On the other hand, is this coupling necessary? Maybe the default value for correlation_id can just be another uuid v4 generated by SecureRandom.uuid?

stevehodgkiss commented 7 years ago

Is this better than a default of nil? A correlation ID is meant to correlate with something, and if there's nothing else capturing the ID then I'm not sure it's worth capturing anything... I think I'd prefer to have a nil value over repeating the events uuid.

grassdog commented 7 years ago

Advantage of defaulting it to a value is that you do have something to trace back with in the event you need to. If we default to nil we won't have this.

I don't feel strongly either way on whether it should be the originating event UUID or a new UUID.

stevehodgkiss commented 7 years ago

something to trace back with

Could you elaborate this scenario further @grassdog (maybe an example)? In the cause of causation_id, event.uuid would be the value used to trace back.

grassdog commented 7 years ago

So say I have some event in the store that was emitted by some reactor and I'd like to get a view on how it got there you could query the event store with the correlation_id on that event (which after this change should be populated with a value) and see all of the related events.

You can definitely use causation_id and trace back a piece at a time also.

Main thing to pick apart from me is that correlation_id in this context is primarily for finding all events that belong to a single "process".

Does that make sense?

stevehodgkiss commented 7 years ago

Yeah I see what you mean. The reactor would need to ensure it carries across the correlation_id when emitting events for that to work and it would need a value in correlation_id to begin with.

grassdog commented 7 years ago

Yep, definitely. It'd be nice if we could ensure that is done on all events emitted to the store in ES 🤔

stevehodgkiss commented 7 years ago

Taking a step back, I'm not sure correlation_id is something an aggregate should be concerning itself with. I imagined this would be set in the command handler/repository with something like repository.save(aggregate, correlation_id: blah). Thoughts?

stevehodgkiss commented 7 years ago

In thinking about issue https://github.com/envato/event_sourcery/issues/139 I imagined the aggregate only dealing with the Event body essentially, not having access to correlation_id's or anything that's only available after persisting (sequence ID, created_at).

grassdog commented 7 years ago

Yeah, moving it to the repository probably makes more sense. It does feel more like plumbing to me.

I think it is worth considering the other use case for correlation IDs that we often see. That is when clients need to provide a correlation ID for the purpose of identifying work they've already processed I'd see that as a separate field inside of the event body and something that the aggregate would care about but not something that ES itself would propagate around the place. Does that make sense?

andrewgr commented 7 years ago

Actually, @stevehodgkiss raised an important point. I also think that aggregate shouldn't be concerned with correlation_id since correlation_id doesn't carry any info about the aggregate state.

I also agree that Repository is a better place to set correlation_id in a way that is completely independent from any aggregate concerns.

A couple of months ago @stevehodgkiss and I discussed an implementation of a "smart event sink" (in the current implementation of that would be Repository) that could be configured with correlation_id and/or causation_id and any other ids to use those ids to set them on events it persists. Something like this:

event_source = EventSource.new
event_sink = EventSink.new(correlation_id: '123')
repostory = Repository.new(event_source, event_sink)
repostory.save(aggregate.history)

stevehodgkiss commented 7 years ago

@grassdog for that case I think it'd better to use the correlation_id column over a specific field in the event body. Client gem's can provide a correlation_id and we'd use that value in the correlation_id column & populate the projection's correlation_id from it.

stevehodgkiss commented 7 years ago

@andrewgr I'm not sure EventSink is the place for that. I think of an event sink/source as more of a concept or label for an event store and they wouldn't have any additional code like this, it's more enforcing that this event store is where I read from and this is where I write to. Perhaps "unit of work" would be a better name for that, not sure about that though.

andrewgr commented 7 years ago

@stevehodgkiss yeah, event sink doesn't have to be a separate class. Having an event store to read from and to write to is just fine for us.

grassdog commented 7 years ago

Yep, so I vote for the repository as the place to set and propagate the correlation ID. It should be settable by clients in case they'd like to use it to track work they've already done.

orien commented 7 years ago

Thanks for you thoughts on this folks. Closing…

envato / event_sourcery

Aggregate Root defaults correlation_id to uuid #171