rbanks54 / microcafe

Sample code to show microservices using .NET, in the context of a cafe

Snapshotting with event store #6

Open rbanks54 opened 8 years ago

rbanks54 commented 8 years ago

In the domain services, show how snapshotting can be used to avoid long rebuilds of model state.

For the sample we might take a snapshot every 5 events.

dasiths commented 7 years ago

Would the best design practice be to use a different storage provider for snapshots (like Redis) that caters for quick lookup by key?

rbanks54 commented 7 years ago

Not really. It's better to keep using EventStore as a single repository, with a separate stream for the snapshots.

i.e. if we have an event stream for a domain object such as product-xyz, we could have a second stream named product-xyz-snapshot.

In the snapshot stream we store the current state of the domain object and the version of the event stream at the time the snapshot was made.

To rehydrate objects we:

Does that make sense?

dasiths commented 7 years ago

Yes it does make sense.

I've implemented my SnapshotStorageProvider in a very similar way. https://github.com/dasiths/NEventLite/blob/master/NEventLite%20Storage%20Providers/EventStore/EventstoreSnapshotStorageProvider.cs

But I replaced it with Redis and got better read times. The only drawback with the Redis cache was how my implementation always overwrote the last entry with the new one. I wasn't worried about contention as I always stored the version number too. Is there a specific need for storing past snapshots? Is that for cases when we have to rebuild to a state faster at a specific past date/time?

rbanks54 commented 7 years ago

Having the snapshots and event streams stored in different places means you have to think about transactional boundaries or accept potential data loss. Of course, the loss of a snapshot isn't a problem, really, given it's just another read model and thus easy to rebuild. Plus the next event for the domain entity would trigger the creation of a new snapshot.

The other thing to consider is when rebuilding state you query two separate data sources, so your code will possibly be a little more complex, and you're making two database connections instead of one.

Yes, multiple snapshots make it easier to calculate state at a point in time, but there aren't many domains where that ability is a feature, so it might not be necessary for you. If you do need it, you could still support it with Redis by storing a SET (i.e. collection) of snapshots related to a domain entity.
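As a rough illustration of why keeping several snapshots helps with point-in-time state, here is a minimal Python sketch. It is not code from this repo or from NEventLite: the `Snapshot` type, the `latest_snapshot_at_or_before` helper, and the plain list standing in for the Redis collection are all assumptions made for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    version: int  # event stream version at the time the snapshot was taken
    state: dict   # illustrative serialised aggregate state


def latest_snapshot_at_or_before(snapshots, target_version):
    """Return the newest snapshot with version <= target_version, or None.

    Assumes `snapshots` is sorted by ascending version.
    """
    versions = [s.version for s in snapshots]
    idx = bisect_right(versions, target_version)
    return snapshots[idx - 1] if idx else None


# Hypothetical collection of snapshots kept for one domain entity.
snapshots = [
    Snapshot(5, {"stock": 10}),
    Snapshot(10, {"stock": 7}),
    Snapshot(15, {"stock": 12}),
]

# To get state as at version 12, start from the snapshot at version 10
# and replay only events 11 and 12 on top of it.
start = latest_snapshot_at_or_before(snapshots, 12)
```

If only the latest snapshot is kept, the same query has to replay from the start of the event stream instead.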

feanz commented 7 years ago

Could you not just add snapshots to the current stream as an event? You're going to have to come up with some sort of snapshot event anyway, I assume, whether you use separate streams or a single stream. Then you would just read back to the last snapshot and replay the events recorded after it over the top.

What do you think?

dasiths commented 7 years ago

This introduces contention issues afaik. Calculating the snapshot and storing it in the right place in the stream requires locking. Using a separate stream, we just store the snapshot with its version number and then read all events from that number forward to rehydrate. This way the second stream (the snapshot stream) doesn't suffer from contention issues/locks.

feanz commented 7 years ago

Cool, thanks for the info. So I guess the million-dollar question is when and how you take snapshots: as an out-of-band batch process, or inline with updates to the domain model? As with most things, I'm guessing the answer is "it depends" 😄

dasiths commented 7 years ago

Have a look here if you're keen https://dasith.me/2016/12/31/event-sourcing-examined-part-2-of-3/

feanz commented 7 years ago

Thanks man very useful info.

rbanks54 commented 7 years ago

@feanz @dasiths If I assume "current stream" means "aggregate root's stream", then snapshotting that way is a bad idea for a few reasons:

  1. Snapshotting is an optimisation for the application, not a domain event for the aggregate root object. It doesn't belong in the aggregate's event stream.
  2. If I want to rebuild my aggregate from the last snapshot, I have to scan backwards through all the events until I find the last snapshot, then replay forward again.
  3. If I wanted to replay all the events for an aggregate root (bug fixes, new read models, etc) then I need to ensure I skip the snapshots during replay.
  4. Let's say there's a bug fix with event state that caused snapshots with incorrect state to be saved. I can't fix those snapshots in the event stream because event streams are immutable. I'd have to dump the existing stream and recreate it from scratch, recreating new snapshots as I go.

The best approach is to store a separate snapshot stream for an aggregate root, similar to an event stream. Rebuilding state simply involves reading the last snapshot in the snapshot stream and then reading the tail of the event stream to get the events that have occurred since that snapshot.
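A minimal Python sketch of that rebuild, assuming an in-memory dict of lists as a stand-in for the store. The `Product` aggregate, the event tuples, and the `rehydrate` helper are hypothetical names for illustration, not this repo's code or the EventStore client API.

```python
from dataclasses import dataclass


@dataclass
class Product:
    stock: int = 0
    version: int = -1  # version of the last applied event; -1 means none yet

    def apply(self, event):
        kind, qty = event
        if kind == "stock_added":
            self.stock += qty
        elif kind == "stock_removed":
            self.stock -= qty
        self.version += 1


def rehydrate(streams, stream_name):
    """Rebuild state from the last snapshot plus the events recorded after it."""
    product = Product()
    snapshots = streams.get(f"{stream_name}-snapshot", [])
    if snapshots:
        last = snapshots[-1]  # only the most recent snapshot is needed
        product = Product(stock=last["stock"], version=last["version"])
    # Replay only the events newer than the snapshot's version.
    for event in streams.get(stream_name, [])[product.version + 1:]:
        product.apply(event)
    return product


streams = {
    "product-xyz": [
        ("stock_added", 10),   # version 0
        ("stock_removed", 3),  # version 1
        ("stock_added", 5),    # version 2
        ("stock_removed", 2),  # version 3
    ],
    # Snapshot taken after version 1, when stock was 7.
    "product-xyz-snapshot": [{"version": 1, "stock": 7}],
}

product = rehydrate(streams, "product-xyz")
```

In this sketch, deleting the `product-xyz-snapshot` entry makes `rehydrate` fall back to a full replay of the event stream, which yields the same state.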

If you have a bug that requires you to rebuild state, simply delete the snapshot stream. The next event that occurs on the aggregate should then trigger a new snapshot, which will have bug fixes applied.

As to how often you snapshot? It's somewhat up to you and your performance criteria, and how volatile an aggregate root is (i.e. how often new events occur for an object). Maybe start with snapshotting every few hundred events and see if that helps or not. Adjust up or down to suit the performance profile of your application.

When you save an event to storage you should be able to create a new snapshot (if applicable) in the same transaction.
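One way to picture snapshotting inline with the save is the Python sketch below. It uses an in-memory dict, so it has no real transaction; with a real store the two writes would need to be committed together, and how to do that depends on the store. The `save_event` helper, its `interval` parameter, and the integer state are assumptions for the example.

```python
def save_event(streams, stream_name, event, state, apply_event, interval=5):
    """Append an event and write a snapshot every `interval` events.

    Returns the new aggregate state. `apply_event(state, event)` must return
    the state that results from applying the event.
    """
    events = streams.setdefault(stream_name, [])
    events.append(event)
    version = len(events) - 1  # zero-based version of the event just written
    new_state = apply_event(state, event)

    # Snapshot when the total event count reaches a multiple of the interval.
    if (version + 1) % interval == 0:
        snapshot_stream = streams.setdefault(f"{stream_name}-snapshot", [])
        snapshot_stream.append({"version": version, "state": new_state})
    return new_state


# Usage: twelve "add one unit of stock" events with the sample interval of 5.
streams = {}
state = 0
for _ in range(12):
    state = save_event(streams, "product-xyz", 1, state, lambda s, e: s + e)
```

With twelve events and an interval of 5, snapshots are written after the fifth and tenth events (versions 4 and 9).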

gromas commented 4 years ago

Hi, my two cents on the problem.

I have two different streams: one for domain events and one for aggregate snapshots. I also have a "system stream" which describes both the events and snapshots streams as an "aggregate root event source" and adds "event stream rotation" logic that splits both streams into time slices.

When I need to restore application state, I read the system stream first and resolve the URIs of the latest events stream and snapshots stream. I then read the latest snapshot slice and restore the aggregate state from that snapshot, and then read the latest events slice to restore the full aggregate state.

At each checkpoint I close the events stream and create a new one, then run the snapshotting service, which generates a snapshot from the previous snapshot plus the events published before the checkpoint. The snapshot service then publishes the newly generated snapshots as events to the snapshot stream and writes to the system stream that the snapshot stream URI has changed.

I generate new snapshots at one-hour intervals and rotate the events stream every 120 seconds, while processing over 9 million domain events per minute.

PS: sorry for my bad English )