asynkron / protoactor-dotnet

Proto Actor - Ultra fast distributed actors for Go, C# and Java/Kotlin
http://proto.actor
Apache License 2.0
1.71k stars 285 forks source link

ApplySnapshot() is incorrectly called after PersistSnapshotAsync()? #381

Closed ryanjshaw closed 6 years ago

ryanjshaw commented 6 years ago

Why do Persistence.PersistEventAsync() and Persistence.PersistSnapshotAsync() both _applySnapshot() after calling snapshotStore.PersistSnapshotAsync()?

Aside from ApplySnapshot() being apparently redundant (I just created the snapshot from the current state, what’s there to apply?), if ApplySnapshot is expensive then performance is affected?

Is this by design? I guess it’s following the same pattern with event persistence, but snapshot and event persistence are two different things in my mind.

I’ll give a concrete (albeit heavily simplified) example. I am reading messages from a Kafka queue. I want to track which messages are still “Processing” (i.e. no ACK or NACK back from system components), and which are “Processed” or “Errors”. I want to be able to restart the service arbitrarily and pick up roughly from where I left off. I use both Event Sourcing and Snapshotting:

Now because ApplySnapshot() is always called when PersistSnapshotAsync() is called, I end up retelling my subscriber all Kafka messages in the Processing list every single time I create a Snapshot. This slows execution significantly, even if the idempotent processing on the subscriber side ensures no side-effects.

Am I doing this wrong?

raskolnikoov commented 6 years ago

When ApplySnapshot is called you will get two events, RecovedSnapshot or SnapshotPersisted. Can't you make use of a switch statement here and do the right decision? Example: https://github.com/AsynkronIT/protoactor-dotnet/blob/dev/examples/Persistence/Persistence/Program.cs#L75

ryanjshaw commented 6 years ago

Thanks, I wasn't aware there were subclasses, that certainly works. (I've been using a Version field in the state to do the same thing.) But I'm still unclear as to why you would Apply a PersistedSnapshot? Wouldn't a separate callback be more appropritae?

It's very easy to shoot yourself in the foot and miss this as I did - I was processing 10s of thousands of events, so the issue was hidden in a lot of logging and idempotent processing meant duplicates were ignored. It was only when I dug deeper into the performance I realised what was happening. I don't think I'll be the last person to be bitten by this design decision. Proto.Actor is one of my favourite libraries, almost everything else is "obvious, simple, and smart by default" but this feels wrong. Just an opinion, though, happy to close if there is no consensus.

EDIT: My performance went from a few minutes for 1.5k Kafka messages to 3 seconds when I realised what was happening :) With optimisation I'm sure this can be much better.

tomliversidge commented 6 years ago

In ApplyEvent() when KafkaMessage: add the kafkaMessage to my Processing list so that I can track the outstanding delivery Tell() the kafkaMessage to my queue subscriber handle ACKs and NACKs from my system components and update the Processing, Processed and Error lists accordingly

I'm a bit confused by this - does all this happen each time you reload events from the underlying storage? The event sourcing systems I've seen would normally separate out state-change from actions - https://softwareengineering.stackexchange.com/questions/354363/how-do-i-deal-with-side-effects-in-event-sourcing i.e. don't perform any side-effects in the replay stage, only set state.

https://github.com/gregoryyoung/m-r/blob/master/SimpleCQRS/Domain.cs is another example

Tell()s and ACKs/NACKs etc look like side effects. The same principal applies to loading a snapshot - it's just current state.

However... i agree it is confusing (to me at least) why we applySnapshot after saving it, as like you say, we have just passed it into the method

raskolnikoov commented 6 years ago

After looking at it I also thinks this is wrong. I'll make a PR to remove the ApplySnapshot call on these lines:

https://github.com/AsynkronIT/protoactor-dotnet/blob/dev/src/Proto.Persistence/Persistence.cs#L136

https://github.com/AsynkronIT/protoactor-dotnet/blob/dev/src/Proto.Persistence/Persistence.cs#L146

ApplySnapshot should only get called upon start/restart on the actor.