lerna-stack / akka-entity-replication

Akka extension for fast recovery from failure by replicating stateful entities on multiple nodes in a cluster
Apache License 2.0

A RaftActor (Leader) could mis-deliver a ReplicationSucceeded message to a different entity #156

Closed xirc closed 2 years ago

xirc commented 2 years ago

Situation

  1. RaftActor on node A starts a replication for entity X with index=5 (for example).
  2. For some reason, RaftActor on node A becomes a follower before the replication completes.
    • RaftActor on node B is the leader now.
    • The new leader has a Raft log with lastLogIndex=3 (for example).
  3. RaftActor on node B starts a new replication.
    • For example, this replication contains a log entry with index=5 whose event is for entity Y.
  4. RaftActor on node A receives AppendEntries from RaftActor on node B.
    • On receipt, node A truncates its own log entry (index=5, an event for entity X) because it conflicts with the entry sent by the leader.
  5. For some reason, RaftActor on node B becomes a follower before the replication completes.
    • RaftActor on node A is the leader again.
  6. RaftActor on node A completes the ongoing replication.
  7. RaftActor on node A sends a ReplicationSucceeded message containing the log entry (index=5, the event is for entity Y) to entity X.
    • This happens because LeaderData.clients still associates the ClientContext for entity X with index=5.
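
The steps above can be sketched with a small, self-contained model. This is only an illustration under assumptions: `LogEntry`, `ClientContext`, `ReplicationSucceeded`, and `deliver` here are hypothetical, simplified stand-ins and not the library's actual classes, the term numbers are invented, and the log is reduced to the single entry at index=5. The only property it borrows from the report is that the client map is keyed by log index alone.

```scala
// Hypothetical, simplified stand-ins for illustration; NOT the library's real types.
final case class LogEntry(index: Int, term: Int, entityId: String, event: String)
final case class ClientContext(entityId: String)
final case class ReplicationSucceeded(entry: LogEntry)

object MisDeliverySketch {

  // Looks up the waiting client by log index only, as described in the report.
  // Returns the recipient entity and the message it would receive.
  def deliver(
      clients: Map[Int, ClientContext],
      committed: LogEntry
  ): Option[(String, ReplicationSucceeded)] =
    clients.get(committed.index).map(ctx => (ctx.entityId, ReplicationSucceeded(committed)))

  def run(): Option[(String, ReplicationSucceeded)] = {
    // Step 1: node A (leader, assumed term 1) starts replicating an event for entity X at index=5
    // and remembers the client under that index.
    val logA1 = Vector(LogEntry(5, 1, "X", "event-for-X"))
    val clientsA = Map(5 -> ClientContext("X"))

    // Steps 2-3: node B becomes leader (assumed term 2) and replicates an event for entity Y
    // that also ends up at index=5.
    val fromB = LogEntry(5, 2, "Y", "event-for-Y")

    // Step 4: node A receives AppendEntries; the conflicting entry and everything after it
    // are truncated, then the leader's entry is appended.
    val conflicts = logA1.exists(e => e.index == fromB.index && e.term != fromB.term)
    val logA2 = (if (conflicts) logA1.filter(_.index < fromB.index) else logA1) :+ fromB

    // Steps 5-7: node A is leader again and the entry at index=5 gets committed.
    // clientsA was never updated during truncation, so the stale mapping is used.
    deliver(clientsA, logA2.last)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

In this model `run()` yields entity X as the recipient while the enclosed entry belongs to entity Y, which is the mis-delivery described in step 7.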

Possible Solution

Similar to https://github.com/lerna-stack/akka-entity-replication/issues/155