lerna-stack / akka-entity-replication

Akka extension for fast recovery from failure with replicating stateful entity on multiple nodes in Cluster
Apache License 2.0

🚨test: Increase retryInterval to prevent premature compaction #139

Closed negokaz closed 2 years ago

negokaz commented 2 years ago

MultiSnapshotSyncSpec requires compaction to occur at the appropriate time. However, the current retryInterval of AtLeastOnceComplete.askTo can disrupt that timing. A retryInterval that is too short causes premature compaction, because read-only operations also create a new LogEntry(NoOp) to guarantee consistency, so each retry can add another entry to the log.
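The effect can be illustrated with a toy model. This is only a sketch under stated assumptions, not the library's implementation: it assumes attempts are sent at t = 0, retryInterval, 2 * retryInterval, and so on until the reply arrives, and that every attempt of a read-only operation appends exactly one NoOp. The names `RetryNoOpModel`, `noOpsAppended`, and `replyAfter` are hypothetical.

```scala
import scala.concurrent.duration._

object RetryNoOpModel {
  // Hypothetical model (not the library's code): an attempt is sent at
  // t = 0, retryInterval, 2 * retryInterval, ... for as long as t < replyAfter.
  // Assumption: each attempt of a read-only operation appends one NoOp entry.
  def noOpsAppended(replyAfter: FiniteDuration, retryInterval: FiniteDuration): Long = {
    val replyMillis    = replyAfter.toMillis
    val intervalMillis = retryInterval.toMillis
    require(replyMillis > 0 && intervalMillis > 0, "durations must be at least 1 ms")
    // Number of attempt times strictly before the reply: ceil(replyMillis / intervalMillis)
    (replyMillis + intervalMillis - 1) / intervalMillis
  }
}
```

Under this model, a reply that takes 2.5 s costs three NoOp entries with a 1 s interval (`noOpsAppended(2500.millis, 1.second)`) but only two with a 2 s interval, so a longer interval leaves more room in the log before compaction is triggered.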

The following shows that increasing retryInterval reduces the number of NoOp entries (I extracted the AppendEntries messages sent by the leader to the follower from the debug log and formatted them):

retryInterval = 1s

AppendEntries(
    NormalizedShardId(0),
    Term(2),
    member-1,
    0,
    Term(0),
    List(
        LogEntry(1, EntityEvent(None,NoOp), Term(2)),
        LogEntry(2, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(2)),
        LogEntry(3, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(2)), 
        LogEntry(4, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(2)), 
        LogEntry(5, EntityEvent(Some(NormalizedEntityId(1)),SetEvent(1)), Term(2)), 
        LogEntry(6, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(2)), 
        LogEntry(7, EntityEvent(Some(NormalizedEntityId(2)),SetEvent(2)), Term(2)), 
        LogEntry(8, EntityEvent(Some(NormalizedEntityId(3)),SetEvent(3)), Term(2)), 
        LogEntry(9, EntityEvent(Some(NormalizedEntityId(4)),SetEvent(4)), Term(2)), 
        LogEntry(10, EntityEvent(Some(NormalizedEntityId(5)),SetEvent(5)), Term(2)), 
        LogEntry(11, EntityEvent(Some(NormalizedEntityId(6)),SetEvent(6)), Term(2)), 
        LogEntry(12, EntityEvent(Some(NormalizedEntityId(7)),SetEvent(7)), Term(2)), 
        LogEntry(13, EntityEvent(Some(NormalizedEntityId(8)),SetEvent(8)), Term(2)), 
        LogEntry(14, EntityEvent(Some(NormalizedEntityId(9)),SetEvent(9)), Term(2)), 
        LogEntry(15, EntityEvent(Some(NormalizedEntityId(10)),SetEvent(10)), Term(2))
    ),
    15
)

retryInterval = 2s (reduces the number of NoOp entries from 5 to 3)

AppendEntries(
    NormalizedShardId(0),
    Term(1),
    member-2,
    0,
    Term(0),
    List(
        LogEntry(1, EntityEvent(None,NoOp), Term(1)),
        LogEntry(2, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(1)),
        LogEntry(3, EntityEvent(Some(NormalizedEntityId(0)),NoOp), Term(1)),
        LogEntry(4, EntityEvent(Some(NormalizedEntityId(1)),SetEvent(1)), Term(1)),
        LogEntry(5, EntityEvent(Some(NormalizedEntityId(2)),SetEvent(2)), Term(1)),
        LogEntry(6, EntityEvent(Some(NormalizedEntityId(3)),SetEvent(3)), Term(1)), 
        LogEntry(7, EntityEvent(Some(NormalizedEntityId(4)),SetEvent(4)), Term(1)), 
        LogEntry(8, EntityEvent(Some(NormalizedEntityId(5)),SetEvent(5)), Term(1)), 
        LogEntry(9, EntityEvent(Some(NormalizedEntityId(6)),SetEvent(6)), Term(1)), 
        LogEntry(10, EntityEvent(Some(NormalizedEntityId(7)),SetEvent(7)), Term(1)), 
        LogEntry(11, EntityEvent(Some(NormalizedEntityId(8)),SetEvent(8)), Term(1)), 
        LogEntry(12, EntityEvent(Some(NormalizedEntityId(9)),SetEvent(9)), Term(1)), 
        LogEntry(13, EntityEvent(Some(NormalizedEntityId(10)),SetEvent(10)), Term(1))
    ),
    13
)
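To make the comparison concrete, the two excerpts above can be transcribed and counted. The sketch below uses simplified stand-in types (`NoOpCount`, `Event`, `LogEntry` with plain `Int` ids), not the library's actual classes, and omits the term because it does not affect the count.

```scala
object NoOpCount {
  // Simplified stand-ins for the entries shown in the debug log (assumption:
  // these are illustrative types, not akka-entity-replication's own classes).
  sealed trait Event
  case object NoOp extends Event
  final case class SetEvent(value: Int) extends Event
  final case class LogEntry(index: Int, entityId: Option[Int], event: Event)

  def countNoOps(entries: Seq[LogEntry]): Int = entries.count(_.event == NoOp)

  // Transcribed from the retryInterval = 1s excerpt (15 entries).
  val withOneSecond: Seq[LogEntry] =
    Seq(
      LogEntry(1, None, NoOp),
      LogEntry(2, Some(0), NoOp),
      LogEntry(3, Some(0), NoOp),
      LogEntry(4, Some(0), NoOp),
      LogEntry(5, Some(1), SetEvent(1)),
      LogEntry(6, Some(0), NoOp)
    ) ++ (2 to 10).map(n => LogEntry(n + 5, Some(n), SetEvent(n)))

  // Transcribed from the retryInterval = 2s excerpt (13 entries).
  val withTwoSeconds: Seq[LogEntry] =
    Seq(
      LogEntry(1, None, NoOp),
      LogEntry(2, Some(0), NoOp),
      LogEntry(3, Some(0), NoOp)
    ) ++ (1 to 10).map(n => LogEntry(n + 3, Some(n), SetEvent(n)))
}
```

Both runs contain the same ten SetEvent entries; `countNoOps` gives 5 for the 1s run and 3 for the 2s run, which accounts for the difference between 15 and 13 log entries.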