etcd-io / raft

Raft library for maintaining a replicated state machine
Apache License 2.0
666 stars 164 forks source link

raft: fix out-of-order terms #47

Open pav-kv opened 1 year ago

pav-kv commented 1 year ago

This commit adds a couple invariant checks which were silently ignored and resulted in incorrectly setup tests passing, and incorrect messages exchanged.

Property 1: for a MsgApp message m, m.Entries[i].Term >= m.LogTerm. Or, more generally, m.Entries is a slice of contiguous entries with a non-decreasing Term, and m.Index+1 == m.Entries[0].Index && m.LogTerm <= m.Entries[0].Term.

This property was broken in a few tests. The root cause was that leader appends out-of-order entries to its own log when it becomeLeader(). This was a result of incorrectly set up tests: they restored to a snapshot at (index=11,term=11), but became leader at an earlier term 1. Then it was sending the following, obviously incorrect, MsgApp: {Index=11,LogTerm=11,Entries={{Index=12,Term=1}}.

The fix in tests is either going through a follower state at the correct term, by calling becomeFollower(term, ...), or initializing from a correct HardState in storage.

Property 2: For a MsgSnap message m, m.Term >= m.Snapshot.Metadata.Term, because the leader doesn't know of any higher term than its own, and hence can't send a message with such an inversion.

This was broken in TestRestoreFromSnapMsg, and is now fixed.

pav-kv commented 1 year ago

@tbg @ahrtr PTAL