Closed yossigo closed 6 years ago
This is a good fix. It matches the semantics of the raft paper better. Confirmed in virtraft that nothing breaks.
Hello! I found this fix may cause raft_recv_appendentries_response()
to send an empty retry appendentries. The main reason is that stale msgs are not all filtered out.
Consider this trace in a 3-server cluster:
S1
becomes Leader (term 1) -> S1
sends heartbeat msg m1
to S2
-> S1
restarts and become Follower -> S1
times out and starts a new election (term 2) -> S2
votes for S1
and updates term to 2 -> S1
becomes Leader again (term 2) -> S2
receives msg m1
and replies msg m2
: [term:2, success: false, current_index: 0]
because of smaller term -> S1
receives msg m2
, but m2
is not filtered out as a stale msg, and S1
replies an empty retry appendentries.
However, this is not even a bug. I just think it is inconsistent with the semantics that the program wants to express.
Errors were ignored if node's match_idx==next_idx-1, I assume (??) as a precaution against silently feeding a node that is not monotonic.
However this does not consider the case where send_appendentries() relies on snapshot_last_idx which may be greater than the node's index.
@willemt can you think of unwanted side effects for this?