Closed yfei-z closed 2 months ago
The test https://github.com/jgroups-extras/jgroups-raft/blob/6f3a9ee37fd30f31527c89a7ecc0f50240ad8061/tests/junit-functional/org/jgroups/tests/election/ViewChangeElectionTest.java#L55 was supposed to cover this scenario. I'll take a look today and get back. Thanks for reporting!
We have different blocking points: I block the election thread just before raft.setLeaderAndTerm(leader, new_term); and you block the message sending, which happens after the leader has been applied locally.
I've looked into it and have been trying to find a solution. The fact that the blocking point is setLeaderAndTerm makes the difference. Likely, the solution will (as you commented on the other issue) serialize the handling of view updates. Otherwise, we calculate the effect of the view change without a final result in place. In this example, if the voting thread is pending, we have no_change, but if the thread has finished, it is leader_lost.
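To make the dependency on the voting thread concrete, here is a simplified model of how the same view change can be classified differently. It is only a sketch: the enum, the class, and the method signature are hypothetical and are not the project's actual code. The only assumption is that the leader is still null while the voting thread has not yet applied its result.

```java
import java.util.Set;

// Simplified model for illustration only; all names here are hypothetical.
enum ViewEffect { REACHED, LOST, LEADER_LOST, NO_CHANGE }

final class ViewEffects {
    // leader is null while the voting thread has not yet applied its result.
    static ViewEffect classify(int oldSize, Set<String> newMembers, int majority, String leader) {
        boolean before = oldSize >= majority;
        boolean after = newMembers.size() >= majority;
        if (!before) return after ? ViewEffect.REACHED : ViewEffect.NO_CHANGE;
        if (!after) return ViewEffect.LOST;
        // Majority held before and after: the outcome depends on whether a leader is known.
        boolean leaderLeft = leader != null && !newMembers.contains(leader);
        return leaderLeft ? ViewEffect.LEADER_LOST : ViewEffect.NO_CHANGE;
    }

    public static void main(String[] args) {
        Set<String> afterALeft = Set.of("B", "C");
        // Voting thread still pending: no leader recorded yet.
        System.out.println(classify(3, afterALeft, 2, null)); // NO_CHANGE
        // Voting thread finished and applied A before the view was handled.
        System.out.println(classify(3, afterALeft, 2, "A"));  // LEADER_LOST
    }
}
```

The two calls use the same membership change and differ only in whether the leader has been applied, which is the race described above.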
Since we can't delay the view processing, we can chain a CompletableFuture with a dedicated single-threaded executor to handle the events in order. We'll need some updates to the voting process to complete the future and notify the chain. I would avoid having a thread polling a queue, since view changes are not supposed to occur frequently. Changes on the test side are necessary when checking whether the voting thread is running, since it would run asynchronously rather than right after the view is processed.
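A minimal sketch of that chaining idea, under stated assumptions: SerializedViewHandler and its submit method are hypothetical names, not part of jgroups-raft. Each handler returns a future that is assumed to complete when the voting round it triggered has finished, so the next view is handled only after that point.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper: handles view events strictly in arrival order.
final class SerializedViewHandler implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // Tail of the chain; every new event is appended behind it.
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    // Called from the view-delivery thread; returns immediately without blocking it.
    // The supplied handler returns a future that completes once its voting round is done.
    synchronized CompletableFuture<Void> submit(Supplier<CompletableFuture<Void>> handler) {
        tail = tail
            .handle((ignored, error) -> null)                      // a failed handler must not break the chain
            .thenComposeAsync(ignored -> handler.get(), executor); // run the handler, then wait for its future
        return tail;
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

No thread polls a queue here: the executor's thread only runs when a view event is submitted, and the chain itself carries the ordering.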
I think we just need to make sure the voting thread has processed the latest view before it stops, and that the view change event thread doesn't try to stop the voting thread, since stopping it is not guaranteed to work. I just made a pull request; I hope it gives you some ideas.
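One possible reading of this suggestion, as a rough sketch rather than what the pull request actually does: the view change thread only records the newest view, and the voting thread checks under a lock that no newer view arrived before it applies its result and stops. VotingRound, its fields, and the duringVote hook are all hypothetical; the hook only stands in for the slow vote collection.

```java
import java.util.List;

// Hypothetical sketch: the voter re-checks the latest view before applying its result.
final class VotingRound {
    private final Object lock = new Object();
    private final int majority;
    private List<String> latestView; // guarded by lock
    private long viewVersion;        // guarded by lock, bumped on every view change
    private String leader;           // guarded by lock

    VotingRound(int majority, List<String> initialView) {
        this.majority = majority;
        this.latestView = List.copyOf(initialView);
    }

    // Called by the view change thread: records the view, never stops the voter.
    void onViewChange(List<String> view) {
        synchronized (lock) {
            latestView = List.copyOf(view);
            viewVersion++;
        }
    }

    // Called by the voting thread; duringVote simulates the time spent collecting votes.
    void runVoting(Runnable duringVote) {
        while (true) {
            List<String> snapshot;
            long version;
            synchronized (lock) {
                snapshot = latestView;
                version = viewVersion;
            }
            duringVote.run(); // the view may change while votes are collected
            String elected = snapshot.size() >= majority ? snapshot.get(0) : null;
            synchronized (lock) {
                if (version == viewVersion) { // no newer view arrived: apply and stop
                    leader = elected;
                    return;
                }
            }
            // A newer view arrived during the vote: discard the result and vote again.
        }
    }

    String leader() {
        synchronized (lock) {
            return leader;
        }
    }
}
```

With a majority of 3 and a view that shrinks from A,B,C to A,B during the vote, the first result is discarded and the second round leaves the leader unset.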
Since unsetting the leader depends only on the view change event for a lost majority, if a participant node receives the delayed elected-leader message after the view change event, it will keep the leader even though no majority is reached. After the PR, the coordinator works fine, but the PR doesn't cover this scenario.
@Test
void bug4() throws Exception {
    String clusterName = "node-cluster";
    List<RaftNode> nodes = raftChannels("A,B,C,D,E", null, t -> configProtocol(t, "raft.ELECTION", ELECTION.class))
        .stream().map(RaftNode::new).toList();
    try (var a = nodes.get(0); var b = nodes.get(1); var c = nodes.get(2)) {
        ELECTION election = a.getCh().stack().findProtocol(ELECTION.class);
        election.getSendLeader().block(); // block the election thread before it applies the elected leader
        a.getCh().connect(clusterName); a.untilCoord(3); // A becomes the coordinator
        b.getCh().connect(clusterName);
        c.getCh().connect(clusterName); // majority reached, a voting thread starts
        election.getSendLeader().untilWaiters(1, 3); // the election thread is now blocked
        c.getCh().disconnect(); // C leaves
        a.untilMembers(2, 3); // view changed in A
        b.untilMembers(2, 3); // view changed in B
        election.getSendLeader().unblock(); // unblock the election thread
        // before https://github.com/jgroups-extras/jgroups-raft/pull/284
        // assertEquals("A", a.untilLeader(3)); // TODO A becomes the leader without a majority
        // assertThrows(TimeoutException.class, () -> a.setAsync("cmd1".getBytes()).get(3, SECONDS));
        election.untilVotingThreadStop(3);
        assertNull(a.leader());
        assertThrows(RaftLeaderException.class, () -> a.setAsync("cmd1".getBytes()).get()).printStackTrace();
        b.untilLeader("A", 3); // TODO B got electedLeader from A after the view change event, so it missed the chance to remove the leader
        assertThrows(ExecutionException.class, () -> b.setAsync("cmd1".getBytes()).get()).printStackTrace();
    }
}
Regarding https://github.com/jgroups-extras/jgroups-raft/issues/259, my test case still fails in 1.0.13.
I found the fix under the reached and leader_lost cases, which stops the existing voting thread before starting a new one, but in my test case the result is no_change.