Closed sile closed 1 month ago
@sile thank you for your ongoing contributions! We will get to this PR in the next couple of weeks.
@kjnilsson That sounds reasonable. Thank you for your comment!
I have incorporated the change in commit e6bbd1a. Here are the current results of the repro function:
> repro:run().
# create cluster
* [repro_b] init
* [repro_c] init
* [repro_a] init
* [repro_b] state_enter: recover
* [repro_c] state_enter: recover
* [repro_a] state_enter: recover
* [repro_b] state_enter: recovered
* [repro_c] state_enter: recovered
* [repro_a] state_enter: recovered
* [repro_b] state_enter: follower
* [repro_c] state_enter: follower
* [repro_a] state_enter: follower
* [repro_c] state_enter: pre_vote
* [repro_c] state_enter: candidate
* [repro_c] state_enter: leader
# Please wait 5 seconds...
# trigger election
ok
* [repro_a] state_enter: pre_vote
* [repro_a] state_enter: candidate
* [repro_c] state_enter: follower
* [repro_c] state_enter: pre_vote
* [repro_c] state_enter: candidate
* [repro_a] state_enter: follower # Unlike before commit e6bbd1a, repro_a now becomes a follower after repro_c becomes a candidate.
* [repro_c] state_enter: leader
* [repro_a] state_enter: await_condition
* [repro_a] state_enter: follower
Thank you for reviewing and merging this PR!
Proposed Changes
This PR addresses the issue reported in #439.
To summarize #439: if there is a candidate member and a pre_vote member, and the pre_vote member has a higher log index than the candidate member, neither of them can ever be elected leader. (This holds even if there are N / 2 - 1 or fewer additional followers without election timers, where N is the cluster size.)
This PR adds a branch to ra_server:handle_candidate(#pre_vote_rpc{}, ...) to handle the case where the pre_vote member has a higher log index. With the new branch, when such a message is received, the candidate transitions to the follower state.
I think this is somewhat ad hoc. However, since I don't know much about the ra code base (especially the role of the pre_vote state), I wrote the patch to keep its impact as small as possible. Feel free to suggest better alternative approaches.
Closes #439.
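For readers unfamiliar with the scenario, the candidate-side rule described above can be illustrated with a small toy model. This is only a sketch in Python, not the ra implementation: the names PreVoteRpc, Candidate, and handle_candidate_pre_vote are hypothetical, and comparing (last_log_term, last_log_index) pairs is an assumption borrowed from the usual Raft "more up-to-date log" rule, whereas the description above only speaks of a higher log index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreVoteRpc:
    """Hypothetical stand-in for a pre-vote request (not ra's record)."""
    term: int
    last_log_index: int
    last_log_term: int


@dataclass
class Candidate:
    """Hypothetical stand-in for a server that is currently a candidate."""
    current_term: int
    last_log_index: int
    last_log_term: int
    state: str = "candidate"


def sender_log_is_ahead(rpc: PreVoteRpc, server: Candidate) -> bool:
    # Assumption: "higher log index" is modelled as the Raft-style
    # comparison of (last term, last index) pairs.
    sender = (rpc.last_log_term, rpc.last_log_index)
    mine = (server.last_log_term, server.last_log_index)
    return sender > mine


def handle_candidate_pre_vote(server: Candidate, rpc: PreVoteRpc) -> str:
    """Toy version of the new branch: a candidate that sees a pre-vote
    request from a member with a more advanced log steps down to follower,
    so the better-positioned member can go on to win an election."""
    if sender_log_is_ahead(rpc, server):
        server.state = "follower"
    # Otherwise the candidate keeps its state (other handling is out of
    # scope for this sketch).
    return server.state


# Usage: the candidate is behind (index 5 vs 7), so it steps down.
behind = Candidate(current_term=3, last_log_index=5, last_log_term=2)
print(handle_candidate_pre_vote(behind, PreVoteRpc(3, 7, 2)))
```

In this model the candidate only steps down when the sender's log is strictly ahead; with equal or shorter logs it stays a candidate, which mirrors the intent of keeping the patch's impact narrow.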
FYI
Applying the reproduction patch from issue #439 to this PR branch gives the following result:
Types of Changes
What types of changes does your code introduce to this project? Put an x in the boxes that apply
Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask on the mailing list. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
CONTRIBUTING.md document
Further Comments
If this is a relatively large or complex change, kick off the discussion by explaining why you chose the solution you did and what alternatives you considered, etc.