PikaLabs / floyd

A Raft consensus implementation that is simple and understandable
GNU General Public License v3.0
321 stars 106 forks

Membership change may cause poor availability #25

Open CatKang opened 6 years ago

CatKang commented 6 years ago

A new server that has not yet been added to the cluster will constantly send RequestVote RPCs with a new term number to the other servers, and each of these can trigger a new Raft election.

Because the new server never receives any AppendEntries RPC, it times out and repeats the same process, causing a new election again and again.

Certainly, this will result in very poor availability.
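
The loop described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not floyd's code: `Member` and `simulate` are hypothetical names, the network is instantaneous, and the cluster is assumed to re-elect member `a` immediately after each disruption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Member:
    name: str
    term: int = 1
    is_leader: bool = False

    def on_request_vote(self, candidate_term: int) -> int:
        # Raft rule: seeing a higher term forces the receiver to adopt it
        # and fall back to follower, even if it was a healthy leader.
        if candidate_term > self.term:
            self.term = candidate_term
            self.is_leader = False
        return self.term  # the reply carries the receiver's current term


def simulate(timeouts: int) -> Tuple[int, int]:
    """Return (number of leader step-downs, final cluster term)."""
    members = [Member("a", is_leader=True), Member("b"), Member("c")]
    outsider_term = 1  # the server that is not in the cluster config
    disruptions = 0
    for _ in range(timeouts):
        # No AppendEntries ever arrives, so the outsider times out,
        # bumps its term, and asks everyone for a vote.
        outsider_term += 1
        replies = [m.on_request_vote(outsider_term) for m in members]
        # The outsider learns the highest term it saw in the replies.
        outsider_term = max(outsider_term, *replies)
        if not any(m.is_leader for m in members):
            disruptions += 1
            # Assumption: the real members re-elect "a" in the next term.
            new_term = members[0].term + 1
            for m in members:
                m.term = new_term
            members[0].is_leader = True
    return disruptions, members[0].term


if __name__ == "__main__":
    print(simulate(5))
```

In this model the leader is forced to step down repeatedly and the term keeps growing for as long as the outsider keeps timing out, which matches the behaviour reported in this issue.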

KernelMaker commented 6 years ago

Use the order below to avoid this problem :-)

  1. Apply the new membership config to the old cluster
  2. Start the new server
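
A minimal sketch of why this order helps, assuming the leader only sends AppendEntries to servers listed in its current config. `Cluster`, `NewServer`, and `join_safely` are hypothetical names used for illustration and are not part of floyd's API.

```python
class Cluster:
    def __init__(self, members):
        self.config = set(members)

    def add_member(self, server_id):
        # Step 1: the old cluster learns about the new server.
        self.config.add(server_id)

    def heartbeat_targets(self, leader_id):
        # The leader replicates only to servers in its config.
        return self.config - {leader_id}


class NewServer:
    def __init__(self, server_id):
        self.server_id = server_id
        self.running = False

    def start(self):
        # Step 2: the new server process starts its election timer.
        self.running = True

    def will_time_out(self, cluster, leader_id):
        # A running server that gets no AppendEntries will eventually
        # time out and start sending RequestVote RPCs.
        receives_heartbeats = self.server_id in cluster.heartbeat_targets(leader_id)
        return self.running and not receives_heartbeats


def join_safely(cluster, server):
    cluster.add_member(server.server_id)  # config first
    server.start()                        # then start the process


if __name__ == "__main__":
    cluster = Cluster({"a", "b", "c"})
    server = NewServer("d")
    join_safely(cluster, server)
    print(server.will_time_out(cluster, leader_id="a"))
```

If `start()` is called before `add_member()`, `will_time_out` is true during the gap between the two steps, which is the window in which the disruptive elections occur.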

baotiao commented 6 years ago

@CatKang you are right; this is an early version of membership change.

I have considered these problems before. Another troublesome problem that also causes poor availability is that the new server needs a long recovery period, and during this time the cluster is also in a poor-availability state.

I will fix these issues in the future, or you can fix them yourself.
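
One possible reading of the recovery-period problem can be shown with majority arithmetic. This is a sketch under the assumption that a server still catching up cannot acknowledge new entries; `majority` and `can_commit` are hypothetical helpers, not floyd functions.

```python
def majority(cluster_size: int) -> int:
    # Raft commits an entry once a strict majority has stored it.
    return cluster_size // 2 + 1


def can_commit(cluster_size: int, caught_up_and_alive: int) -> bool:
    # Only servers that are alive and have a matching log can acknowledge.
    return caught_up_and_alive >= majority(cluster_size)


if __name__ == "__main__":
    # Old 3-server cluster with one server down: 2 of 3 is still a majority.
    print(can_commit(3, 2))
    # 4-server cluster while the new server is still recovering:
    # all 3 old servers must be up to reach the majority of 3.
    print(can_commit(4, 3))
    # Same cluster with one old server down during the recovery period.
    print(can_commit(4, 2))
```

Under this assumption, a 3-server cluster that grows to 4 tolerates no failure among the old servers until the new one has caught up, whereas before the change it tolerated one.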