Closed glycerine closed 10 years ago
You're not restarting the clients with the correct command-line arguments. raftd interprets the first non-option argument as the data directory and silently ignores any subsequent ones. In your invocation, raftd -v -p 4003 localhost:4001 ~/node.3, you forgot the -join flag preceding localhost:4001, and thus a directory called 'localhost:4001' is actually used for the configuration.
You can confirm this by reading the log files for clients 2 and 3, where you can see it happen, for example:
Jasons-MacBook-Pro:ceptor jasona$ raftd -trace -p 4002 localhost:4001 ~/node.2
Raft trace debugging enabled.
2014/10/25 22:04:01 Initializing Raft Server: localhost:4001
[raft]22:04:01.045543 [40536ee Term:0] readConf.open localhost:4001/conf
[raft]22:04:01.045557 log.open.open localhost:4001/log
[raft]22:04:01.045606 log.open.create localhost:4001/log
[raft]22:04:01.045616 [40536ee Term:0] start as a new raft server
You thus have two clients sharing that configuration directory, and all kinds of weird things will happen.
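To make the argument handling described above concrete, here is a minimal Go sketch of how a parser built on the standard flag package ends up treating localhost:4001 as the data directory when -join is missing. This is not raftd's actual source: parseArgs, the parsed struct, and the flag defaults are hypothetical names and values chosen for illustration, and only the flags that appear in this thread (-v, -trace, -p, -join) are modelled.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parsed holds the outcome of parsing a raftd-style command line.
// Hypothetical type for illustration only.
type parsed struct {
	DataDir string   // first positional (non-option) argument
	Join    string   // value of -join, empty if the flag was not given
	Ignored []string // positional arguments after the first one
}

// parseArgs is a hypothetical helper, not raftd's real code. It mimics the
// behaviour described in this thread: the first non-option argument is the
// data directory and anything after it is never looked at.
func parseArgs(args []string) (parsed, error) {
	fs := flag.NewFlagSet("raftd-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the output
	fs.Bool("v", false, "verbose logging")
	fs.Bool("trace", false, "trace logging")
	fs.Int("p", 4001, "port (default is an assumption)")
	join := fs.String("join", "", "host:port of a node to join")

	if err := fs.Parse(args); err != nil {
		return parsed{}, err
	}
	if fs.NArg() == 0 {
		return parsed{}, fmt.Errorf("data directory argument required")
	}
	return parsed{
		DataDir: fs.Arg(0),
		Join:    *join,
		Ignored: fs.Args()[1:],
	}, nil
}

func main() {
	// The invocation from this thread: -join is missing, so flag parsing
	// stops at localhost:4001 and treats it as the first positional argument.
	bad, err := parseArgs([]string{"-v", "-p", "4003", "localhost:4001", "~/node.3"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("without -join: data dir %q, join %q, ignored %q\n",
		bad.DataDir, bad.Join, bad.Ignored)

	// With -join, the address is consumed as the flag's value and the
	// intended path becomes the data directory.
	good, err := parseArgs([]string{"-v", "-p", "4003", "-join", "localhost:4001", "~/node.3"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("with -join:    data dir %q, join %q, ignored %q\n",
		good.DataDir, good.Join, good.Ignored)
}
```

A parser like this could refuse to start when Ignored is non-empty, which would turn the silent misconfiguration into an immediate error. Whether raftd should do so is a design question for the maintainers.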
Makes sense, thanks. The -join flag was omitted because raftd objects to it, but I didn't realize the address following it was the argument to join. Thanks again. - Jason
On OS X 10.9.4 (Mavericks) with go1.3.1:
I started a 3-node raftd cluster, then killed off various servers. After I killed all the servers and tried to restart them, I am now wedged: I cannot start the third server without getting a panic about leader.elected.at.same.term.345
Full logs and data dirs are here: https://drive.google.com/file/d/0BxpncsKbxeGWR2RTb3BOR2tiaWM/view?usp=sharing