Closed. xiang90 closed this 10 years ago
@unihorn Can you test this against etcd? Thanks.
@benbjohnson Our addPeer and removePeer are not safe if the leader dies before all the nodes commit the command, but I am not worrying about this right now.
@xiangli-cmu I cherry-picked it into etcd, and etcd passes all its tests.
@unihorn Does it fix the race in the remove test?
@xiangli-cmu Yes, I think so. It is great!! :)
@xiangli-cmu lgtm
lgtm too. We need to document/refactor these locks.
@philips I was trying to refactor the heartbeat and log lock. But the result was not that great. I am planning to do it more carefully next week.
@xiangli-cmu Sounds good. Thanks for taking it on.
When the server enters removePeer, it holds the log write lock. peer.stopHeartbeat may also need to acquire the log read lock to finish, so the two can deadlock. To fix this, we need to make peer.stopHeartbeat non-blocking. There is also no need to wait for the peer goroutine to stop.
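To illustrate the idea, here is a minimal, self-contained sketch of a non-blocking stop. It is not the code from this PR: the `Log` and `Peer` types, `lastIndex`, `newPeer`, `startHeartbeat`, and `removePeerDemo` are hypothetical stand-ins, and only the locking pattern is meant to match the description. The caller holds the log write lock, signals the heartbeat goroutine by closing a channel, and returns without waiting for the goroutine to exit.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Log is a hypothetical stand-in for the raft log; only its lock matters here.
type Log struct {
	mu      sync.RWMutex
	entries []string
}

// lastIndex takes the log read lock, as a heartbeat might when building a request.
func (l *Log) lastIndex() int {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return len(l.entries)
}

// Peer is a hypothetical stand-in for a raft peer with a heartbeat goroutine.
type Peer struct {
	log      *Log
	stopOnce sync.Once
	stopCh   chan struct{} // closed to ask the heartbeat goroutine to exit
	done     chan struct{} // closed by the goroutine when it has exited
}

func newPeer(log *Log) *Peer {
	return &Peer{log: log, stopCh: make(chan struct{}), done: make(chan struct{})}
}

func (p *Peer) startHeartbeat(interval time.Duration) {
	go func() {
		defer close(p.done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-p.stopCh:
				return
			case <-ticker.C:
				_ = p.log.lastIndex() // blocks while a writer holds the log lock
			}
		}
	}()
}

// stopHeartbeat only signals the goroutine; it never waits for it to finish,
// so it is safe to call while holding the log write lock.
func (p *Peer) stopHeartbeat() {
	p.stopOnce.Do(func() { close(p.stopCh) })
}

// removePeerDemo mimics removePeer: it stops the heartbeat while holding the
// write lock, then reports whether the goroutine exited after the unlock.
func removePeerDemo() bool {
	log := &Log{}
	p := newPeer(log)
	p.startHeartbeat(time.Millisecond)

	log.mu.Lock()
	time.Sleep(5 * time.Millisecond) // let the heartbeat block on the read lock
	p.stopHeartbeat()                // returns immediately; a blocking stop would deadlock here
	log.mu.Unlock()

	select {
	case <-p.done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("heartbeat stopped:", removePeerDemo())
}
```

If `stopHeartbeat` instead waited on `p.done` before returning, the call inside the locked region would never finish whenever the goroutine was already blocked in `lastIndex`, which is the deadlock described above.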