hashicorp / raft

Golang implementation of the Raft consensus protocol
Mozilla Public License 2.0

Question: does hashicorp/raft support IP dual-stack? #514

Closed Yanhao closed 2 years ago

Yanhao commented 2 years ago

Say I deploy hashicorp/raft on three dual-stack machines, and all of them call NewRaft() with a v4 address to initialize the raft group. I know the raft group should work fine at this point. But if I restart one machine and switch it to a v6 address (calling NewRaft() with a v6 address), will the raft group continue to work? What I want is to replace the v4 addresses with v6 addresses step by step.

banks commented 2 years ago

There are several parts to this answer!

Network Address Compatibility

This raft lib has a StreamTransport abstraction, so the core library doesn't know anything about network details at all. We do provide a generic NetworkTransport implementation layer, and a specific TCPTransport too. I don't know of any specific testing of the TCP transport with IPv6, but I don't see anywhere that we make v4-specific assumptions: the calling code has to provide a bindAddr string and an advertise net.Addr, which can represent either address type.
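As a rough illustration of the point about net.Addr, here is a minimal standard-library sketch showing that the same code path produces a valid advertise address for either family. The helper name advertiseAddr, the port, and the sample addresses are assumptions made for this example; nothing here calls into the raft library itself.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// advertiseAddr (hypothetical helper) builds a TCP address from a literal
// IPv4 or IPv6 host and a port. net.JoinHostPort adds the square brackets
// that IPv6 literals need, so callers do not format them by hand.
func advertiseAddr(host string, port int) (*net.TCPAddr, error) {
	return net.ResolveTCPAddr("tcp", net.JoinHostPort(host, strconv.Itoa(port)))
}

func main() {
	// Sample addresses only; 2001:db8::/32 is the IPv6 documentation range.
	for _, host := range []string{"10.0.0.5", "2001:db8::5"} {
		addr, err := advertiseAddr(host, 7000)
		if err != nil {
			fmt.Println("resolve failed:", err)
			continue
		}
		// *net.TCPAddr satisfies net.Addr, the type the comment above says
		// the TCP transport accepts as its advertise address.
		var a net.Addr = addr
		fmt.Println(a.Network(), a.String())
	}
}
```

Whether a given transport then listens and dials correctly on a dual-stack host is a separate question that would need testing, as the comment above notes.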

Even if for some reason the provided TCP transport does not do the right thing for your dual-stack setup, it's possible to implement your own transport that does and still use this library. For example, Consul uses its own wrapper to set up TLS and some custom protocol multiplexing, which means none of the networking details in this library are actually used anyway!

Server Identification

The other part of your question is whether changing the IP of the configured machine will break its raft config or make it appear as a new node in the raft cluster etc.

Earlier versions of this library did rely on the IP address being the unique identifier for a node in the raft configuration which made changing IPs but keeping the same state a problem.

Now though (assuming you don't have nodes still running the old raft protocol version), you should provide a stable unique ID with the LocalID config param. If you do, a node can be reconfigured with a new IP address and should be able to rejoin the cluster, as long as its LocalID and raft state remain the same. There is no automatic mechanism for peers to discover the change and reconfigure, though: your application needs to do that somehow (automated or operator-driven).

However you orchestrate that, the current leader will need to call AddVoter again with the new address. Note the docs about how calling this for an existing peer ID updates the address in all servers' raft config: https://github.com/hashicorp/raft/blob/91745625f50efb7098f6a87a6c485cf54c435c1b/api.go#L906-L914

For example in Consul, this is automated using our Gossip layer to discover an IP change.

Hope this is useful!