moby / swarmkit

A toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.
Apache License 2.0

Manager listenAddress can't be changed #642

Open · tonistiigi opened 8 years ago

tonistiigi commented 8 years ago

Stopping the manager and starting it again with a different port still shows the old address in swarmctl manager ls. Joining a second manager to that node shows it as reachable, but it can't be used (probably because the connection is only one-way).

After switching the address back:

» swarmctl  cert ls
Error: incorrect cluster state
WARN[0040] sending message to an unrecognized member ID 5e815a6c316dfc82
ERRO[0040] could not find cluster member to query for leader address

@abronan

abronan commented 8 years ago

@tonistiigi You launched the manager using the same state directory but with a different port. In this case you should use swarmctl manager rm to remove the reference to the old manager (with the old port) before proceeding to join new members. Otherwise you'll end up with no leader: when you rejoin with a different port, the node sends its vote to what looks like another member, leaving the cluster stuck.
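For reference, a hedged sketch of that recovery sequence, assuming swarmctl manager rm takes the manager ID shown by swarmctl manager ls (the exact argument form may differ by version):

# List managers; the stale entry still shows the old address/port.
swarmctl manager ls

# Remove the stale reference by its ID (the ID here is taken from the log above).
swarmctl manager rm 5e815a6c316dfc82

# Restart the manager on the new port, then join further managers as usual.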

/cc @aaronlehmann

Maybe we can detect that we are restarting a manager with a different address while still pointing to the same state directory, and handle that case by forcing a ConfChange to remove the old registered address/member? WDYT?

aaronlehmann commented 8 years ago

> Maybe we can detect that we are restarting a manager with a different address while still pointing to the same state directory, and handle that case by forcing a ConfChange to remove the old registered address/member? WDYT?

I don't think we should ever force configuration changes except when the user explicitly asks us to with --force-new-cluster. Maybe we could just record the original address/port used for that state directory, and refuse to start with a different one.
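As a minimal sketch of that record-and-refuse idea, here is a hypothetical wrapper script (not anything in swarmkit itself) that stores the first address used with a state directory and refuses to start if a later run uses a different one; the manager-addr file name, the address, and the flag usage are assumptions for illustration:

STATE_DIR=/var/lib/swarmd
ADDR=10.0.0.5:4242

# Refuse to start if this state directory was created for a different address.
if [ -f "$STATE_DIR/manager-addr" ] && [ "$(cat "$STATE_DIR/manager-addr")" != "$ADDR" ]; then
    echo "refusing to start: state dir was created for $(cat "$STATE_DIR/manager-addr")" >&2
    exit 1
fi

# First run with this state directory: record the address, then start swarmd.
mkdir -p "$STATE_DIR"
echo "$ADDR" > "$STATE_DIR/manager-addr"
swarmd -d "$STATE_DIR" --listen-remote-api "$ADDR"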

Managers shouldn't change addresses or ports, since the raft cluster needs to know where to reach them. It's possible to use hostnames instead of IPs though, which may be useful in some situations.

tonistiigi commented 8 years ago

> Managers shouldn't change addresses or ports, since the raft cluster needs to know where to reach them. It's possible to use hostnames instead of IPs though, which may be useful in some situations.

Most people will probably use the default 0.0.0.0, meaning this happens automatically whenever the IP changes?

aaronlehmann commented 8 years ago

> Most people will probably use the default 0.0.0.0, meaning this happens automatically whenever the IP changes?

We print a warning recommending that people not use the default: https://github.com/docker/swarm-v2/blob/master/cmd/swarmd/manager.go#L28

There isn't really a good solution to this. If you are part of a raft cluster, other members need to know how to reach you. We make a best-effort guess at the IP address if you don't specify one, but it's better to specify an IP or hostname that you know is stable. (And if you don't have a stable IP or hostname, you shouldn't be part of a multi-node raft cluster anyway.)
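To illustrate the stable-hostname suggestion, a hedged sketch using the swarmd flags shown in the repository README (the DNS name is hypothetical, and exact flags may vary by version):

# Start the first manager, advertising a stable DNS name rather than a raw IP.
swarmd -d /tmp/node-1 --listen-control-api /tmp/node-1/swarm.sock --hostname node-1 --listen-remote-api manager1.internal.example:4242

# Join a second node against the stable name; if the IP behind the name
# later changes, raft members can still resolve and reach the manager.
swarmd -d /tmp/node-2 --hostname node-2 --join-addr manager1.internal.example:4242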

aluzzardi commented 8 years ago

@aaronlehmann Is this still relevant given the listen/advertise improvements of 1.12?

Also, there's a second issue in this ticket: changing the advertise address. Should we file a separate issue for that? Should we attempt a fix for 1.13, or is it low priority?

aaronlehmann commented 8 years ago

I think it is still relevant.

Changing the advertise address is important to support but nontrivial. It probably needs more discussion before we choose a milestone.

aaronlehmann commented 7 years ago

Related to #2198.

PacAnimal commented 5 years ago

I recently had this issue with a cluster of Raspberry Pis that I wanted to move somewhat smoothly between networks. I found that adding iptables rules that NAT connections destined for the old manager addresses to the current addresses would get the swarm back online, after which manager nodes can be removed and re-added one by one to update the raft addresses and such.
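A hedged sketch of that NAT redirect, assuming Docker's swarm-mode manager port 2377 and made-up old/new addresses:

# Rewrite locally generated traffic aimed at the old manager address.
iptables -t nat -A OUTPUT -p tcp -d 192.168.1.10 --dport 2377 -j DNAT --to-destination 10.0.0.10:2377

# Rewrite forwarded traffic (e.g. from containers) the same way.
iptables -t nat -A PREROUTING -p tcp -d 192.168.1.10 --dport 2377 -j DNAT --to-destination 10.0.0.10:2377

# Depending on routing, a MASQUERADE rule may also be needed so replies return correctly.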

I kept fiddling around with this and came up with a script to do it automatically, and then proceeded to wrap the whole catastrophe in a Docker container.

It's far from perfect, but it should keep any cluster running on Debian, or a derivative like Raspbian, intact shortly after changing all the IP addresses.

In short, the script:

- runs on all manager nodes
- generates one SSH key per manager and distributes them to the other nodes using services
- relies on Avahi to find the new addresses
- adds iptables NAT rules to get the swarm online
- uses temporary services as distributed locks to re-join managers one by one
- removes the iptables rules when they're no longer required
- re-invites any non-privileged workers
- then sits happily watching for swarm changes

Feel free to fiddle around with it, draw inspiration or whatnot, and if you make improvements or other customizations, submit a pull request and I'll happily include it.

Oh, and there's no reason to tell me this is insane, or stupid. I know. I happened to have a hammer, and the problem looked like a nail.

EDIT: I was waffling on for too long and forgot to include the links: https://github.com/b01t/swarm-glue and https://hub.docker.com/r/b01t/swarm-glue