docker-archive / classicswarm

Swarm Classic: a container clustering system. Not to be confused with Docker Swarm, which is at https://github.com/docker/swarmkit
Apache License 2.0

Host IP switch is breaking docker swarm #2933

Closed quentin-legraverend closed 4 years ago

quentin-legraverend commented 5 years ago

Hi all,

I know that having static IPs on hosts where Docker daemons are deployed is good practice. This issue concerns the case where your Docker hosts get their IPs from, for example, a DHCP server.

EDIT: here is a doc page talking about static IPs in swarm (especially for manager nodes). So this issue might not be an issue 😃

In this case, if the IP of a host changes for any reason (reboot, DHCP server), all the "overlay things" in swarm will be broken. I don't know if this behaviour is intended, which is why I'm opening this issue.

The technical aspect in the background is: once you've run docker swarm init, the manager keeps advertising itself on the initial IP even if that IP changes.
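To make the mismatch visible, here is a minimal shell sketch that compares the address Swarm recorded with the IP the host has now. The helper names (`strip_port`, `advertised_matches`, `check_swarm_addr`) are hypothetical, and it assumes `docker info --format '{{.Swarm.NodeAddr}}'` reports the address the node registered with; it only detects the drift, it does not repair anything.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Strip a trailing ":port" from an "IP:port" string (IPv4 only in this sketch).
# A value without a port is returned unchanged.
strip_port() {
    printf '%s\n' "${1%:*}"
}

# Succeed (exit status 0) when the recorded address equals the current IP.
#   $1: address as recorded by Swarm, with or without ":port"
#   $2: the IP the host has right now
advertised_matches() {
    [ "$(strip_port "$1")" = "$2" ]
}

# Ask the local daemon which address it recorded and compare it with $1.
# Assumption: .Swarm.NodeAddr holds the address the node joined or initialised with.
check_swarm_addr() {
    current_ip=$1
    recorded=$(docker info --format '{{.Swarm.NodeAddr}}') || return 2
    if advertised_matches "$recorded" "$current_ip"; then
        echo "ok: swarm address $recorded matches $current_ip"
    else
        echo "mismatch: swarm recorded $recorded, host now has $current_ip"
        return 1
    fi
}
```

On a manager you would call something like `check_swarm_addr 192.168.1.20`, passing whatever IP the DHCP server handed out (the value here is a placeholder).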

Steps to reproduce:

  1. Configure 2 hosts behind a router / DHCP server (OS + Docker)
  2. Run docker swarm init on one of them
  3. Power off this host and configure your DHCP server to give it a different IP
  4. Reboot the server. Docker should restart automatically, and swarm too.
  5. Run docker swarm join-token worker; you might notice that Docker gives you the old IP of the manager node.
  6. Copy the token + IP:port
  7. Paste it on the second host (take care to replace the manager's IP with the new one).
  8. The second host will join the swarm as a worker.
  9. Deploy a stack like Portainer => you will notice that inside the portainer_agent_network (which is an overlay network) the services / containers won't be able to communicate with each other. Portainer will only collect information from the agent present on the same host.
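
The address swap in steps 5 to 7 can be sketched as a small shell helper. `rewrite_join_addr` is a hypothetical name, and the token and IPs in the usage note are placeholders; it assumes the join command ends with an IPv4 address followed by a port, as in the output of docker swarm join-token worker.

```shell
#!/bin/sh
# Sketch only: rewrites the manager address at the end of a join command.
#   $1: the join command line as printed on the manager
#   $2: the manager's new IP
# The port (normally 2377) is kept as-is; only the IPv4 part is replaced.
rewrite_join_addr() {
    printf '%s\n' "$1" |
        sed -E "s/[0-9]+(\.[0-9]+){3}(:[0-9]+)[[:space:]]*$/$2\2/"
}
```

For instance, `rewrite_join_addr "docker swarm join --token SWMTKN-1-example 192.168.1.10:2377" 192.168.1.20` prints the same command with `192.168.1.20:2377` at the end. This only gets the worker joined; as step 9 shows, it does not fix the address the manager itself keeps advertising.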

Giving Swarm the ability to change the advertised IP of a swarm node on boot or manually (according to the new IP on the system) would solve the issue but might cause other problems (like worker nodes not finding the manager after a reboot).

Actually, I encountered this issue and solved it by configuring DHCP to restore the old IP (and I use a static one now, so that it survives a reboot).

Hope that's clear and helps!

Dzhuneyt commented 5 years ago

Manager nodes need static IPs. That's a hard rule, afaik.

justincormack commented 4 years ago

This repo is not for Docker swarm; you are looking for https://github.com/docker/swarmkit