AJNOURI / Docker_Certified_Associate_Certification

Preparation for Docker Certified Associate certification exam.
MIT License

All Swarm workers have STATUS Down #2

Open AJNOURI opened 6 years ago

AJNOURI commented 6 years ago

After a backup and restore of a cluster, all workers have STATUS Down:

docker node ls

ID                            HOSTNAME                   STATUS   AVAILABILITY   MANAGER STATUS
owrqevo5ubcy0aqu8jm324g11 *   ajnouri4.mylabserver.com   Ready    Active         Leader
og3e9ccf5pcjb1uinzsln9vl9     ajnouri5.mylabserver.com   Down     Active
qsc44dvdqic6stir5zqnirp4b     ajnouri5.mylabserver.com   Down     Active
m6ymjtebnlt8dmjduh1s7l3kr     ajnouri6.mylabserver.com   Down     Active
pk4ch7o20gv3allwhpln8vzj8     ajnouri6.mylabserver.com   Down     Active

Even though the service is restored correctly, it looks like it is hosted on the manager only:

docker service ls

ID             NAME        MODE         REPLICAS   IMAGE          PORTS
cln2fxtjdnf2   backupweb   replicated   2/2        httpd:latest   *:80->80/tcp
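
`docker service ls` only reports the replica count, not where the replicas run. One way to check the suspicion that both tasks sit on the manager is to count running tasks per node. The sketch below is only an illustration: the function names `tasks_per_node` and `backupweb_placement` are hypothetical, and it assumes the service is still called backupweb.

```shell
#!/bin/sh
# Count how many lines (node names) each node contributes.
# Reads one node name per line on stdin, prints "<node> <count>".
tasks_per_node() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Hypothetical wrapper, to be run on the manager: list the nodes that
# host running tasks of the backupweb service and count them per node.
backupweb_placement() {
  docker service ps backupweb \
    --filter desired-state=running \
    --format '{{.Node}}' | tasks_per_node
}
```

If `backupweb_placement` prints a single line for the manager's hostname with a count of 2, both replicas are indeed running on the manager.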

AJNOURI commented 6 years ago

Important condition: for training purposes, I am using a hosted service whose servers receive new dynamic IP addresses each time they are rebooted, and they are rebooted after 2 hours.

So after the reboot, the workers were of course trying to reach the manager at its old, now wrong, IP address.
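
A quick way to confirm this on a worker is to look at the manager addresses the node has stored and compare them with the manager's current address. This is a minimal sketch, not something taken from the original session: `knows_manager` and `worker_knows_manager` are hypothetical helper names, and the address you pass in is whatever the manager reports after its reboot.

```shell
#!/bin/sh
# Succeeds if the address given as $1 appears, as a whole line,
# in the list of manager addresses read from stdin.
knows_manager() {
  grep -Fxq -- "$1"
}

# Hypothetical wrapper, to be run on a worker: does this node still
# list a manager at the given address (for example 172.31.27.4:2377)?
worker_knows_manager() {
  docker info \
    --format '{{range .Swarm.RemoteManagers}}{{.Addr}}{{"\n"}}{{end}}' |
    knows_manager "$1"
}
```

If `worker_knows_manager 172.31.27.4:2377` fails on a worker, that worker is still holding an address from before the reboot.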

The partial solution was to remove the workers from the cluster and rejoin them using the new manager IP:

docker swarm leave

Node left the swarm.

docker swarm join --token SWMTKN-1-06xygghtoyyvchg736y1wet3li909s8nrzol8oubov9q85icu2-8vmn0d4rb3g34bw7uqycpsisw 172.31.27.4:2377

This node joined a swarm as a manager.

That didn't work for all workers; see https://github.com/AJNOURI/Docker_Certified_Associate_Certification/issues/3
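
A node that leaves and rejoins gets a new node ID, which would explain why each worker hostname appears twice in the `docker node ls` output: the old entries stay behind with STATUS Down. The sketch below shows one way to clean those up from the manager. The function names `down_node_ids` and `prune_down_nodes` are hypothetical, and it assumes every Down entry is a stale worker that is safe to remove.

```shell
#!/bin/sh
# Print the IDs of nodes whose STATUS is Down.
# Expects lines of "<ID> <STATUS>" on stdin.
down_node_ids() {
  awk '$2 == "Down" { print $1 }'
}

# Hypothetical helper, to be run on a manager: remove every node entry
# that is currently reported as Down. Assumes these are stale workers;
# a Down manager would have to be demoted before it can be removed.
prune_down_nodes() {
  docker node ls --format '{{.ID}} {{.Status}}' | down_node_ids |
    while read -r id; do
      docker node rm "$id"
    done
}

# Rejoin steps, for reference:
#   on the manager:  docker swarm join-token worker   (prints the join command)
#   on each worker:  docker swarm leave, then run the printed join command
```

The join output above says the node joined as a manager, so the token used there may have been a manager token; `docker swarm join-token worker` prints the command for joining as a worker instead.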