docker-archive / for-azure

Restarted demoted manager does not rejoin the swarm #60

Closed djeeg closed 6 years ago

djeeg commented 6 years ago

Expected behavior

Scale set instances can rejoin swarm after reboot

Actual behavior

Node not joined

Information

Start with 3 MANAGERS

swarm-manager000003:~$ docker node ls
ID              HOSTNAME              STATUS              AVAILABILITY        MANAGER STATUS
n4ox0f13dta     swarm-manager000000   Ready               Drain
i6kcjzg5wt8     swarm-manager000002   Ready               Active              Leader
isy1lqj6b17 *   swarm-manager000003   Ready               Active              Reachable
hwovd2ti366     swarm-worker000000    Ready               Active
lkym8b4dz0      swarm-worker000001    Ready               Active

docker node update --availability drain swarm-manager000000

Wait for tasks to transfer to other nodes

docker node demote swarm-manager000000
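
The drain, wait, and demote steps above can be sketched as one shell function. This is only an illustration, not the reporter's actual procedure: the function name `drain_and_demote` is hypothetical, and it assumes that "tasks have transferred" can be approximated by `docker node ps` reporting no tasks with a desired state of running on the node. It has to run on a manager other than the one being demoted.

```shell
#!/bin/sh
# Sketch: drain a manager, wait for its tasks to move, then demote it.
# drain_and_demote is a hypothetical helper name, not part of the docker CLI.

drain_and_demote() {
    node="$1"

    # Stop scheduling on the node and have its tasks rescheduled elsewhere.
    docker node update --availability drain "$node" || return 1

    # Poll until no task on the node still has desired state "running".
    # Assumption: an empty task list means the tasks have transferred.
    while [ -n "$(docker node ps "$node" --filter desired-state=running -q)" ]; do
        sleep 5
    done

    # Turn the manager into a worker.
    docker node demote "$node"
}
```

For the cluster in this report, the call would be `drain_and_demote swarm-manager000000`, issued from swarm-manager000002 or swarm-manager000003.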

Restart in Azure

Wait for the reboot and for raft. The instance does not rejoin the swarm:

swarm-manager000003:~$ docker node ls
ID              HOSTNAME              STATUS              AVAILABILITY        MANAGER STATUS
i6kcjzg5wt8     swarm-manager000002   Ready               Active              Leader
isy1lqj6b1r *   swarm-manager000003   Ready               Active              Reachable
hwovd2ti3       swarm-worker000000    Ready               Active
lkym8b4dz       swarm-worker000001    Ready               Active

This happens even though the instance has rebooted and accepts connections:

swarm-manager000000:~$ docker node ls
Error response from daemon: This node is not a swarm manager. Worker nodes can't be used to view or modify cluster state. Please run this command on a manager node or promote the current node to a manager.

djeeg commented 6 years ago

Tried to reimage the instance; it seems to get stuck at "deleting".

FrenchBen commented 6 years ago

Same as with the other clusters, you need to look at the init container logs to get a proper idea of what's going on and why it's not working as expected.
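
One way to pull those logs on the affected instance is sketched below. This is a rough illustration rather than an official procedure: `show_init_logs` is a hypothetical helper, and it assumes the init container can be recognized by the string "init" in its image or container name, which may not hold on every deployment.

```shell
#!/bin/sh
# Sketch: print recent logs of containers that look like init containers.
# Assumption: the init container has "init" in its image or name.
# show_init_logs is a hypothetical helper name.

show_init_logs() {
    # List all containers, including exited ones, as "ID IMAGE NAME".
    docker ps -a --format '{{.ID}} {{.Image}} {{.Names}}' |
    while read -r id image name; do
        case "$image $name" in
            *init*)
                echo "== $name ($image) =="
                # Show the last 100 log lines, stderr included.
                docker logs --tail 100 "$id" 2>&1
                ;;
        esac
    done
}
```

Running `show_init_logs` on swarm-manager000000 after the reboot would show whatever the matching containers logged while trying to rejoin.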

djeeg commented 6 years ago

Suspected duplicate of https://github.com/docker/for-azure/issues/59