Open OmpahDev opened 7 years ago
Seems like this is similar to: https://github.com/docker/swarmkit/issues/1340
I had a similar issue (not on AWS though). From a healthy manager, I demoted the node being Down
and then promoted it again. That solved my issue.
My swarm is completely broken, it happened randomly, and I can't get it to get healthy again.
When I first started noticing that my swarm wasn't working, it showed that one of my managers was down. From then on everything I tried (deploying anything, etc.) failed. Googling the problem told me that this was because my swarm had lost quorum and the way to fix it was to run
docker swarm init --force-new-cluster
on my leader. However, this command fails.. it hangs for a while and then saysError response from daemon: context deadline exceeded
. I get the exact same error when I try to run adocker swarm leave --force
as well.How do I fix this and get my swarm back to healthy? I AM NOT going to delete the CloudFormation stack and re-create, that's WAY too much work, so any solution MUST not involve doing this, or replacing any of my AWS infrastructure in any way.