Closed ankitsnlq closed 8 years ago
@ankitsnlq I think you cover two issues here. The first one looks like a delay from Consul leader election (https://www.consul.io/docs/guides/leader-election.html). The second issue, with `ps -a`, should be fixed by #1465.
@ankitsnlq Unfortunately, the delay for leader election (and in this case, I think, re-election) is specific to Consul. You can mitigate the issue by setting the `--replication-ttl` flag to a low value in case a re-election occurs, which seems to be what happened here (the leader in Consul is not as stable over time as with etcd or ZooKeeper).
The issue with `docker ps -a` should be fixed by #1465. Can you try with the latest `swarm:master`? If you are using the Swarm image, you can pull `dockerswarm:swarm:latest`.
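As a rough illustration of the `--replication-ttl` mitigation mentioned above, here is a minimal sketch of a manager start command. The addresses and the `10s` value are placeholders assumed for the example, not values taken from this thread, and the function only prints the command so it can be reviewed before anything is run.

```shell
# Hypothetical addresses and TTL; replace with values from your own setup.
MANAGER_ADDR="192.168.56.101:4000"   # assumed address advertised by this manager
CONSUL_ADDR="192.168.56.100:8500"    # assumed Consul endpoint
REPLICATION_TTL="10s"                # lower TTL so a lost leader lock is released sooner

# Print (do not run) the manager start command for review.
swarm_manage_cmd() {
  printf 'docker run -d -p 4000:4000 swarm manage -H :4000 --replication --replication-ttl %s --advertise %s consul://%s\n' \
    "$REPLICATION_TTL" "$MANAGER_ADDR" "$CONSUL_ADDR"
}

swarm_manage_cmd
```

A shorter TTL should shorten the window in which no primary is elected, at the cost of more frequent lock renewals against the store.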
@abronan Yes, I'm using `dockerswarm:swarm:latest` and still have the same issue. It begins after the no-leader-election issue starts. I will try the `--replication-ttl` flag and post an update if the issue occurs again.
@ankitsnlq Oh wait, I failed to catch the `docker ps -a` log at the end. It seems like this is a legitimate connection issue.
Can you give us more info about your setup? Are you using `docker-machine` to create the cluster? If not, are you setting up your managers to use TLS (I see the primary manager using port `4000` but the agents exposed on port `2375`)?
Seems like this is a legitimate connection issue, but I can't be sure without more information.
I'm not using Docker Machine. I set everything up in VirtualBox with the latest Docker, Swarm, and Compose. Yes, my Swarm manager container is running on port `4000`, and the agent containers (that is, the join command) use `2375`. Things were working fine with this setup. TLS is not used because I was testing in a local environment, so no security was configured.
Oh, it's `dockerswarm/swarm:master`, actually.
I have the same problem now! I'm not sure when it happened.
Hi @zengnjin, which one?
The Leader Election issue or the error trying to connect?
The leader election issue on store failure with Consul was fixed by #1552. Please make sure you update to `swarm:master` or pull `dockerswarm/swarm:master`.
Hi @ankitsnlq. Any update on this one? Did you manage to make it work with `master`?
Closing, both issues should be fixed by now in `master`. Feel free to open a new issue if you still encounter any problem. Thanks for reporting!
Same problem.

$ docker -H :4000
Containers: 0
Images: 0
Server Version: swarm/1.1.3
Role: replica
Primary:
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 0
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: linux
CPUs: 0
Total Memory: 0 B
Name: e141a4722f25

$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
docker.io/swarm latest 81127fe5e9b4 6 weeks ago 18.11 MB

$ docker -H :4000 ps
Error response from daemon: No elected primary cluster manager

$ docker logs
level=error msg="client: etcd cluster is unavailable or misconfigured
Hi @EamonZhang, are you using Consul? In this case, can you try with `swarm:1.2.0`? Thanks.
Hi @abronan, I am using etcd instead of Consul and have the same problem. This time my Swarm version is 1.2.0.
@EamonZhang Oh, I actually missed that piece of the logs, but `level=error msg="client: etcd cluster is unavailable or misconfigured` suggests that the manager somehow can't connect to the etcd server through the client. Are you using any kind of special setup for your etcd cluster, like proxy mode?
Hi @abronan
The etcd problem was solved, but the original problem still exists.
[root@node1 opt]# docker -H :4000 ps
Error response from daemon: No elected primary cluster manager
[root@node1 opt]# docker -H :4000 info
Containers: 0
Images: 0
Server Version: swarm/1.2.0
Role: replica
Primary:
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 0
Kernel Version: 3.10.0-327.el7.x86_64
Operating System: linux
CPUs: 0
Total Memory: 0 B
Name: 98519415381a
PS: it tests OK with Consul.
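The `docker info` output above shows the telltale combination for this error: `Role: replica` with an empty `Primary:` field. As a sketch, a small helper can pull those two fields out of the output and flag the missing primary. The function name `swarm_election_state` is made up for this example, and it assumes the `Role:` and `Primary:` lines look like the ones pasted in this thread.

```shell
# Read `docker info` output on stdin and report the manager's view of the election.
# Exit status: 0 if a primary is listed, 1 if the Primary field is empty or missing.
# Usage (assumes a Swarm manager on :4000): docker -H :4000 info | swarm_election_state
swarm_election_state() {
  awk '
    /^ *Role:/    { sub(/^ *Role: */, "");    role = $0 }
    /^ *Primary:/ { sub(/^ *Primary: */, ""); primary = $0 }
    END {
      if (primary == "") { print "role=" role " primary=<none>"; exit 1 }
      print "role=" role " primary=" primary
    }'
}
```

Running it against each manager in turn gives a quick view of whether any of them currently sees an elected primary.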
I'm getting this error when I connect to a manager replica; I do not get the error if I connect to the primary manager.
My setup consists of 9 servers: 3 for Consul, 3 for Swarm managers, and 3 for front end and misc. I am using AWS with private IPs and 3 subnets, one in each zone (a, b, c).
Assume the managers are set up as follows:
manager.zone-a.01 - replica
manager.zone-b.02 - replica
manager.zone-c.03 - primary
When I connect to the swarm on a replica using something like this
eval "$(docker-machine env --swarm manager.zone-a.01)"
or
eval "$(docker-machine env --swarm manager.zone-b.02)"
I get the error `Error response from daemon: No elected primary cluster manager`.
When I connect to the primary
eval "$(docker-machine env --swarm manager.zone-c.03)"
I don't get the error and everything works.
I have two Swarm managers: one is the primary and the other is its replica. But sometimes the master Swarm manager gets the role Replica while its Primary row shows its own IP address as the primary. I'm using Consul for master election.
Here is what I get after `docker info`.
In the output we can see it is showing Replica, but the Primary line shows its own IP address. So it got the role Replica even though it is the primary.
Output of the `ip r l` command.
In that situation, if I give a command to the Swarm manager, it will show me
docker ps
After some time the issue resolves itself: I get the role Primary and the `docker ps` command starts working.
But after running `docker ps -a` I get:
An error occurred trying to connect: Get http://0.0.0.0:4000/v1.21/containers/json?all=1: EOF
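Since the "No elected primary cluster manager" error reportedly clears on its own once a new leader is elected, a client script can wait out the re-election window instead of failing immediately. The helper below is a generic sketch; `retry_until` is a name invented for this example. It only works around the election delay and does not address the EOF connection error.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until <max_attempts> <delay_seconds> <command> [args...]
retry_until() {
  max="$1"; delay="$2"; shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "gave up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (assumes a Swarm manager listening on :4000, as in this thread):
#   retry_until 10 3 docker -H :4000 ps
```

The attempt count and delay in the example are arbitrary; they would need to cover at least the replication TTL configured on the managers.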