moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Error response from daemon: rpc error: code = 4 desc = context deadline exceeded #31758

Closed. apophizzz closed this issue 7 years ago.

apophizzz commented 7 years ago

Description

I've got a local test environment for Docker Swarm Mode on my machine, set up according to this Docker Lab. It consists of 3 manager nodes and 3 worker nodes. When I try to restart the existing swarm from a state where all machines are stopped, docker node ls runs into a timeout after I bring only the leading manager node (let's call it manager1) back to life via Docker Machine. The error message I get is: Error response from daemon: rpc error: code = 4 desc = context deadline exceeded

The other nodes' machines are still shut down at this point. The interesting thing is that as soon as I start one of the remaining manager nodes (manager2 or manager3) and then repeat the command on manager1 (the leader), it works as expected and gives me output like this:

docker@manager1:~$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
15350ucxnl5p9w7x2n8n86lzr    manager2  Ready   Active        Reachable
d35uhev7ix5amdmgjaj2yqnel    manager3  Down    Active        Unreachable
i91shs8i3ips709cm09fxi1ha    worker2   Down    Active        
m3b8zejcrgvu8zrjl8huhcp87    worker1   Down    Active        
tz4zwf4f3ekfbw80gj6fpdaag *  manager1  Ready   Active        Leader
uvnwzzvmhlzwd1x3wty9gqcev    worker3   Down    Active

Even if I now shut down manager2 again, everything keeps working fine. I'm not sure whether that's a bug at all, or whether re-initializing the Raft cluster simply needs a majority of available managers (2 in this case) to return to an operating state. Moreover, I made an interesting observation while examining the output of docker info before and after restarting a second manager. With only manager1 running and the described error message showing up, the Raft section of the docker info output says:

Raft:
  Snapshot Interval: 0
  Heartbeat Tick: 0
  Election Tick: 0

Along with the error message, that makes total sense. After starting manager2 (the error vanished):

Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
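
As a side note, the Swarm section of docker info can be pulled out on its own via a Go template; a minimal sketch, assuming the daemon supports docker info --format (it was added in 1.13, so it should apply to 17.03):

docker@manager1:~$ docker info --format '{{json .Swarm}}'

In the no-quorum state this should show the same picture as above: the cluster details, including the Raft settings, come back empty or zeroed.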

Steps to reproduce the issue:

  1. Create a swarm including 3 managers and 3 workers as described [here](https://github.com/docker/labs/tree/master/swarm-mode/beginner-tutorial).
  2. Shut down every single machine with docker-machine stop manager1 (repeat for all machines), without explicitly leaving the existing swarm.
  3. Make sure that every machine has been shut down.
  4. Now start the manager node that was the Raft leader, with docker-machine start manager1.
  5. Connect to the restarted leader node via SSH and execute docker node ls. You should get the error message I described.
  6. Now start a second manager node (manager2 in my case), which was previously a follower.
  7. After the follower is back online, reconnect to the leader node and run docker node ls again. Alternatively, you can execute the command on the follower node, which should not make any difference.
  8. You should now see a proper list of all six manager and worker nodes, showing that two managers are currently ready and reachable.
  9. Shut down the second manager you started in step 6 via Docker Machine (here: manager2) and execute docker node ls again on manager1. You should again get a list of nodes, now showing that only manager1 is ready and reachable. (A condensed command sketch of these steps follows below.)
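
For reference, here is a condensed sketch of the commands behind these steps, assuming the six machine names from the lab setup:

# stop every machine in the swarm without leaving it
for m in manager1 manager2 manager3 worker1 worker2 worker3; do
  docker-machine stop "$m"
done

# bring back only the former Raft leader and query the cluster
docker-machine start manager1
docker-machine ssh manager1 docker node ls   # rpc error: code = 4 desc = context deadline exceeded

# start a second (former follower) manager and retry on the leader
docker-machine start manager2
docker-machine ssh manager1 docker node ls   # now lists all six nodes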

Expected results: Instead of getting an error message after only the leading manager node has come back online, I'd expect the same output about the cluster state that I get after starting another manager node, i.e.:

docker@manager1:~$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
15350ucxnl5p9w7x2n8n86lzr    manager2  Down    Active        Unreachable
d35uhev7ix5amdmgjaj2yqnel    manager3  Down    Active        Unreachable
i91shs8i3ips709cm09fxi1ha    worker2   Down    Active        
m3b8zejcrgvu8zrjl8huhcp87    worker1   Down    Active        
tz4zwf4f3ekfbw80gj6fpdaag *  manager1  Ready   Active        Leader
uvnwzzvmhlzwd1x3wty9gqcev    worker3   Down    Active

Output of docker version:

Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 17.03.0-ce
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 3
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: pending
 NodeID: tz4zwf4f3ekfbw80gj6fpdaag
 Error: rpc error: code = 4 desc = context deadline exceeded
 Is Manager: true
 ClusterID: 
 Managers: 0
 Nodes: 0
 Orchestration:
  Task History Retention Limit: 0
 Raft:
  Snapshot Interval: 0
  Heartbeat Tick: 0
  Election Tick: 0
 Dispatcher:
  Heartbeat Period: Less than a second
 CA Configuration:
  Expiry Duration: Less than a second
 Node Address: 192.168.99.100
 Manager Addresses:
  0.0.0.0:2377
  192.168.99.100:2377
  192.168.99.101:2377
  192.168.99.102:237
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.4.52-boot2docker
Operating System: Boot2Docker 17.03.0-ce (TCL 7.2); HEAD : f11a204 - Thu Mar  2 00:14:47 UTC 2017
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 995.8 MiB
Name: manager1
ID: NKBV:TJXO:54FN:N3CI:UX6H:YKC7:DKOL:HBUI:ML7X:H5BW:OXXI:KLKC
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 25
 Goroutines: 68
 System Time: 2017-03-11T17:19:49.538566197Z
 EventsListeners: 0
Registry: https://index.docker.io/v1/
Labels:
 provider=virtualbox
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

Cheers, Patrick

aluzzardi commented 7 years ago

/cc @aaronlehmann

aaronlehmann commented 7 years ago

I'm not sure if that's a bug at all or if re-initializing the Raft cluster simply needs a majority of available managers (2 in this case) for being able to return into operating state.

In general the Raft cluster needs a majority of managers to be available, otherwise it can't service most requests. Even though you're still able to do node ls after taking down all but one manager, trying to do something that changes the state, such as creating or updating a service, would fail.
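
To make the distinction concrete, here is a rough illustration with three managers (the service name web is only a placeholder):

# with 3 managers, a quorum is floor(3/2) + 1 = 2, so a single surviving
# manager cannot commit any change to the Raft log

docker node ls                                      # read-only; may still be answered
                                                    # if the node believes it is the leader

docker service create --name web nginx             # writes cluster state: fails
docker node update --availability drain worker1    # also a write: fails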

There's more information about this at the following links:

https://docs.docker.com/engine/swarm/raft/
https://docs.docker.com/engine/swarm/admin_guide/

I'll close this issue, since the behavior seems to be expected. But feel free to follow up with questions.

apophizzz commented 7 years ago

@aaronlehmann thanks, what you explained makes perfect sense to me. Changing the cluster state always needs consensus of a manager majority. But I'm not sure I understand yet why I need a second manager node to come back online before the first manager can report the global cluster state (i.e. recognize that all the other nodes are down). I'm thinking of the scenario where every machine belonging to the cluster has been shut down beforehand. Am I right that every manager node has a local copy of the entire cluster state rather than just a partial view of the cluster? If that assumption is true (otherwise feel free to correct me), can you think of a reason why docker node ls can't be satisfied from the local cluster store? Is that a Raft bootstrapping thing? Thanks for your patience :)

aaronlehmann commented 7 years ago

Am I right that every manager node has a local copy of the entire cluster state rather than just a partial view of the cluster? If that assumption is true (otherwise feel free to correct me), can you think of a reason why docker node ls can't be satisfied from the local cluster store? Is that a Raft bootstrapping thing?

In order to make sure that a query like this returns the most up-to-date information, we internally route the queries to the Raft leader. The leader is the node that's currently coordinating the writes, so it's impossible for it to have out-of-date information.

I think the case where you're seeing docker node ls work with only one manager is one where that node still thinks it's the leader, so it's willing to respond to that query. But in the case where it doesn't work, the node doesn't know who the leader is, so it doesn't know where to send the query.

I believe that in the upcoming 17.04 release, nodes will automatically relinquish the leadership position if they notice that they aren't in contact with enough nodes to maintain a quorum. This will make the behavior more consistent.

Admittedly our error message about "context deadline exceeded" is unhelpful here. We've been meaning to improve these errors (cc @dperny).

Thanks for your patience :)

It's no problem at all.

apophizzz commented 7 years ago

Great, now I actually understand what was going on.

We've been meaning to improve these errors

I share your opinion that a significantly more meaningful message would be great here. The current one can be a little confusing. So I'm done with this issue, thanks for participating.

lwalid commented 7 years ago

Hi everybody, I'm using Docker 1.12.6 and I think I have the same problem. I get no response from the docker ps or docker-compose ps commands. In the log I have:

Mar 27 15:37:40 test-ran.wanesy.fr dockerd[1059]: time="2017-03-27T15:37:40.796732144+02:00" level=warning msg="Health check error: rpc error: code = 4 desc = context deadline exceeded"

avoidik commented 7 years ago

@lwalid your swarm doesn't have a quorum; as a temporary workaround for this issue you can use:

 docker swarm init --force-new-cluster --advertise-addr <addr>:<port> --listen-addr <addr>:<port>

This recreates the cluster from the node's current state, with this node as the only manager; additional managers can then be promoted or joined to restore the desired manager/worker balance.
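
A rough sketch of a full recovery, assuming manager1 is the surviving manager and 192.168.99.100 is its address (both placeholders):

# on the surviving manager: rebuild a single-manager cluster from local state
docker swarm init --force-new-cluster --advertise-addr 192.168.99.100:2377

# then restore the desired number of managers, either by promoting
# existing nodes or by joining new ones with a fresh manager token
docker node promote manager2
docker swarm join-token manager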