docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

Unstable Docker Socket #900

Open justgotthis opened 4 years ago

justgotthis commented 4 years ago

Expected behavior

Docker socket communication should be stable.

Actual behavior

Docker socket communication is intermittent, cutting in and out randomly.

[2020-01-09 19:25:18] ++ docker -H unix:///var/run/docker.sock inspect '--format={{.State.Running}}' proxy
[2020-01-09 19:25:18] Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Steps to reproduce the behavior

Leave Docker running; the Docker socket intermittently becomes unreachable. I am unable to reproduce this at will.

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.1
 API version:       1.39 (downgraded from 1.40)
 Go version:        go1.12.5
 Git commit:        74b1e89e8a
 Built:             Thu Jul 25 21:21:35 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.3
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       774a1f4
  Built:            Thu Feb 28 05:59:55 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:


Client:
 Debug Mode: false

Server:
 Containers: 12
  Running: 7
  Paused: 0
  Stopped: 5
 Images: 115
 Server Version: 18.09.3
 Storage Driver: aufs
  Root Dir: /var/lib/docker/aufs
  Backing Filesystem: extfs
  Dirs: 1052
  Dirperm1 Supported: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.4.98+
 Operating System: Ubuntu 16.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 24
 Total Memory: 251.8GiB
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)

thaJeztah commented 4 years ago

Do the daemon logs show anything useful?

Note that 18.09.3 is not the latest patch release in the 18.09 series (v18.09.9 is). Also, 18.09.x has reached EOL.

From your example:

[2020-01-09 19:25:18] ++ docker -H unix:///var/run/docker.sock inspect '--format={{.State.Running}}' proxy
[2020-01-09 19:25:18] Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

I notice the timestamps in front of those commands; are you running these steps from a script? Is that script running in a container?

justgotthis commented 4 years ago

Unfortunately, the daemon logs did not show anything useful, even with debug mode enabled:

Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.054521805+05:30" level=debug msg="Calling GET /v1.38/containers/json?filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.070729477+05:30" level=debug msg="Calling GET /v1.38/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%2C%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.092481067+05:30" level=debug msg="Calling GET /v1.38/containers/json?filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dpodsandbox%22%3Atrue%7D%7D&limit=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.107508570+05:30" level=debug msg="Calling GET /v1.38/containers/json?all=1&filters=%7B%22label%22%3A%7B%22io.kubernetes.docker.type%3Dcontainer%22%3Atrue%7D%2C%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.166262264+05:30" level=debug msg="Calling GET /v1.31/containers/json?limit=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.355843800+05:30" level=debug msg="Calling GET /v1.31/containers/json?limit=0"
.
.
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.392279568+05:30" level=debug msg="Calling GET /v1.31/containers/1f55ad426540d6691a7ba44c8cd1df160c64d21453f94afcf03d40d223c60a58/stats?stream=0"
Jan 14 03:29:27 host dockerd[29692]: time="2020-01-14T03:29:27.392847036+05:30" level=debug msg="Calling GET /v1.31/containers/a786ac9f52f6ee68f3cf5615fb833ba1434adf3866961ca485771c5dc8720c55/stats?stream=0"

You are correct, these log entries are from a script that checks the status of Docker. As soon as communication through the socket goes down, the script reports that it cannot connect to the Docker daemon. At that point we also stop receiving logs from containers, as there is no longer any communication with the Docker daemon. Any insight would be helpful.

Thanks, -Luis.
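
For anyone chasing a similar intermittent failure: the CLI message "Cannot connect to the Docker daemon" does not say why the connection failed. Below is a minimal sketch, not part of the original report, of a probe that a status script could call to record the reason. The function name `probe_docker_socket` and the returned labels are hypothetical. It uses only the Python standard library and the Engine API `/_ping` endpoint, and assumes the default socket path shown in the logs above.

```python
import errno
import os
import socket
import stat


def probe_docker_socket(path="/var/run/docker.sock", timeout=2.0):
    """Return a short label describing the state of a Docker-style Unix socket.

    Hypothetical helper: the labels are illustrative, not Docker terminology.
    """
    # First check the filesystem entry, without connecting.
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"            # no socket file at this path
    except PermissionError:
        return "permission-denied"  # cannot even stat the path
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"       # something else lives at this path

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        # /_ping is the Engine API health-check endpoint.
        sock.sendall(b"GET /_ping HTTP/1.0\r\nHost: docker\r\n\r\n")
        reply = sock.recv(4096)
    except ConnectionRefusedError:
        return "refused"            # file exists but nothing is listening
    except PermissionError:
        return "permission-denied"  # caller may not open the socket
    except socket.timeout:
        return "timeout"            # connected or connecting, but no answer in time
    except OSError as exc:
        return "error:" + errno.errorcode.get(exc.errno, "unknown")
    finally:
        sock.close()

    # Only the HTTP status line matters for this check.
    status_line = reply.split(b"\r\n", 1)[0]
    return "ok" if b" 200" in status_line else "unexpected-reply"


if __name__ == "__main__":
    print(probe_docker_socket())
```

Logging this label next to each timestamp would at least separate the cases: "missing" points at the socket file not being visible at that path, "refused" at nothing listening on it, and "timeout" at a daemon that accepts connections but does not answer. This is a diagnostic aid under those assumptions, not a fix.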