docker / for-linux

Docker Engine for Linux
https://docs.docker.com/engine/installation/

unregister_netdevice waiting for lo #842

Closed zakikhani closed 4 years ago

zakikhani commented 4 years ago

Expected behavior

Docker keeps running and unused network devices are released, with no unregister_netdevice errors.

Actual behavior

The container fails; stop/start and re-install do not help, and dmesg shows many "unregister_netdevice: waiting for lo to become free. Usage count = 188" messages.

Steps to reproduce the behavior

Every time we create a Docker container, it works fine for a week or two and then fails.

Output of docker version:

Client: Docker Engine - Community
 Version:           19.03.4
 API version:       1.40
 Go version:        go1.12.10
 Git commit:        9013bf583a
 Built:             Fri Oct 18 15:54:09 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.4
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.10
  Git commit:       9013bf583a
  Built:            Fri Oct 18 15:52:40 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 1
 Server Version: 19.03.4
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-62-generic
 Operating System: Ubuntu 18.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.54GiB
 Name: sc0001-culper-ring
 ID: TOHZ:4PR3:FG72:7A5N:ZP76:6NIJ:7K74:2NFD:TYHB:UJ5Y:STYB:YWXM
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

Additional environment details (AWS, VirtualBox, physical, etc.)

uname -r

4.15.0-62-generic

4.15.0-62-generic #69-Ubuntu SMP Wed Sep 4 20:55:53 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

dmesg

unregister_netdevice: waiting for lo to become free. Usage count

arkodg commented 4 years ago

Can you please share the exact commands/steps performed?

zakikhani commented 4 years ago

We used Ansible to configure Docker on multiple servers in order to install ThousandEyes. Everything works fine on a fresh server, but after a few weeks the container goes down. When you try to start it, it hangs, so we have to run "sudo kill -9 pid"; it is mostly runc stuck in init. When this happens you can't run any container; even hello-world does not work. systemctl start/stop does not fix it. Removing and reinstalling Docker (including the Docker folder /var/lib/docker) does not fix it either.

dmesg:

[Oct29 16:27] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079606] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079417] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079498] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079556] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079435] unregister_netdevice: waiting for lo to become free. Usage count = 188
[Oct29 16:28] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079491] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.111446] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079453] unregister_netdevice: waiting for lo to become free. Usage count = 188
[ +10.079443] unregister_netdevice: waiting for lo to become free. Usage count = 188
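For anyone comparing hosts, a small script can make the repeated kernel message easier to track. The following is only a sketch and not part of the original report: it assumes the dmesg text has already been captured as a string, it matches the exact message wording quoted in this thread, and the function name `summarize_netdev_waits` is hypothetical.

```python
import re

# Matches the kernel message quoted in this thread, e.g.
# "unregister_netdevice: waiting for lo to become free. Usage count = 188"
PATTERN = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)


def summarize_netdev_waits(log_text):
    """Return {device: (occurrences, last_usage_count)} for matching messages.

    Hypothetical helper for illustration; log_text is assumed to be the
    captured output of dmesg (or any text containing the kernel message).
    """
    summary = {}
    for match in PATTERN.finditer(log_text):
        dev = match.group("dev")
        count = int(match.group("count"))
        seen, _ = summary.get(dev, (0, 0))
        # Keep a running occurrence total and remember the latest usage count.
        summary[dev] = (seen + 1, count)
    return summary
```

A usage count that stays constant across many messages, as it does in the log above, suggests the reference on the device is never released rather than slowly draining.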

andrewhsu commented 4 years ago

Hmm...looks like this moby issue has similar message about unregister_netdevice: https://github.com/moby/moby/issues/5618

taskset commented 4 years ago

Hmm...looks like this moby issue has similar message about unregister_netdevice: moby/moby#5618

The relevant upstream patches are: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=d747a7a51b00984127a88113cdbbc26f91e9d815 and https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ee60ad219f5c7c4fb2f047f88037770063ef785f

You may also refer to the Red Hat information: https://access.redhat.com/solutions/3659011

thaJeztah commented 4 years ago

Looks like this is related to a kernel issue, not an issue in the Docker daemon; I'll close this issue, but feel free to continue the conversation if you think I closed this in error.