akdor1154 opened 6 years ago
seems like this is still an active bug in docker despite being reported in almost every single release. It makes running docker in prod unstable.
Hi guys. I believe we have the same case here. Docker daemon stuck, a lot of IO wait goroutines. This happened after the docker daemon rapidly started to consume memory. Please suggest: is this the same issue or another one (open/closed)? Attaching stack dump. goroutine-stacks-2020-09-03T143546+0900.zip
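For anyone triaging a dump like the one attached, a quick way to tally goroutines by state (for example, to see how many are stuck in "IO wait") is a short shell pipeline. `dump.log` below is a placeholder for the extracted `goroutine-stacks-*.log` file, not the actual attachment name:

```shell
# Each goroutine in a Go stack dump starts with a header like
# "goroutine 123 [IO wait, 5 minutes]:". Extract the state between
# the brackets (dropping the duration) and count occurrences.
awk -F'[][]' '/^goroutine [0-9]+ \[/ { split($2, a, ","); print a[1] }' dump.log \
  | sort | uniq -c | sort -rn
```

A large count for a single state (such as `IO wait` or `semacquire`) is usually the first clue about where the daemon is blocked.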
Output of `docker version`:

```
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-108.git4ef4b30.el7.centos.x86_64
 Go version:      go1.10.3
 Git commit:      4ef4b30/1.13.1
 Built:           Tue Jan 21 17:16:25 2020
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-108.git4ef4b30.el7.centos.x86_64
 Go version:      go1.10.3
 Git commit:      4ef4b30/1.13.1
 Built:           Tue Jan 21 17:16:25 2020
 OS/Arch:         linux/amd64
 Experimental:    false
```
@koryaga docker 1.13.x reached EOL over 3 years ago and is no longer maintained. Judging from the commit your version of docker was built from, I think you're running the Red Hat fork of Docker (https://github.com/projectatomic/docker/commit/4ef4b30c57f05be26c9387ef0828e86c2ed543b8), which is not maintained here and carries many modifications that are not upstream and have caused problems on many occasions. I'd recommend either installing a current version of the official docker packages (https://docs.docker.com/engine/install/centos/) or opening a ticket in the Red Hat issue tracker.
Thanks @thaJeztah. I realize that the RH version is legacy; however, we are obligated to use it ) I will definitely raise an issue with CentOS/Red Hat. But is it possible, looking at the stack trace, to tie our issue to this one?
Description
After running docker on my server for around 24 hours, the daemon becomes unresponsive for certain operations. For example, `docker ps` seems to work, but `docker logs [container]` will hang, as does `docker stop [container]`. This has been reproduced with both the jsonfile and systemd log drivers. It seems to occur more frequently with the jsonfile logger; however, this is entirely anecdotal, and I changed how often this machine reboots at the same time I changed the log driver, so that claim is entirely unreliable.

Steps to reproduce the issue:
1. Leave the server on for 24 hours running CI builds, which on average create, run, and remove maybe 20 containers an hour. A cleanup script is scheduled to run every eight hours or so to kill long-running containers and remove unused containers and images. The issue seems to occur with or without the cleanup script running; sometimes the cleanup script itself is affected by the issue and hangs while running.
2. Flip a coin or something; this is intermittent. However, once it starts occurring, it will occur predictably until dockerd is killed.
3. Pick a container from `docker ps`, then run `docker stop [container]` or `docker logs [container]`. The Docker client will freeze.

Describe the results you received: The Docker client freezes or times out.
Describe the results you expected: The Docker server responds to the client quickly enough that it does not freeze or time out.
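For reference, a cleanup job of the kind mentioned in the reproduction steps might look roughly like the sketch below. The 8-hour threshold matches the report; everything else (structure, flags, use of `docker rm -f` and `prune`) is an illustrative assumption, not the reporter's actual script:

```shell
#!/bin/sh
# Sketch of a periodic cleanup job: force-remove containers that have
# been running longer than 8 hours, then prune stopped containers and
# unused images. Illustration only, not the reporter's actual script.
MAX_AGE=$((8 * 3600))
now=$(date +%s)

for id in $(docker ps -q); do
  started=$(docker inspect -f '{{.State.StartedAt}}' "$id")
  age=$((now - $(date -d "$started" +%s)))
  [ "$age" -gt "$MAX_AGE" ] && docker rm -f "$id"
done

docker container prune -f
docker image prune -f
```

Note that a script like this will itself hang at the first `docker inspect` or `docker rm` once the daemon is in the wedged state described above, which matches the observation that the cleanup job is sometimes affected too.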
Additional information you deem important (e.g. issue happens only occasionally):
Dump while dockerd is idle (`docker logs blah` was run and hung about an hour before I got this trace)

Trace while `docker logs` is running and hanging

Output of `docker version`:

Output of `docker info`:
Additional environment details (AWS, VirtualBox, physical, etc.): Ubuntu Artful x86-64 running on an AWS t2.medium instance.
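When the daemon is wedged like this, wrapping client calls in `timeout(1)` at least keeps scripts and CI jobs from hanging forever (`blah` is a placeholder container name, as used earlier in this report):

```shell
# Give the client 10 seconds before bailing out. timeout(1) exits
# with status 124 when it had to kill the command.
if ! timeout 10 docker logs blah; then
  echo "docker logs hung or failed; daemon may be wedged" >&2
fi
```

This doesn't fix anything, but it turns an indefinite hang into a detectable failure that monitoring can alert on.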