Closed lskbr closed 1 year ago
I found a workaround to start and stop docker without rebooting the machine:
RUNNING=$(docker ps --no-trunc -q)
sudo systemctl stop docker
for c in $RUNNING; do
    sudo rm -rf "/var/lib/docker/containers/$c"
done
sudo systemctl start docker
The containers disappear, but I got these error messages during the delete (with docker stopped):
rm: cannot remove '/var/lib/docker/containers/ec946754ce2d2463f5b91fd9cbcbc76cbeec706c2d1409714e95bb7c07082d1c/shm': Device or resource busy
rm: cannot remove '/var/lib/docker/containers/522e7fc7af940d392d0d5c0354087579c78ac4424e16cf6308ccf5db8b577f6f/shm': Device or resource busy
rm: cannot remove '/var/lib/docker/containers/0c2e334268731214e54b49f519081f1ab0fac273d7e79d0451b661afe5803917/shm': Device or resource busy
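A "Device or resource busy" error from rm usually means the path is still a mount point. The following is a minimal sketch, assuming each container's shm directory is still mounted after docker has been stopped. The function name `list_shm_mounts` and its two arguments are hypothetical, chosen for illustration. It only lists the candidate mount points and changes nothing by itself.

```shell
#!/bin/sh
# Hypothetical helper: print every mount point under the given containers
# directory whose path ends in /shm, as listed in a mounts table.
# $1: containers directory (assumed default: /var/lib/docker/containers)
# $2: mounts table to read (assumed default: /proc/mounts)
list_shm_mounts() {
    containers_dir=${1:-/var/lib/docker/containers}
    mounts_file=${2:-/proc/mounts}
    # Field 2 of each line in the mounts table is the mount point.
    awk -v dir="$containers_dir" '
        index($2, dir "/") == 1 && $2 ~ /\/shm$/ { print $2 }
    ' "$mounts_file"
}

# Possible usage (with docker stopped), before removing the directories:
#   list_shm_mounts | while read -r m; do sudo umount "$m"; done
```

If the list is empty, the busy error has another cause and unmounting will not help.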
I updated the kernel on all machines, but the error continues:
uname -a
Linux adam 4.8.0-52-generic #55~16.04.1-Ubuntu SMP Fri Apr 28 14:36:29 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
docker info
Containers: 14
 Running: 2
 Paused: 0
 Stopped: 12
Images: 3
Server Version: 17.05.0-ce
Storage Driver: aufs
 Root Dir: /export/docker/aufs
 Backing Filesystem: extfs
 Dirs: 85
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.8.0-52-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 20
Total Memory: 94.33GiB
Name: eve
ID: GINY:O4OX:IXRE:W3YU:ISJG:C36X:NNHN:5D44:3ZRE:4WZS:MR5B:75EO
Docker Root Dir: /export/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 192.168.1.X:5000
 127.0.0.0/8
Live Restore Enabled: false
This error seems to be related to high memory usage. It happens when Linux starts killing processes (OOM). The bad part is that docker has to be restarted and the container files removed manually.
Monitoring the oom_score would be helpful:
ls /proc/*/oom_score | awk '{print system("cat $1") " " $1 }' | sort
Docker also has ways to deal with the OOM killer, for example when you run a container:
docker run --help | grep oom
--oom-kill-disable Disable OOM Killer
--oom-score-adj int Tune host's OOM preferences (-1000 to 1000)
I love the title - can't wait for the sequel... Die harder containers
Hi, I got the "die hard containers" again today.
I ran ls /proc/*/oom_score | awk '{print system("cat $1") " " $1 }' | sort
The output is 0 for all processes.
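The 0 on every line is probably not the real score. awk's system() returns the exit status of the command rather than its output, and the $1 inside the quoted string is not expanded by awk, so the one-liner prints the status of cat. Here is a minimal sketch of an alternative that reads each score directly. The function name `list_oom_scores` and its optional argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<oom_score> <pid>" for every process,
# highest score first.
# $1: proc filesystem root (assumed default: /proc)
list_oom_scores() {
    proc_root=${1:-/proc}
    for f in "$proc_root"/[0-9]*/oom_score; do
        # A process may exit between the glob and the read; skip it.
        [ -r "$f" ] || continue
        score=$(cat "$f" 2>/dev/null) || continue
        pid=${f%/oom_score}   # strip the trailing file name
        pid=${pid##*/}        # keep only the PID directory name
        printf '%s %s\n' "$score" "$pid"
    done | sort -rn
}

# Possible usage: show the ten processes most likely to be killed.
#   list_oom_scores | head -n 10
```

A score of 0 from this version would be a genuine reading, since it comes from the file contents and not from an exit status.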
These containers cannot be stopped or killed. They also restart when I boot the machine. The only way I found to stop them is to stop docker, wipe the container directory and start docker again.
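Since the wipe step runs rm -rf as root, it may be worth checking the IDs first, so that an empty or truncated ID never expands to the whole containers directory. This is a minimal sketch under that assumption. The function name `container_dirs` and the file /tmp/running.ids are hypothetical, and the function only prints paths without deleting anything.

```shell
#!/bin/sh
# Hypothetical helper: read container IDs on stdin and print the directory
# for each one, skipping anything that is not a full 64-character hex ID.
# $1: containers directory (assumed default: /var/lib/docker/containers)
container_dirs() {
    containers_dir=${1:-/var/lib/docker/containers}
    while read -r id; do
        case $id in
            *[!0-9a-f]*|'') continue ;;   # empty or non-hex: skip
        esac
        [ "${#id}" -eq 64 ] || continue   # truncated ID: skip
        printf '%s/%s\n' "$containers_dir" "$id"
    done
}

# Possible usage, reviewing the directories before deleting them:
#   docker ps --no-trunc -q > /tmp/running.ids   # hypothetical scratch file
#   sudo systemctl stop docker
#   container_dirs < /tmp/running.ids            # review, then rm -rf each line
#   sudo systemctl start docker
```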
Let me close this ticket for now, as it looks like it went stale.
Hi,
I have been using docker for two years. I have a small cluster for development purposes where I use docker swarm to coordinate and distribute my containers. From time to time, I have problems killing containers. They do not give any errors on docker stop or docker kill, but they are always shown as running. I cannot stat these containers, but I can inspect them. If I restart the computer, they continue to be listed as running containers. They do not seem to consume any memory or CPU (at least no significant amount of it).
There is nothing special about the containers I create. I use a custom-built python image in 100+ containers. Most of the time I have no problems stopping and restarting them. When I cannot kill them, I log into each node with an undead container and execute the following procedure:
I get the IDs of the die hard containers with:
When the machine restarts:
This simple procedure needs to be executed on every machine with undead containers :-( It is time-consuming and it breaks my deployment scripts.
My current setup uses 7 machines running Linux:
The containers are started with the following template:
uname -a
docker info