The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

0.7.0 fails to remove containers #2714

Closed ndarilek closed 10 years ago

ndarilek commented 11 years ago

Script started on Fri 15 Nov 2013 04:28:56 PM UTC
root@thewordnerd:~# uname -a
Linux thewordnerd.info 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
root@thewordnerd:~# docker version
Client version: 0.7.0-rc5
Go version (client): go1.2rc4
Git commit (client): 0c38f86-dirty
Server version: 0.7.0-rc5
Git commit (server): 0c38f86-dirty
Go version (server): go1.2rc4
Last stable version: 0.6.6, please update docker
root@thewordnerd:~# docker rm `docker ps -a -q`
Error: Cannot destroy container ba8a9ec006c8: Driver devicemapper failed to remove root filesystem ba8a9ec006c8e38154bd697b3ab4810ddb5fe477ed1cfb48ac3bd604a5a59495: Error running removeDevice
Error: Cannot destroy container d2f56763e65a: Driver devicemapper failed to remove root filesystem d2f56763e65a66ffccb3137017dddad745e921f4bdaa084f6b4a0d6407ec030a: Error running removeDevice
Error: Cannot destroy container c22980febe50: Driver devicemapper failed to remove root filesystem ...

amuino commented 10 years ago

Same with docker 0.8.0. All attempts to remove a container after it exits fail.

I have noticed that restarting the docker daemon clears the list of non-running containers.

The following is an example using busybox.

vagrant@vagrant-ubuntu-saucy-64:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
vagrant@vagrant-ubuntu-saucy-64:~$ docker run -name bb busybox
Unable to find image 'busybox' (tag: latest) locally
Pulling repository busybox
769b9341d937: Download complete
511136ea3c5a: Download complete
bf747efa0e2f: Download complete
48e5f45168b9: Download complete
vagrant@vagrant-ubuntu-saucy-64:~$ docker rm bb
Error: container_delete: Cannot destroy container bb: Driver devicemapper failed to remove root filesystem 1f9f56f452334cc53d7012104e65cecf077f035722b4579be023cc6cb383e013: Error running removeDevice
2014/02/10 09:10:50 Error: failed to remove one or more containers
vagrant@vagrant-ubuntu-saucy-64:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND              CREATED             STATUS              PORTS               NAMES
1f9f56f45233        busybox:latest      /bin/sh -c /bin/sh   24 seconds ago      Exit 0                                  bb
vagrant@vagrant-ubuntu-saucy-64:~$ sudo restart docker
docker start/running, process 17985
vagrant@vagrant-ubuntu-saucy-64:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

johnae commented 10 years ago

I wonder if this is specific to the docker daemon on Ubuntu. I installed docker manually on Debian jessie and there I have no problems. I then proceeded to install docker on Ubuntu 13.10 manually (i.e. not via apt) and I still have the same issue.

johnae commented 10 years ago

Downgrading to docker 0.7.6 works for me, so it's something with 0.8.0.

sgrimee commented 10 years ago

No symlink in my case, but extensive use of VOLUMEs mounted to NFS folders on the host. The workaround from @LordFPL works for me; after that, the containers can be removed.

#!/bin/bash

echo "This command may be dangerous... see http://www.unix.com/linux/119194-umount-l-dangerous.html"

# Lazily unmount the leftover root filesystem of every exited container.
# --no-trunc is needed so the full container ID matches entries in /proc/mounts.
for container in $(docker ps -a --no-trunc | grep Exit | awk '{print $1}'); do
        fs=$(grep "$container" /proc/mounts | awk '{print $2}')
        echo "Lazy unmounting $fs for container $container"
        sudo umount -l "$fs"
done
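
Once the lazy unmounts have released the filesystems, the stuck containers can usually be removed in one pass, e.g.:

sudo docker rm $(sudo docker ps -a -q)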

barnybug commented 10 years ago

Seeing this error on Arch using devicemapper. /var/lib/docker is not a symlink, no VOLUME involved.

container_delete: Cannot destroy container 060a53fd92ce45b7391c01b1a0675bf5472df31b78ca3cb79e655a41bdcff4b0: Driver devicemapper failed to remove root filesystem 060a53fd92ce45b7391c01b1a0675bf5472df31b78ca3cb79e655a41bdcff4b0: Error running removeDevice

lox commented 10 years ago

FYI we're having the same problem with a mysql data volume, no NFS involved. Docker 0.8 and Ubuntu 13.10.

lox commented 10 years ago

It seems like even though there is an error, the volume is deleted. If you restart docker AFTER seeing this error, and then run docker ps again, you'll see it's gone.
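
A quick way to confirm this, reusing the upstart restart from the busybox session earlier in the thread:

sudo restart docker
docker ps -a    # the container that failed to remove should no longer be listed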

sameersbn commented 10 years ago

Specifying the -g option fixed the issue for me. Earlier I was symlinking /var/lib/docker.

SvenDowideit commented 10 years ago

Can we add some code to docker to test for symlinks and either error out, or convert them to full paths? (A symlinked /tmp is also bad for lxc.)
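
Not a patch, just a minimal shell sketch of the idea, assuming the check were done by a wrapper before starting the daemon rather than inside docker itself:

# Hypothetical pre-start check: resolve a symlinked docker root to its real path.
DOCKER_ROOT=/var/lib/docker
REAL_ROOT=$(readlink -f "$DOCKER_ROOT")
if [ "$REAL_ROOT" != "$DOCKER_ROOT" ]; then
    echo "$DOCKER_ROOT is a symlink; passing the real path to -g" >&2
    DOCKER_ROOT=$REAL_ROOT
fi
docker -d -g "$DOCKER_ROOT"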

johnae commented 10 years ago

Just want to point out that I have these problems with docker 0.8.0 and not 0.7.6. I'm using a completely clean, all-defaults vagrant box running ubuntu 13.10. No symlinking going on or anything; 0.8.0 just won't remove containers for some reason.

I basically install docker, pull an image (like stackbrew/ubuntu:13.10) and run it without any modifications at all. I can't remove it afterwards.
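
For reference, the minimal repro sequence described above would be something like this (any image appears to do; the rm failure is the 0.8.0 symptom reported here):

docker pull stackbrew/ubuntu:13.10
docker run stackbrew/ubuntu:13.10 /bin/true
docker rm $(docker ps -a -q)    # fails with "Error running removeDevice" on 0.8.0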

logankoester commented 10 years ago

Yep, same here @johnae. Both 0.7.6 and 0.8.0, on a fresh install of Arch.

ruebenramirez commented 10 years ago

I cleaned out all my containers by doing the following.

First, I manually cleaned out the mountpoints:

grep docker /etc/mtab | awk '{print $2}' | xargs -r sudo umount

Once the mountpoints were gone, I could then remove the containers:

sudo docker rm $(sudo docker ps -q -a)

More info in my blog post on this: http://ruebenramirez.com/ruebenramirezcom/394-clean-up-docker-mountpoints

dz0ny commented 10 years ago

Same error here; a reboot helps.

mount
/dev/sdb4 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,relatime,hugetlb)
/dev/sda2 on /media/steam type ext4 (rw,nosuid,nodev,_netdev)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,user=dz0ny)
Linux work 3.13.0-12-generic #32-Ubuntu SMP Fri Feb 21 17:45:10 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux - Ubuntu Trusty Tahr (development branch)

docker version
Client version: 0.8.1
Go version (client): go1.2
Git commit (client): a1598d1
Server version: 0.8.1
Git commit (server): a1598d1
Go version (server): go1.2
Last stable version: 0.8.1

lxc-checkconfig
Kernel configuration not found at /proc/config.gz; searching...
Kernel configuration found at /boot/config-3.13.0-12-generic
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled

--- Control groups ---
Cgroup: enabled
Cgroup clone_children flag: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: enabled

apt-cache showpkg lxc
Package: lxc
Versions: 
1.0.0+master~20140224-1500-0ubuntu1~ppa1~trusty1

benjaminbauer commented 10 years ago

+1

I can run a container:

docker ps -a
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS               NAMES
78a54f542a58        ubuntu/sparta_dev:latest   /bin/bash           9 seconds ago       Up 8 seconds                            naughty_fermat

Then I exit it:

docker ps -a
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS               NAMES
78a54f542a58        ubuntu/sparta_dev:latest   /bin/bash           22 seconds ago      Exit 0                                  naughty_fermat

When I try to run another container from the same image:

docker run -t -i ubuntu/sparta_dev:latest /bin/bash
2014/02/25 18:49:04 Error: start: Cannot start container e322e706a91df841b1e76fd52b01b3f4e58cf9e1f3aa0e793006f894041a2b9b: exit status 1

and:

docker ps -a
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS               NAMES
e322e706a91d        ubuntu/sparta_dev:latest   /bin/bash           6 seconds ago       Exit -1                                 furious_einstein    
78a54f542a58        ubuntu/sparta_dev:latest   /bin/bash           35 seconds ago      Exit 0                                  naughty_fermat  

when I try to remove them:

docker rm $(docker ps -a -q)
Error: container_delete: Cannot destroy container e322e706a91d: Driver devicemapper failed to remove root filesystem e322e706a91df841b1e76fd52b01b3f4e58cf9e1f3aa0e793006f894041a2b9b: Error running removeDevice
Error: container_delete: Cannot destroy container 78a54f542a58: Driver devicemapper failed to remove root filesystem 78a54f542a58853d5b5ce1cbbb0a16f5c8f90f26e00c6049b1950f64b011012a: Error running removeDevice
2014/02/25 18:49:16 Error: failed to remove one or more containers

I am running:

Client version: 0.8.0
Go version (client): go1.2
Git commit (client): cc3a8c8
Server version: 0.8.0
Git commit (server): cc3a8c8
Go version (server): go1.2
Last stable version: 0.8.1, please update docker

ndarilek commented 10 years ago

Any updates on a fix for this? There are a lot of data points that should help reproduce it.

Seems to me like it should be a higher priority. Clearly something is very messed up on a foundational level, and that makes using Docker challenging.

It bothers me a bit to see all sorts of effort put into solving the shiny problems while these sorts of schlep issues slip. The new shiny needs to build on a solid foundation, and the fact that Docker can't remove a container while it gains all sorts of shiny new features and refactors is worrisome to me when I suggest that people adopt/investigate it.

But maybe work is happening and I'm just not seeing it.

inthecloud247 commented 10 years ago

Maybe something to bring up at the next docker meetup, or over IRC or the mailing list.

It seems to be affecting a minority of people, and I'm assuming that the workarounds listed in this thread are probably working for people at this point.

bmurphy1976 commented 10 years ago

How do you know it affects only a minority of people?

We pushed very heavily to get docker into our infrastructure (dev, continuous integration, and production), and we run into this problem all the time. We haven't spent much time complaining about it; we just deal with it. But honestly, it's obnoxious.

So I'm adding my $0.02. Please fix this one.

vieux commented 10 years ago

@ndarilek @bmurphy1976 @inthecloud247 you are using docker 0.8.1 with devicemapper, right?

johnae commented 10 years ago

@bmurphy1976 I certainly agree. For me, however, the issue was resolved in 0.8.1.

ndarilek commented 10 years ago

I'll give it a shot on 0.8.1 later today. I don't recall seeing anything on this ticket like "May be fixed in 0.8.1, please check," so I haven't specifically tested. If it is fixed, then I apologize for getting annoyed; this has been a huge pain point since 0.7, and it seemed like folks were still experiencing the problem very recently.

vieux commented 10 years ago

It might be fixed by #3948

amuino commented 10 years ago

Not sure what fixed it, but I cannot reproduce my test case with 0.8.1-dev from master (probably 0.8.1 fixed it, but it was easier for me to test the dev version).

bmurphy1976 commented 10 years ago

I've seen nothing indicating that I should re-evaluate so I have not investigated the latest build thoroughly.

I just tried 0.8.1 on OSX using boot2docker and that fixed the immediate and reproducible problem I was running into. However, we are still running a mixture of 0.8.0 and 0.7.6 in our environments so it's going to take some time to work through them all.

PAStheLoD commented 10 years ago

Docker 0.8.1 (a1598d1), Debian Wheezy, aufs. Setting -g solved it.

sgrimee commented 10 years ago

@PAStheLoD, are you setting -g to /var/lib/docker or something else? Isn't that the default?

PAStheLoD commented 10 years ago

Sorry, I was a bit too terse. I meant that -g was necessary; I was using a symlink before (as others hinted at earlier).

Also, I'm on 3.12.9, but it could still be an aufs bug.

bbradbury commented 10 years ago

+1 on using -g to set a non-default docker root path that contains no symlinks; it fixes deletions under aufs. I think there may be two bugs in this discussion, one with aufs and one with devicemapper? My use case only concerns aufs.

Using docker 0.8.1 on precise with an upgraded kernel (3.8.0-35-generic). I had a symlink in the path, but switched to an unsymlinked path like so:

/etc/default/docker has DOCKER_OPTS="-g /home/docker"

docker rmi now works correctly.

unclejack commented 10 years ago

Running Docker with a symlinked /var/lib/docker root data folder is unsupported until #4382 is merged. This is currently causing problems because lxc can't handle symlinked paths and runs into all sorts of issues.

Was anyone else running Docker on a symlinked root data directory?
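
A one-line check for whether your root data directory is affected (it prints the target only if /var/lib/docker is a symlink):

readlink /var/lib/docker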

ndarilek commented 10 years ago

OK, I've failed to duplicate this under 0.8.1. AFAICT it is fixed, thanks!

benjaminbauer commented 10 years ago

@unclejack Yes, I am

vieux commented 10 years ago

@benjaminbauer can you reproduce your issue with master?

joelmoss commented 10 years ago

I am still experiencing this removal error with 0.9.0. I have no symlinked /var/lib/docker, and docker is running with /usr/bin/docker -d -H unix:///var/run/docker.sock -H tcp://0.0.0.0:4243 -r=false. It fails consistently for 4-5 out of every 100 containers.

$ for i in `seq 1 100`; do docker run --rm=true ubuntu:12.04 sleep 2& done
2014/03/11 11:55:51 Error: Cannot destroy container f6e10b4a4fce8e27409765fc44e7173167d29bfd500ff89f082634087c844226: Driver aufs failed to remove root filesystem f6e10b4a4fce8e27409765fc44e7173167d29bfd500ff89f082634087c844226: rename /var/lib/docker/aufs/diff/f6e10b4a4fce8e27409765fc44e7173167d29bfd500ff89f082634087c844226 /var/lib/docker/aufs/diff/f6e10b4a4fce8e27409765fc44e7173167d29bfd500ff89f082634087c844226-removing: device or resource busy
2014/03/11 11:55:55 Error: Cannot destroy container 1b82e16811e3748c9e41c97bc227500e9f1cf0bc2ac5528a97f56a50ddd12dd2: Driver aufs failed to remove root filesystem 1b82e16811e3748c9e41c97bc227500e9f1cf0bc2ac5528a97f56a50ddd12dd2: rename /var/lib/docker/aufs/diff/1b82e16811e3748c9e41c97bc227500e9f1cf0bc2ac5528a97f56a50ddd12dd2 /var/lib/docker/aufs/diff/1b82e16811e3748c9e41c97bc227500e9f1cf0bc2ac5528a97f56a50ddd12dd2-removing: device or resource busy
2014/03/11 11:56:14 Error: Cannot destroy container ea46131d5a63fd29e3b8794aa147ea8e1274885dddbbd69e59bc74c8d7506534: Driver aufs failed to remove root filesystem ea46131d5a63fd29e3b8794aa147ea8e1274885dddbbd69e59bc74c8d7506534: rename /var/lib/docker/aufs/diff/ea46131d5a63fd29e3b8794aa147ea8e1274885dddbbd69e59bc74c8d7506534 /var/lib/docker/aufs/diff/ea46131d5a63fd29e3b8794aa147ea8e1274885dddbbd69e59bc74c8d7506534-removing: device or resource busy
2014/03/11 11:56:25 Error: Cannot destroy container ab3561d614ff273b6e52474da7cd9c49435b8748946ae205a77a04b74ff1a784: Driver aufs failed to remove root filesystem ab3561d614ff273b6e52474da7cd9c49435b8748946ae205a77a04b74ff1a784: rename /var/lib/docker/aufs/diff/ab3561d614ff273b6e52474da7cd9c49435b8748946ae205a77a04b74ff1a784 /var/lib/docker/aufs/diff/ab3561d614ff273b6e52474da7cd9c49435b8748946ae205a77a04b74ff1a784-removing: device or resource busy
2014/03/11 11:56:25 Error: Cannot destroy container a86222b7c8b2be98ac7ada7ccb099122c740e9212e7132db8dfcd0ca493916f3: Driver aufs failed to remove root filesystem a86222b7c8b2be98ac7ada7ccb099122c740e9212e7132db8dfcd0ca493916f3: rename /var/lib/docker/aufs/diff/a86222b7c8b2be98ac7ada7ccb099122c740e9212e7132db8dfcd0ca493916f3 /var/lib/docker/aufs/diff/a86222b7c8b2be98ac7ada7ccb099122c740e9212e7132db8dfcd0ca493916f3-removing: device or resource busy

ndarilek commented 10 years ago

Same here, once again, except on Fedora with devicemapper.

jamtur01 commented 10 years ago

@lseal285 Can you please turn off whatever is echoing your emails to GH? Thanks.

rockymtnlinux commented 10 years ago

I had just discovered and started playing with Docker (v0.8.1 packaged with Fedora 19) when a 'df' command revealed this problem. From reading this thread, it turns out that I had a couple of issues. The first is that I had moved /var/lib/docker to a different volume and created a symlink to it. This was easily fixed by creating the file /etc/default/docker like so:

echo 'DOCKER_OPTS="--restart=false --graph=/work/docker"' | sudo tee /etc/default/docker

The second issue, however, is that I am running this on a laptop with a LUKS-encrypted hard drive. After a bit of gnashing-of-hair and pulling-of-teeth using lsof, fuser and other utilities to figure out what was holding on to the container volumes and preventing me from manually unmounting them, I discovered that the systemctl command gave me the info I needed:

$ sudo systemctl -a | grep docker
dev-disk...58ab1b6e.device loaded active   plugged   /dev/disk/by-id/dm-name-docker-253:11-22806530-d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e
dev-disk...\x2dpool.device loaded active   plugged   /dev/disk/by-id/dm-name-docker-253:11-22806530-pool
dev-mapp...58ab1b6e.device loaded active   plugged   /dev/mapper/docker-253:11-22806530-d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e
dev-mapp...\x2dpool.device loaded active   plugged   /dev/mapper/docker-253:11-22806530-pool
sys-devi...-docker0.device loaded active   plugged   /sys/devices/virtual/net/docker0
sys-subs...-docker0.device loaded active   plugged   /sys/subsystem/net/devices/docker0
work-doc...dockerenv.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root/.dockerenv
work-doc...ockerinit.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root/.dockerinit
work-doc...-hostname.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root/etc/hostname
work-doc...etc-hosts.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root/etc/hosts
work-doc...solv.conf.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root/etc/resolv.conf
work-doc...1b6e-root.mount loaded active   mounted   /work/docker/containers/d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e/root
docker.service             loaded active   running   Docker Application Container Engine

After turning off the docker service, I was able to unmount the d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e container volume by first unmounting the /root/.dockerenv, /root/.dockerinit, /root/etc/hostname, /root/etc/hosts, and /root/etc/resolv.conf entries before unmounting the /root entry. At this point, I know just enough to be dangerous, but this makes me wonder if there isn't some sort of ordering issue going on when cleaning up the containers.
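
In script form, the ordering described above looks roughly like this (container ID and data root taken from the systemctl output; adjust for your host):

CID=d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e
ROOT=/work/docker/containers/$CID/root
# The nested bind mounts must come off before the root mount, or it fails with "device is busy".
for m in .dockerenv .dockerinit etc/hostname etc/hosts etc/resolv.conf; do
    sudo umount "$ROOT/$m"
done
sudo umount "$ROOT"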

I was then able to use dmsetup to finish cleaning my system:

$ sudo dmsetup ls | grep docker
docker-253:11-22806530-d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e (253:14)
docker-253:11-22806530-pool (253:12)
$ sudo dmsetup remove docker-253:11-22806530-d24b5acb12bd928b624f92582b19065d4afe89beafaa91e4404be96458ab1b6e
$ sudo dmsetup remove docker-253:11-22806530-pool

rockymtnlinux commented 10 years ago

I just saw the post by ruebenramirez :) --- You can get the same info from /etc/mtab.

nikicat commented 10 years ago

:+1:

inthecloud247 commented 10 years ago

Nice work! @rockymtnlinux

benjaminbauer commented 10 years ago

@vieux works for me in Docker version 0.9.0, build 2b3fdf2

benclifford commented 10 years ago

I have encountered this in 0.9.0 running on a hetzner machine, once in the couple of days since I started using that machine. $ uname -v

25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014

crosbymichael commented 10 years ago

Can someone try reproducing this on master? We moved the bind mounts into the mount namespace of the container, so that when the container exits, the kernel handles cleaning up its mounts.
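
To illustrate the mechanism (not docker's actual code; this assumes a util-linux unshare that defaults to private mount propagation): a mount made inside a private mount namespace is torn down by the kernel when the namespace's last process exits, so nothing is left behind for the daemon to unmount.

touch /tmp/resolv.bind
sudo unshare --mount sh -c '
    mount --bind /etc/resolv.conf /tmp/resolv.bind
    grep resolv.bind /proc/mounts    # the bind mount is visible inside the namespace
'
grep resolv.bind /proc/mounts || echo "no leftover mount"    # already gone outside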

lcarstensen commented 10 years ago

The mount namespace changes were all done before 0.9.1, correct? On 0.9.1 I'm still getting occasional failures and leftover resolv.conf bind mounts from deleted containers (see http://paste.fedoraproject.org/91665/96618703/). I can umount them and the tied-up root mounts by hand, stop docker, dmsetup remove the pools, restart, and then "docker rm" cleanly.
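
Spelled out, that manual recovery sequence would look something like this (a sketch assuming a default /var/lib/docker and the devicemapper driver; device and service names vary per host):

# 1. Unmount leftovers, deepest paths first, so nested bind mounts go before their parents.
grep /var/lib/docker /proc/mounts | awk '{print $2}' | sort -r | xargs -r sudo umount
# 2. Stop the daemon so nothing recreates the mounts.
sudo service docker stop
# 3. Remove the stale devicemapper devices reported by dmsetup.
sudo dmsetup ls | awk '/^docker-/ {print $1}' | xargs -r -n1 sudo dmsetup remove
# 4. Restart and remove the containers cleanly.
sudo service docker start
sudo docker rm $(sudo docker ps -a -q)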

crosbymichael commented 10 years ago

@lcarstensen no, it is currently in master. We did not want to make that change in a point release (i.e. not 0.9.1).

vincentwoo commented 10 years ago

On 0.9.1 I am still very regularly getting stuff in my docker log like:

[error] server.go:85 HTTP Error: statusCode=500 Cannot destroy container f014d1aa98d68a9c3d10e3f260f596e5248225898cee5812e007be7abb3eb512: Driver aufs failed to remove root filesystem f014d1aa98d68a9c3d10e3f260f596e5248225898cee5812e007be7abb3eb512: rename /var/lib/docker/aufs/mnt/f014d1aa98d68a9c3d10e3f260f596e5248225898cee5812e007be7abb3eb512 /var/lib/docker/aufs/mnt/f014d1aa98d68a9c3d10e3f260f596e5248225898cee5812e007be7abb3eb512-removing: device or resource busy

saidler commented 10 years ago

Any update on this? When should we expect this to be fixed? This is happening to me as well on Debian with docker v0.9.0.

mriehl commented 10 years ago

Hi @vincentwoo @saidler, if you'd care to read the comment from @crosbymichael, you'd see that it is expected to be fixed on current master (I don't know if it made it into 0.10.0). @crosbymichael I'm running docker from master and have not been able to replicate the issue since.

vincentwoo commented 10 years ago

@mriehl Thanks. Have you found hot-swapping the binary for a freshly built one, as described at http://docs.docker.io/en/latest/contributing/devenvironment/, to work stably on ubuntu?

mriehl commented 10 years ago

@vincentwoo Yeah, that's how I 'upgraded'

lcarstensen commented 10 years ago

So, I'm a little late with the testing I promised. With the simple script in the gist below, I tested 0.10.0 and a current master build, and compared them with 0.9.1 on a current RHEL 6.5 with devicemapper. I confirmed that in my environments 0.9.1 would always end up with containers that couldn't be removed, while 0.10.0 on a clean system (no devicemapper devices or mounts, clean /var/lib/docker) performed significantly better. Under load, on slow VMs or systems, or if I tighten up the timings much, I'm still seeing "Error running removeDevice" errors with this test.

https://gist.github.com/lcarstensen/10513578

So...0.10.0 with devicemapper is definitely better.
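
The gist itself isn't reproduced here, but a minimal loop in the same spirit (repeated create/remove to surface the removeDevice race; image and container names are illustrative) would be:

for i in $(seq 1 100); do
    docker run --name "stress-$i" busybox true
    docker rm "stress-$i" || echo "removal failed for stress-$i"
done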

mitchellh commented 10 years ago

Just wanted to add to the noise and say that I've been getting this with 0.10.0:

Client version: 0.10.0
Client API version: 1.10
Go version (client): go1.2.1
Git commit (client): dc9c28f
Server version: 0.10.0
Server API version: 1.10
Git commit (server): dc9c28f
Go version (server): go1.2.1
Last stable version: 0.10.0