moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

0.7.0 fails to remove containers #2714

Closed ndarilek closed 10 years ago

ndarilek commented 10 years ago

Script started on Fri 15 Nov 2013 04:28:56 PM UTC
root@thewordnerd:~# uname -a
Linux thewordnerd.info 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
root@thewordnerd:~# docker version
Client version: 0.7.0-rc5
Go version (client): go1.2rc4
Git commit (client): 0c38f86-dirty
Server version: 0.7.0-rc5
Git commit (server): 0c38f86-dirty
Go version (server): go1.2rc4
Last stable version: 0.6.6, please update docker
root@thewordnerd:~# docker rm `docker ps -a -q`
Error: Cannot destroy container ba8a9ec006c8: Driver devicemapper failed to remove root filesystem ba8a9ec006c8e38154bd697b3ab4810ddb5fe477ed1cfb48ac3bd604a5a59495: Error running removeDevice
Error: Cannot destroy container d2f56763e65a: Driver devicemapper failed to remove root filesystem d2f56763e65a66ffccb3137017dddad745e921f4bdaa084f6b4a0d6407ec030a: Error running removeDevice
Error: Cannot destroy container c22980febe50: Driver devicemapper failed to remove root filesystem ...

crosbymichael commented 10 years ago

Did you switch drivers from aufs to devicemapper manually without removing /var/lib/docker?

ndarilek commented 10 years ago

Not that I'm aware of. How would I find out?

PierreR commented 10 years ago

As a note I have had the exact same problem.

docker version
Client version: 0.7.0
Go version (client): go1.2rc5
Git commit (client): 0d078b6
Server version: 0.7.0
Git commit (server): 0d078b6
Go version (server): go1.2rc5
Last stable version: 0.7.0

I rebooted the host OS and the problem disappeared. It happened after a docker kill or docker stop (I don't remember which) on the container.

ghristov commented 10 years ago

I have the same problem, and it appears on docker kill as well as on docker stop. In my view the problem is that the container's filesystem stays mounted, and when deleting, the driver does not unmount it (whether that is the responsibility of rm or of kill/stop is another question).

Indeed, the problem is fixed after a restart because everything gets unmounted and nothing is left locked.

philips commented 10 years ago

I am encountering this with 0.7.1 also

philips commented 10 years ago

Hrm, and switching to the device mapper backend doesn't really help either. Got this just now:

Error: Cannot destroy container keystone-1: Driver devicemapper failed to remove root filesystem 1d42834e2e806e0fd0ab0351ae504ec9a98e0a74be337fc2158a516ec8d6f36b: Error running removeDevice

philips commented 10 years ago

@crosbymichael It seems like this isn't just about aufs. devicemapper is getting similar errors. https://github.com/dotcloud/docker/issues/2714#issuecomment-30481156

zhemao commented 10 years ago

I'm getting this still on 0.7.3 using devicemapper

Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3

However, the problem seems to resolve itself if you restart the docker server. If it happens again, I'll try running lsof on the mount to see what process is causing it to be busy.

Chris00 commented 10 years ago

I have the same problem.

$ docker version
Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                CREATED             STATUS              PORTS               NAMES
538ab4938d5d        3c23bb541f74        /bin/sh -c apt-get -   12 minutes ago      Exit 100                                agitated_einstein   
bdfbff084c4d        3c23bb541f74        /bin/sh -c apt-get u   14 minutes ago      Exit 0                                  sharp_torvalds      
95cea6012869        6c5a63de23d9        /bin/sh -c echo 'for   14 minutes ago      Exit 0                                  romantic_lovelace 
$  mount|grep 538ab4938d5d
/dev/mapper/docker-8:3-2569260-538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278 on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278 type ext4 (rw,relatime,discard,stripe=16,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerinit type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerenv type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/resolv.conf type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/hostname type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/root on /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/etc/hosts type ext4 (rw,relatime,errors=remount-ro,data=ordered)
# lsof /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/bdfbff084c4d96b6817eb7ccb812a608e4a6a45cb4c06d423e26364b45b59c97/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
lsof: WARNING: can't stat() ext4 file system /opt/docker/devicemapper/mnt/538ab4938d5d0f2e4ccb66b1410b57c8923fd7881551e365ffc612fe629ac278/rootfs/.dockerinit (deleted)
      Output information may be incomplete.
# ls -l /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit
-rwx------ 0 root root 14406593 Jan  4 21:05 /opt/docker/devicemapper/mnt/95cea6012869809320920019f2a2732165915281b79538a84f3ee3adddcbc783/rootfs/.dockerinit*

Chris00 commented 10 years ago

Restarting the daemon does not solve the problem.

limboy commented 10 years ago

same problem:

limboy@gintama:~$ docker ps -a 
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
a7760911ecac        ubuntu:12.04        bash                About an hour ago   Exit 137                                backstabbing_mccarthy   

limboy@gintama:~$ docker rm a77
Error: Cannot destroy container a77: Driver devicemapper failed to remove root filesystem a7760911ecacb93b1c530d6a0bde4deeb79ef0cbf901488cb55df2f2ca02207a: device or resource busy
2014/01/05 16:04:21 Error: failed to remove one or more containers

limboy@gintama:~$ docker info
Containers: 1
Images: 5
Driver: devicemapper
 Pool Name: docker-202:0-93718-pool
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 1079.8 Mb
 Data Space Total: 102400.0 Mb
 Metadata Space Used: 1.3 Mb
 Metadata Space Total: 2048.0 Mb
WARNING: No memory limit support
WARNING: No swap limit support

Restarting the host doesn't solve the problem.

Then, when I run docker run -i ubuntu bash, it doesn't go into interactive mode; the screen just stays blank.

ptmt commented 10 years ago

+1.

$ docker version
Client version: 0.7.3
Go version (client): go1.2
Git commit (client): 8502ad4
Server version: 0.7.3
Git commit (server): 8502ad4
Go version (server): go1.2
Last stable version: 0.7.3

$ docker rm d33
2014/01/07 05:55:57 DELETE /v1.8/containers/d33
[error] mount.go:11 [warning]: couldn't run auplink before unmount: exit status 116
[error] api.go:1062 Error: Cannot destroy container d33: Driver aufs failed to remove root filesystem d3312bcdeb7dc241d4
870100beadfe94d6884904229cc50d66aacd66ab16e064: stale NFS file handle
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container d33: Driver aufs failed to remove root filesystem
d3312bcdeb7dc241d4870100beadfe94d6884904229cc50d66aacd66ab16e064: stale NFS file handle
Error: Cannot destroy container d33: Driver aufs failed to remove root filesystem d3312bcdeb7dc241d4870100beadfe94d68849
04229cc50d66aacd66ab16e064: stale NFS file handle
2014/01/07 05:55:57 Error: failed to remove one or more containers

vjeantet commented 10 years ago

same here

Client version: 0.7.5
Go version (client): go1.2
Git commit (client): c348c04
Server version: 0.7.5
Git commit (server): c348c04
Go version (server): go1.2
Last stable version: 0.7.5

$docker rm 9f017e610f24
2014/01/11 23:03:11 DELETE /v1.8/containers/9f017e610f24
[error] api.go:1064 Error: Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
Error: Cannot destroy container 9f017e610f24: Driver devicemapper failed to remove root filesystem 9f017e610f2401541558a93b5c3beafc2e20586c766dfe49e521bcdf878ebe3a: device or resource busy
2014/01/11 23:03:11 Error: failed to remove one or more containers

LordFPL commented 10 years ago

Same problem here with 0.7.5. "Resolved" with a lazy umount:

for fs in $(cat /proc/mounts | grep '.dockerinit\040(deleted)' | awk '{print $2}' | sed 's/\/rootfs\/.dockerinit\040(deleted)//g'); do umount -l $fs; done

(or just run umount -l on the affected filesystem)

The real question is why some filesystems end up in the "/rootfs/.dockerinit\040(deleted)" state.
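
The same workaround spelled out step by step (a sketch only; the mount patterns come from the one-liner above and may need adjusting for a non-default docker root):

# list mount points whose .dockerinit entry shows up as "(deleted)" in /proc/mounts
grep dockerinit /proc/mounts | grep '(deleted)' | awk '{print $2}' |
  sed 's|/rootfs/.dockerinit.*||' |
  while read -r fs; do
    # lazy unmount: detach now, let the kernel clean up once nothing holds it busy
    umount -l "$fs"
  done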

joelmoss commented 10 years ago

I can confirm that this is an issue on 0.7.5

vjeantet commented 10 years ago

I don't know if it is related, but my Docker data was in /var/lib/docker, which was a symlink to /home/docker.

/home is a mount point.

Could containers' mount points sitting on a symlink to a mount be the cause? Since I told Docker to use /home/docker directly instead of /var/lib/docker, I haven't had this issue anymore.

LordFPL commented 10 years ago

I'm already using a different base directory. The problem may occur when the docker daemon is restarted without properly stopping the containers... something goes wrong somewhere in the stop/start sequence when docker restarts...

tianon commented 10 years ago

+1 I've got three containers on my devicemapper machine now that I can't remove because their devices fail to be removed in devicemapper (and none of them are even mounted in /proc/mounts)

Also, nothing in dmesg, and the only useful daemon output is highly cryptic and not very helpful:

[debug] deviceset.go:358 libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: remove ioctl on docker-8:3-43647873-f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31 failed: Device or resource busy
[debug] devmapper.go:495 [devmapper] removeDevice END
[debug] deviceset.go:574 Error removing device: Error running removeDevice
[error] api.go:1064 Error: Cannot destroy container f4985ed89768: Driver devicemapper failed to remove root filesystem f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31: Error running removeDevice
[error] api.go:87 HTTP Error: statusCode=500 Cannot destroy container f4985ed89768: Driver devicemapper failed to remove root filesystem f4985ed89768280bb537b88d9d779699c6858c45217742ea5a598d6db95abb31: Error running removeDevice

mriehl commented 10 years ago

+1 @vjeantet. Setting the docker base directory in /etc/default/docker instead of using a symlinked /var/lib/docker fixed these problems for me.
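
For reference, a sketch of that setup (the path is only an example; -g should point at the real, non-symlinked directory):

# /etc/default/docker
DOCKER_OPTS="-g /home/docker"

# then restart the docker daemon (service name depends on the package you installed)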

SamSaffron commented 10 years ago

+1, seen this as well; it's quite easy to repro. Recommending people only use aufs for now.

mikesimons commented 10 years ago

As a workaround I managed to successfully remove a container stuck in this fashion by renaming the offending DM device (using dmsetup rename), executing dmsetup wipe_table <stuck_id>, restarting docker and re-running docker rm.

You need to use the full DM id of the device, which is at the end of the error (e.g. docker-8:9-7880790-bc945261c1f97e7145604a4248e2c84535fb204c8e214fa394448e0b2dcd064a).

The stuck device also disappeared on reboot.

This was achieved after much messing about with dmsetup so it's plausible something I did in between was also required. YMMV but it worked for me.

Edit: Needed to restart docker and run wipe_table too
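
Roughly, that sequence looks like the following (device name taken from the error message above; treat this as a sketch rather than a verified recipe):

# full DM name from the end of the "failed to remove root filesystem" error
stuck=docker-8:9-7880790-bc945261c1f97e7145604a4248e2c84535fb204c8e214fa394448e0b2dcd064a

dmsetup rename "$stuck" "${stuck}-stuck"    # move the offending device out of the way
dmsetup wipe_table "${stuck}-stuck"         # replace its table so nothing keeps it busy
# restart the docker daemon, then retry the removal
docker rm <container>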

lgs commented 10 years ago

... same problem with Docker version 0.7.6, build bc3b2ec

lsoave@basenode:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
53a9a8c4e29c        8dbd9e392a96        bash                17 minutes ago      Exit 0                                  thirsty_davinci     
lsoave@basenode:~$ docker rm 53a9a8c4e29c
Error: Cannot destroy container 53a9a8c4e29c: Driver aufs failed to remove root filesystem 53a9a8c4e29c2c99fdd8d5355833f07eca69cbfbefcd02915e267517111fbde8: device or resource busy
2014/01/19 20:38:50 Error: failed to remove one or more containers
lsoave@basenode:~$ 

By rebooting the host and running docker rm 53a9a8c4e29c again, it works. My env:

lsoave@basenode:~$ uname -a
Linux basenode 3.11.0-15-generic #23-Ubuntu SMP Mon Dec 9 18:17:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
lsoave@basenode:~$ docker -v
Docker version 0.7.6, build bc3b2ec

mikesimons commented 10 years ago

Happened again today; the machine went into suspend with docker containers running but did not come out of suspend cleanly. Needed a reboot.

Upon reboot the DM device for one of the containers that was running was stuck.

> uname -a
Linux mv 3.9.9-1-ARCH #1 SMP PREEMPT Wed Jul 3 22:45:16 CEST 2013 x86_64 GNU/Linux

Running docker 0.7.4 build 010d74e

lgs commented 10 years ago

@mikesimons ... do you remember the sequence of operations that brought you to the failure?

kklepper commented 10 years ago

Solved

At least in my case -- look for yourself if you don't have the same cause.

I had created a MySQL server container myself, which worked fine. As I was puzzled about the size of the containers, I decided to create new MySQL containers based on the work of somebody else.

This was indeed very interesting, as I found that size may differ substantially even when the Dockerfile looks similar or even identical. For example, my first is nearly 700 MB:

kklepper/Ms           latest              33280c9a70a7        5 days ago          695.7 MB

The container based on dhrp/mysql is nearly half the size of mine, and it works equally well:

kklepper/mysqld       latest              49223549bf47        24 hours ago        359.8 MB

The 2nd example produced the above-mentioned error; I'll get to that in a second.

When I tried to repeat my findings today, the image came out quite a bit larger with exactly the same Dockerfile, seemingly without reason:

kklepper/mysqlda      latest              6162b0c95e8c        2 hours ago         374.4 MB

It was no problem to remove this container as well.

The next example introduced the problem, based on the 2nd result of my search https://index.docker.io/search?q=mysql: brice/mysql

As I had enhanced his approach, I couldn't see right away where the problem was, but diligent tracking finally showed that in this case the culprit was the command

VOLUME ["/var/lib/mysql", "/var/log/mysql"]

in the Dockerfile, which I had given no thought to.

Both directories exist in the container:

root@mysql:/# ls /var/lib/mysql
debian-5.5.flag  ib_logfile0  ib_logfile1  ibdata1  mysql  performance_schema  test  voxx_biz_db1
root@mysql:/# ls /var/log/mysql
error.log

But not on the host:

vagrant@precise64:~$ ls /var/lib/mysql
ls: cannot access /var/lib/mysql: No such file or directory
vagrant@precise64:~$ ls /var/log/mysql
ls: cannot access /var/log/mysql: No such file or directory

The VOLUME directive ties the volume of the container to the corresponding volume of the host (or rather the other way around).

Docker should throw an error if the directory does not exist on the host; by design it will "create" the directory in the container if it does not exist.

Unfortunately I'm not able to write a patch, but I'm sure many of you can.

pwaller commented 10 years ago

@kklepper, I'm misunderstanding the relationship between the issue and your post. From what I read your "issue" was that you overlooked the behaviour of the VOLUME directive, but the issue at hand is that docker rm won't actually remove a stopped container in some circumstances, so I don't see in any sense how this issue is solved?

kklepper commented 10 years ago

Sorry for the confusion, I should have clarified that the VOLUME error caused the docker rm error, exactly as reported above. I found this thread because I searched for exactly this error message. Obviously nobody was able to track the conditions down yet.

lgs commented 10 years ago

@kklepper thanks for detailed report.

Can you post here the Dockerfile which produces the fault in question, please?

I was looking for you on the public index but no kklepper user was found there, so I have no way to reproduce your containers:

kklepper/Ms           latest              33280c9a70a7        5 days ago          695.7 MB
kklepper/mysqld       latest              49223549bf47        24 hours ago        359.8 MB
kklepper/mysqlda      latest              6162b0c95e8c        2 hours ago         374.4 MB

I'd like to test what you're saying myself, because in my understanding of Docker's VOLUME it shouldn't need the same path on the host side. They should just be mount points.

Moreover, unfortunately, on my side I can't remember what I did to get to the point where I received the Cannot destroy container ... error I mentioned before.

That's why I was asking @mikesimons for the steps he went through.

kklepper commented 10 years ago

@lgs Well, here it goes -- you will see some experimentation along the way; actually none of this has anything to do with the problem the thread started with, except the aside I included at the end.

Ms relies on scripts which manipulate MySQL and is derived from Supervisor which in turn is derived from quintenk/supervisor which installs Supervisor -- I just manipulate time information.

I first invoked Supervisor because at the time it seemed to be the only way for me to start Apache as a background container. Later, I found that I could do without it, so it is no longer needed, but for completeness' sake I show it here anyway.

Supervisor can restart services automatically; after having had some experiences with this feature, I am not all that sure that this is a good idea.

Supervisor

FROM quintenk/supervisor
# 333 (original)/401 MB (mine)

MAINTAINER kklepper <gmcgmx@googlemail.com>
# Update the APT cache
RUN sed -i.bak 's/main$/main universe/' /etc/apt/sources.list
RUN apt-get update
RUN apt-get -y upgrade

# set timezone
RUN echo "Europe/Berlin" |  tee /etc/timezone
RUN dpkg-reconfigure --frontend noninteractive tzdata

WORKDIR /var/www

Ms

Here you see pwgen for automatic password generation; I certainly learned something, but don't know if this is a good idea as such either.

Disabling ENTRYPOINT shows that this same Dockerfile can be used to produce a container running in the background as well as a container with a shell for detailed inspection.

FROM kklepper/Supervisor
# 695 MB

MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN apt-get update
RUN apt-get -y upgrade

RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -s /bin/true /sbin/initctl

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install mysql-server pwgen

ADD ./start.sh /start.sh
ADD ./supervisord.conf /etc/supervisor/conf.d/supervisord.conf

RUN chmod 755 /start.sh

EXPOSE 3306

CMD ["/bin/bash", "/start.sh", "&"]

#ENTRYPOINT ["/start.sh", "&"]

mysqld

Here the network setting in my.cnf is manipulated more intelligently than in my start.sh script. Also, the command used to invoke the MySQL server is different. I use the mysqld_safe script out of habit without having taken a close look at it. My application runs just as well with this version, saving about 300 MB.

FROM dhrp/mysql
# 360 MB
MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install pwgen

RUN sed -i -e 's/127.0.0.1/0.0.0.0/' /etc/mysql/my.cnf
# to make sure we can connect from the network

ADD ./start.sh /start.sh
RUN chmod 755 /start.sh
RUN /start.sh
# will set privileges

EXPOSE 3306

CMD ["sh", "-c", "mysqld"]

mysqlda

This should be exactly the same, but the image comes out larger nevertheless.

FROM dhrp/mysql
# 374 MB
MAINTAINER kklepper <gmcgmx@googlemail.com>

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install pwgen

RUN sed -i -e 's/127.0.0.1/0.0.0.0/' /etc/mysql/my.cnf
# to make sure we can connect from the network

ADD ./start.sh /start.sh
RUN chmod 755 /start.sh
RUN /start.sh
# will set privileges

EXPOSE 3306

CMD ["sh", "-c", "mysqld"]

start.sh

Mostly taken from somewhere else, advanced from there, for example with debugging information.

As you can see, I make sure that I see immediately what is going on when I enter the container with a shell; otherwise I can consult the log.

In my application, I connect via reader to do some tests.

#!/bin/bash

if [ ! -f /mysql-del.sql ]; then
    sed -i 's/bind-address/#bind-address/'  /etc/mysql/my.cnf
    # to be able to connect from the network

    /usr/bin/mysqld_safe &
    # start the server

    sleep 10s
    # give it some time

    MYSQL_PASSWORD=`pwgen -c -n -1 12`
    # generate random password

    echo -------------------
    echo mysql root password: $MYSQL_PASSWORD
    echo -------------------
    echo $MYSQL_PASSWORD > /mysql-root-pw.txt
    mysqladmin -uroot password $MYSQL_PASSWORD
    # use mysqladmin to set the password for root
    echo "14------------------- mysqladmin -uroot password $MYSQL_PASSWORD"

    PRIV="GRANT ALL PRIVILEGES ON *.* TO  'root'@'172.17.%' IDENTIFIED BY '$MYSQL_PASSWORD';"
    echo $PRIV > /mysql-grant.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-grant.sql
    # make sure you can connect from within our network
    echo 21------------------- $PRIV

    PRIV="DELETE FROM mysql.user WHERE password = '';FLUSH PRIVILEGES;"
    echo $PRIV > /mysql-del.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-del.sql
    # get rid of users without password (this is not really necessary, as we are in a safe box anyway)
    echo 26------------------- $PRIV

    PRIV="SELECT user, host, password FROM mysql.user;"
    echo $PRIV > /mysql-test.sql
    echo 30------------------- $PRIV
    echo ===============================================================================
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-test.sql
    # let's see if everything worked fine so far, just to test our approach
    echo ===============================================================================

    VX_PASSWORD=thisisnotnice
    PRIV="GRANT SELECT ON db1.* TO  'reader'@'172.17.%' IDENTIFIED BY '$VX_PASSWORD';"
    echo $PRIV > /mysql-reader.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-reader.sql
    # we will use this user for reading
    echo 39------------------- $PRIV

    CRUD_PASSWORD=somethingelse
    PRIV="GRANT INSERT, SELECT, UPDATE, DELETE ON voxx_biz_db1.* TO  'crud'@'172.17.%' IDENTIFIED BY '$CRUD_PASSWORD';"
    echo $PRIV > /mysql-crud.sql
    mysql -uroot -p$MYSQL_PASSWORD < /mysql-crud.sql
    # we might need to use it for manipulation
    echo 45------------------- $PRIV

    killall mysqld
    sleep 10s

fi

mysqld_safe &
# reload privileges

different approach -- as an aside

While searching for the offending Dockerfile, I stumbled across somebody who used that same container, but was wise enough to delete that line with the VOLUME instruction, although not giving an explanation for this decision -- we can safely assume that he stumbled into the same problem as all of us and found the solution for himself: https://www.google.de/search?q=%22MAINTAINER+Brandon+Rice%22+mysql

He had a different problem which I am inspecting at the moment. For this kind of testing, I use his approach, which has a lot of appeal to me (I wasn't aware of the -e option):

mysql -e "\
UPDATE mysql.user SET password = password('thisismypassword') WHERE user = 'root';\
FLUSH PRIVILEGES;\
DELETE FROM mysql.user WHERE password = '';\
FLUSH PRIVILEGES;\
GRANT ALL ON *.* to 'root'@'172.17.%' IDENTIFIED BY 'thisismypassword'; \
GRANT SELECT ON voxx_biz_db1.* TO  'reader'@'172.17.%' IDENTIFIED BY 'thisisnotnice';\
GRANT INSERT, SELECT, UPDATE, DELETE ON voxx_biz_db1.* TO  'crud'@'172.17.%' IDENTIFIED BY 'somethingelse';\
FLUSH PRIVILEGES;\
"
# a lot of instructions at once, nice to read

The problem he had came with a DROP DATABASE command.

ERROR 6 (HY000) at line 1: Error on delete of './my_db//db.opt' (Errcode: 1)

I can confirm that I get a similar error both on the Dockerfile and manually in the MySQL client:

ERROR 1010 (HY000): Error dropping database (can't rmdir './test', errno: 1)

Now what does that mean? Our MySQL-container tells us:

root@mysql:/# perror 6
OS error code   6:  No such device or address

root@mysql:/# perror 1010
MySQL error code 1010 (ER_DB_DROP_RMDIR): Error dropping database (can't rmdir '%-.192s', errno: %d)

Unfortunately, he doesn't tell us anything about the database he is trying to drop. My problem, upon which I stumbled by chance here, obviously is different.

I just tested the hypothesis that my problem might stem from using a volume from another container, so I dropped this link to my fully populated database, but the error is the same.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

mysql> drop database test;
ERROR 1010 (HY000): Error dropping database (can't rmdir './test', errno: 1)
mysql> create database test2;
Query OK, 1 row affected (0.02 sec)

mysql> drop database test2;
Query OK, 0 rows affected (0.00 sec)

Now this is funny. Obviously I have no problem dropping freshly created databases. I remember from earlier times that it was no problem to drop the installation's standard empty database test -- this was one of the first measures to take.

It would be interesting to test if I could drop a linked-in database, but I will postpone that for later -- I should make sure that I have a backup of my living database if that test succeeds.

MySQL versus memcached

For my little web-application test, I query a simple key-value-table with a MySQL-compressed value of about 70 kB, uncompressed 680 kB in my app, called from XP/Firefox, and compare this with a memcached database in another container, also linked to the Apache/PHP-container, all residing in a Vagrant/Virtualbox. The additional compression for memcached is done with PHP. I don't know yet if memcached can do that on its own (yes, it can: Memcached::OPT_COMPRESSION).

Interestingly, when I built this Apache/PHP container about a week ago, I could apt-get php5-memcached, but when I wanted to do another test with my Apache/PHP setup yesterday, Ubuntu couldn't find that package anymore -- in fact I (or Google) couldn't find it anywhere. How come?

Certainly I wasn't dreaming; the line

RUN DEBIAN_FRONTEND=noninteractive apt-get -y install apache2 libapache2-mod-php5 python-setuptools nano php5-mysql php5-memcached

worked out with no problems at all last week.

~~MySQL have_query_cache: YES~~
~~Did mysql_query in 36.43 milliseconds~~
~~Did memcached in 1.25 milliseconds~~
~~Result: memcached appr. 30 times faster: 35.18 milliseconds saved: 36.43 :: 1.25~~

~~Did Memcached zipped in 0.76 milliseconds~~
~~Result: memcached zipped / mysql appr. 48 times faster: 35.67 milliseconds saved: 36.43 :: 0.76~~
~~Result: memcached / memcached zipped appr. 2 times faster: 0.49 milliseconds saved: 1.25:: 0.76~~

Sorry, had an error in my code. This is correct:

have_query_cache: YES
Did mysql_query in 40.69 milliseconds
Did memcached in 13.95 milliseconds
Result: memcached 2.92 times faster: 26.74 milliseconds saved: 40.69 :: 13.95

Did Memcached zipped in 10.27 milliseconds
Result: memcached zipped mysql 3.96 times faster: 30.42 milliseconds saved: 40.69 :: 10.27
Result: memcached/zipped 1.36 times faster: 3.68 milliseconds saved: 13.95 :: 10.27

I never did a test in this direction before and was surprised about the result.

My setup with 3 running and 2 data containers looks very promising.

I plan to look at other databases, set up a MySQL replication scheme, a load-balancing scheme with several Apache containers, and after that look into the problem of contacting containers residing in other machines, virtual or not.

Docker definitely rocks.

mikesimons commented 10 years ago

@lgs I can 100% reproduce on my machine simply by starting a container (all containers I've tried have exhibited this behaviour thus far) and forcing a shutdown:

joelmoss commented 10 years ago

FYI, I get no such issues with <= 0.7.2

kklepper commented 10 years ago

Ok, it boils down to this:

Dockerfile

#FROM busybox 
# ok
FROM ubuntu
# Error: Cannot destroy container test: Driver aufs failed to remove init filesystem f27ab92e572681b81aecd30b6b03a67613c092055a8ef973f9e5450e941afbba-init: invalid argument

VOLUME ["/var/lib/not_exit"]

busybox

Shell 1:

vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker build -t kklepper/test .
vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker run -i -t -rm -h test -name test kklepper/test ash

Shell 2:

vagrant@precise64:~$ id=test
vagrant@precise64:~$ docker stop $id && docker rm $id
test
test

ubuntu

Shell 1:

vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker build -t kklepper/test .
vagrant@precise64:/vagrant/docker-lampstack-master/test$ docker run -i -t -rm -h test -name test kklepper/test ash

Shell 2:

vagrant@precise64:~$ docker stop $id && docker rm $id
test
Error: Cannot destroy container test: Driver aufs failed to remove init filesystem f27ab92e572681b81aecd30b6b03a67613c092055a8ef973f9e5450e941afbba-init: invalid argument
2014/01/22 10:10:37 Error: failed to remove one or more containers

The trick with the boilerplate from Brandon was probably that he had a MySQL database sitting in the volume he used, whereas other people, like me, might not. As you usually want to use a volume you already have, this problem shouldn't occur too often.

lgs commented 10 years ago

... it seems this issue has many facets:

https://github.com/search?q=%22remove+containers%22+docker&type=Issues&ref=searchresults

inthecloud247 commented 10 years ago

Seeing the same issue with 0.7.6 on Ubuntu 13.10.

I'm running this on a disk with full-disk encryption (ecryptfs) enabled under Ubuntu. Maybe that's the issue here?

sameersbn commented 10 years ago

@ydavid365 I don't think it's got anything to do with ecryptfs. I have been seeing this issue since 0.7.3 on Ubuntu 13.10 as well as 12.04.

lgs commented 10 years ago

@ydavid365 the majority of reported cases are not on encrypted environments, so I would rule out ecryptfs as the cause here ...

abelaska commented 10 years ago

+1

inthecloud247 commented 10 years ago

I think it has to do with bind mounts from disks that are mounted in a specific way (some kind of symlink or reference issue). Full-disk encrypted disks have a similar issue. When I bind to a folder not on the disk with full-disk encryption enabled, it cleans up properly when I delete the container.
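
A quick way to check whether a host is in this kind of setup (a sketch, assuming the default paths):

# is /var/lib/docker a symlink, and where does it really point?
readlink -f /var/lib/docker

# which filesystem/device actually backs it (separate mount, encrypted volume, ...)?
df -hT "$(readlink -f /var/lib/docker)"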


ndarilek commented 10 years ago

I have a reliable way to duplicate this.

First, check out this Git repository. You'll need Vagrant and Ansible installed. Once done, run vagrant up. This will spin up an Ubuntu 13.10 VM. Ideally it should also start Skydns/Skydock, but it doesn't, probably because I don't fully understand Upstart. But I reliably fail to docker rm these containers due to devicemapper issues. I've destroyed and recreated this VM many times with the same results.

Note that sometimes "vagrant up" fails, and I have to run "vagrant provision" an extra time or two. This seems like it may be a race condition or network flakiness on my end, not an Ansible/Vagrant/Docker failure.

Hope this helps chase down the problem. This issue is annoying.

spez commented 10 years ago

I'm pretty new to docker, but have been wrestling with this issue using the new docker 0.8.0 + boot2docker + os x combo.

In the out of the box setup, docker's root path is /var/lib/docker, which is a symlink to /mnt/sda1/var/lib/docker

Running docker with -g /mnt/sda1/var/lib/docker solved the issue for me.
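
In other words, something along these lines (daemon invocation as it was around 0.8; flags may differ in later versions):

# stop the running daemon, then start it pointing at the real directory
# instead of the /var/lib/docker symlink
docker -d -g /mnt/sda1/var/lib/docker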

inthecloud247 commented 10 years ago

Whoah good tip!

Yeah, I had this problem on an encrypted dm-crypt volume. I solved it by creating a zfs volume and mounting that at /var/lib/docker instead. Had to modify the docker startup sequence a bit to get it to work on reboot, but it works fine now.


johnae commented 10 years ago

Oh man, on 0.8.0 my troubles are back. I used to be able to run:

docker ps -a -q | xargs docker rm

But not anymore, now I just get:

Error: container_delete: Cannot destroy container 0e769092d641: Driver devicemapper failed to remove root filesystem 0e769092d6413cfc9c426dc20d006d1451b249db3d5a2e43ce4ff53206c59306: hash 0e769092d6413cfc9c426dc20d006d1451b249db3d5a2e43ce4ff53206c59306 doesn't exists
2014/02/07 18:23:05 Error: failed to remove one or more containers

inthecloud247 commented 10 years ago

Is that for all volumes, or only ones created pre-0.8.0?


johnae commented 10 years ago

This is in a vagrant vm just created with 0.8.0 installed on provisioning. The image was downloaded after installing 0.8.0 and run after installing 0.8.0.

johnae commented 10 years ago

If I try running more containers from the same image (which fail to boot on 0.8.0 for some reason), I end up in the same situation with all of them - can't remove stopped containers.

inthecloud247 commented 10 years ago

now I'm scared to upgrade to 0.8.0 :-(

anyone else seeing this?


peter1000 commented 10 years ago

0.8.0, same problem on Ubuntu 13.10.

I used a symlink for /var/lib/docker. I removed that and changed /etc/default/docker:

DOCKER_OPTS="-g /home/docker"

It seems to work.

QUESTION: does anyone know if /etc/udev/rules.d/80-docker.rules has to be adjusted, since there is one entry which refers to /var/lib/docker/: ATTR{loop/backing_file}=="/var/lib/docker/*"

# hide docker's loopback devices from udisks, and thus from user desktops
SUBSYSTEM=="block", ENV{DM_NAME}=="docker-*", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_IGNORE}="1"
SUBSYSTEM=="block", DEVPATH=="/devices/virtual/block/loop*", ATTR{loop/backing_file}=="/var/lib/docker/*", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_IGNORE}="1"

Just asking because I only installed Ubuntu 13.10 to test Docker.
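
(Not confirmed anywhere in this thread, but presumably the path glob in that rule would simply follow the new root, something like:)

SUBSYSTEM=="block", DEVPATH=="/devices/virtual/block/loop*", ATTR{loop/backing_file}=="/home/docker/*", ENV{UDISKS_PRESENTATION_HIDE}="1", ENV{UDISKS_IGNORE}="1"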

inthecloud247 commented 10 years ago

Awesome! So it does seem to be narrowing down to an issue with symlinks.

It may be helpful to start mentioning in bug reports what file system is being used, so I'll start:

DID_NOT_WORK:

WORKED:


johnae commented 10 years ago

So, I just booted a fresh vagrant ubuntu 13.10, installed docker 0.8.0 and pulled stackbrew/ubuntu:13.10.

Seems the problem is the same with this image, here's how to reproduce:

docker pull stackbrew/ubuntu:13.10
docker run -t -i stackbrew/ubuntu:13.10 bash
exit
docker ps -a
CONTAINER ID        IMAGE                            COMMAND             CREATED              STATUS              PORTS               NAMES
2f6bfa064dca        stackbrew/ubuntu:13.10           bash                4 seconds ago        Exit 0                                  kickass_thompson

docker rm 2f6bfa064dca
Error: container_delete: Cannot destroy container 2f6bfa064dca: Driver devicemapper failed to remove root filesystem 2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef: Error running removeDevice
2014/02/08 15:47:53 Error: failed to remove one or more containers

and by running the daemon in debug mode I see this in the logs:

[debug] api.go:933 Calling DELETE /containers/{name:.*}
2014/02/08 15:47:52 DELETE /v1.9/containers/2f6bfa064dca
[/var/lib/docker|dc00726f] +job container_delete(2f6bfa064dca)
[debug] deviceset.go:212 activateDeviceIfNeeded(2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef)
[debug] devmapper.go:509 [devmapper] removeDevice START
[debug] deviceset.go:358 libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: remove ioctl on docker-8:1-264222-2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef failed: Device or resource busy
[debug] devmapper.go:519 [devmapper] removeDevice END
[debug] deviceset.go:583 Error removing device: Error running removeDevice

Cannot destroy container 2f6bfa064dca: Driver devicemapper failed to remove root filesystem 2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef: Error running removeDevice[/var/lib/docker|dc00726f] -job container_delete(2f6bfa064dca) = ERR (1)
[error] api.go:959 Error: container_delete: Cannot destroy container 2f6bfa064dca: Driver devicemapper failed to remove root filesystem 2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef: Error running removeDevice
[error] api.go:91 HTTP Error: statusCode=500 container_delete: Cannot destroy container 2f6bfa064dca: Driver devicemapper failed to remove root filesystem 2f6bfa064dca885f3d1896642bb964cda7caf1ab9d40a1a18869456b0900b9ef: Error running removeDevice

ghost commented 10 years ago

@johnae I can confirm this; I have exactly the same problem.