lightbend / mesos-spark-integration-tests

Mesos Integration Tests on Docker/Ec2

Error response from daemon: Driver aufs failed to remove root filesystem #34

Closed skonto closed 8 years ago

skonto commented 8 years ago

Removing Docker containers sometimes reports this error:

Stopping and removing master container(s)...
Error response from daemon: Driver aufs failed to remove root filesystem 9ee91a4feb07e427cefca8f1be9bee36c337e9d7284c7904c42e4dcb6a762c7b: rename /var/lib/docker/aufs/mnt/9ee91a4feb07e427cefca8f1be9bee36c337e9d7284c7904c42e4dcb6a762c7b /var/lib/docker/aufs/mnt/9ee91a4feb07e427cefca8f1be9bee36c337e9d7284c7904c42e4dcb6a762c7b-removing: device or resource busy
Error: failed to remove containers: [9ee91a4feb07]

See: https://ci.typesafe.com/job/spark-mesos-integration-tests-docker-nightly/17/console

It is related to aufs. It is not clear whether it is caused by volume loading or something else: https://github.com/docker/docker/issues/9665

It could be solved by either:

  1. kernel upgrade
  2. changing the Docker storage backend, e.g. to devicemapper (https://docs.docker.com/engine/userguide/storagedriver/selectadriver/)

I would try https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/

"Device Mapper has been included in the mainline Linux kernel since version 2.6.9. It is a core part of RHEL family of Linux distributions. This means that the devicemapper storage driver is based on stable code that has a lot of real-world production deployments and strong community support."

The storage backend also affects speed: this choice may speed things up a bit and reduce the disk bottleneck when tests run. It has advantages over aufs there. See how it compares with others: http://stackoverflow.com/questions/24736778/why-is-the-docker-vfs-storage-backend-not-considered-suitable-for-production

https://docs.docker.com/engine/userguide/storagedriver/btrfs-driver/

"However, at the time of writing, the devicemapper storage driver should be considered safer, more stable, and more production ready. You should only consider the btrfs driver for production deployments if you understand it well and have existing experience with Btrfs."
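Before changing the backend it helps to confirm which storage driver the daemon is actually using. A minimal sketch, assuming `docker info` prints a `Storage Driver:` line (the helper name `storage_driver` is hypothetical, not part of this repository):

```shell
# Hypothetical helper: read `docker info`-style text on stdin and print
# the name of the storage driver (e.g. "aufs" or "devicemapper").
storage_driver() {
  # Keep only the value of the first "Storage Driver:" line,
  # tolerating leading indentation.
  sed -n 's/^ *Storage Driver: *//p' | head -n 1
}
```

Usage would be `docker info 2>/dev/null | storage_driver`. If it prints `aufs`, the daemon can be started with `--storage-driver=devicemapper` to try the other backend. Images and containers created under the old driver will not be visible under the new one.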

skonto commented 8 years ago

The tests now run without the Docker container clean-up error.

https://ci.typesafe.com/job/spark-mesos-integration-tests-docker-nightly/23/console
https://ci.typesafe.com/job/spark-mesos-integration-tests-docker-nightly/24/console

Upgraded the kernel (now 3.19.0-39-generic, from 3.13) and built an image, which is the faster solution:

sudo apt-get install linux-image-generic-lts-vivid linux-headers-generic-lts-vivid

http://askubuntu.com/questions/598483/how-can-i-use-kernel-3-19-in-14-04-now

It would be interesting to evaluate the devicemapper vs aufs case...
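To check whether a build host already runs a kernel at least as new as the one that worked here, something like the following could be used. This is a sketch: the function name `kernel_at_least` is hypothetical, and the 3.19 threshold is only what worked in this thread, not a documented minimum.

```shell
# Hypothetical helper: succeed when kernel release $1 (e.g. the output of
# `uname -r`) is at least the major.minor version given in $2 (e.g. "3.19").
kernel_at_least() {
  release=${1%%-*}        # "3.19.0-39-generic" -> "3.19.0"
  major=${release%%.*}    # "3"
  rest=${release#*.}      # "19.0"
  minor=${rest%%.*}       # "19"
  want_major=${2%%.*}
  want_minor=${2#*.}
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

For example, `kernel_at_least "$(uname -r)" 3.19 || echo "kernel older than 3.19"` flags a host that is still on the 3.13 kernel.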

MrCoder commented 8 years ago

Getting the same error with devicemapper.

webdev2:~ # docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:
 OS/Arch:      linux/amd64
webdev2:~ # docker ps -a
CONTAINER ID        IMAGE                       COMMAND             CREATED             STATUS              PORTS               NAMES
6efab667b88d        openshift/origin-pod:v1.1   "/pod"              7 days ago          Dead
webdev2:~ # docker rm 6ef
Error response from daemon: Cannot destroy container 6ef: Driver devicemapper failed to remove root filesystem 6efab667b88d8aa1ad4e4a89582f3d21a4f4f6fba2dc9df018eb711c34df1ed6: Device is Busy
Error: failed to remove containers: [6ef]
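One way to investigate a "busy" root filesystem is to look for processes whose mount table still references the container's mount point. The sketch below assumes the mount path contains the container ID, as in the aufs path quoted earlier in this thread. The function name `pids_holding_mount` is hypothetical, and whether a leaked mount is the cause in this particular case is not established.

```shell
# Hypothetical helper: print the PIDs whose mountinfo mentions a container ID.
# $1: container ID (full ID or a prefix of it)
# $2: proc root to scan (defaults to /proc; overridable for testing)
pids_holding_mount() {
  id=$1
  proc_root=${2:-/proc}
  # List mountinfo files that mention the ID, then reduce each path to its PID.
  grep -l -- "$id" "$proc_root"/[0-9]*/mountinfo 2>/dev/null |
    sed 's|.*/\([0-9][0-9]*\)/mountinfo$|\1|'
}
```

Running `pids_holding_mount 6efab667b88d` as root lists candidate processes. Each PID can then be inspected with `ps -p <pid>` before deciding whether to restart it.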