apache / openwhisk

Apache OpenWhisk is an open source serverless cloud platform
https://openwhisk.apache.org/
Apache License 2.0

invoker mounts /var/lib/docker/containers - prevents docker rm from working correctly #199

Closed by domdom82 7 years ago

domdom82 commented 8 years ago

When the invoker is launched, it mounts a volume as

/var/lib/docker/containers:/containers

which is the directory where Docker keeps the filesystems of all its containers.

I presume this is used for the invoker to access the filesystems of launched actions.

However, when trying to remove other containers (like registrator, consul, etc.), you get the following error message:

Docker API Error: Unable to remove filesystem for 57f62f43c6280934c44be469f33709498ac82dd6c960ea5a2bf02ac924db502e: remove /var/lib/docker/containers/57f62f43c6280934c44be469f33709498ac82dd6c960ea5a2bf02ac924db502e/mqueue: device or resource busy"

domdom82 commented 8 years ago

Recent discussion has brought up the problems with fetching logs from user containers, which makes this a non-trivial issue.

Current ideas for mitigation would be:

dfederschmidt commented 8 years ago

Some thoughts on this from a discussion with @domdom82.

This should prevent the error observed. As a step forward on this issue, I will try to reproduce the error in a minimal working example as a playground for experimenting with different approaches to make fast log extraction possible without mounting /var/lib/docker/containers.
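
For context on what fast log extraction has to read: assuming the user containers use Docker's default `json-file` logging driver, each container's log is a file of newline-delimited JSON objects with `log`, `stream`, and `time` keys. Below is a minimal sketch of parsing such lines. The function name `parse_log_lines` is hypothetical and this is not the invoker's actual code.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_log_lines(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (time, stream, message) tuples from json-file driver log lines.

    Assumption: each line is one JSON object with "log", "stream" and
    "time" keys, as written by Docker's json-file logging driver.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            # The last line may be only partially written while the
            # container is still running; skip it rather than fail.
            continue
        message = entry.get("log", "").rstrip("\n")
        yield entry.get("time", ""), entry.get("stream", ""), message
```

Any approach that avoids the mount would still need to deliver these same three fields to the invoker; only the transport changes.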

perryibm commented 8 years ago

Some thoughts on this:

  1. Why doesn't the error occur for the many other action containers which are routinely removed from the system?
  2. An external agent can overcome the problem, but it will complicate our container-only principle and potentially the deployment as well. The agent itself will need to be restarted and coordinated with the invoker, since part of the logic is essentially being moved into another component.
  3. If we separate out blackbox containers, it is feasible to avoid logging through Docker by having the action container push out its logs by some other means. This is of course a lot more work, but it might give us much more control. Splitting off blackbox also means we are committed to multiple implementation strategies.

dfederschmidt commented 8 years ago

I was able to reproduce the issue using the following sequence of actions:

Our issue seems to be related to https://github.com/docker/docker/issues/17823 and https://github.com/docker/docker/issues/17902, where the host's /var/lib/docker/containers is also added as a data volume to a container and Docker 1.9 is also used.
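
One way to check whether a process still holds mounts below that directory is to scan its `/proc/<pid>/mountinfo`. The sketch below is a hedged diagnostic aid, not something taken from this thread: the helper name `mounts_under` is hypothetical. It relies only on the documented mountinfo layout, where the mount point is the fifth field and the filesystem type follows the ` - ` separator.

```python
from typing import List, Tuple

# Host path from the error message in this issue.
CONTAINERS_DIR = "/var/lib/docker/containers"


def mounts_under(mountinfo_text: str, prefix: str = CONTAINERS_DIR) -> List[Tuple[str, str]]:
    """Return (mount_point, fs_type) for every mount at or below `prefix`.

    `mountinfo_text` is the content of a /proc/<pid>/mountinfo file.
    """
    found = []
    for line in mountinfo_text.splitlines():
        left, sep, right = line.partition(" - ")
        if not sep:
            continue  # not a well-formed mountinfo line
        left_fields = left.split()
        right_fields = right.split()
        if len(left_fields) < 5 or not right_fields:
            continue
        mount_point = left_fields[4]
        fs_type = right_fields[0]
        if mount_point == prefix or mount_point.startswith(prefix + "/"):
            found.append((mount_point, fs_type))
    return found
```

On the host, this could be called with the content of `/proc/self/mountinfo`. For the invoker's process, the prefix would presumably be `/containers`, the mount target quoted at the top of this issue.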

Next steps:

csantanapr commented 7 years ago

@domdom82 Do you know if this is still an issue with recent changes and Docker 1.12?

markusthoemmes commented 7 years ago

@domdom82 is this still applicable?

domdom82 commented 7 years ago

@markusthoemmes nope, working now.