cloudfoundry / diego-notes

Diego Notes
Apache License 2.0

Docker Registry caching #23

Closed hsiliev closed 7 years ago

hsiliev commented 9 years ago

Caching of images:

https://github.com/pivotal-cf-experimental/diego-dev-notes/blob/master/proposals/docker_registry_caching.md

onsi commented 9 years ago

I have a number of comments/questions on this:

  1. I don't understand Pull Location -> Private Registry. Are you saying that we can run a Private Registry and then instruct the Registry (via an API) to pull, tag, and push a Docker repo (i.e. we avoid having to stream the bits through the builder's container)? That sounds pretty good if possible, but I want to make sure I understand exactly what you're suggesting.
  2. I don't think we should tag the image with the ProcessGuid. Instead we should generate our own guid and send it back to CC as a URL pointing to the private docker registry. When CC requests that Diego runs the application it would then provide that URL and everything should just work. The issue here is that the ProcessGuid can change without there being a change in the contents of the container (for example, if the user changes the amount of memory required).
  3. If we have multiple Private Registries and we push to just one of them how do we ensure that we can pull from another private registry? Are they all backed by a shared blob-store? Do we need to push to all private registries (bad!).
  4. I don't fully understand what the "Caching Process" is -- isn't this just the private registry? Can't we communicate directly with the private registry? Do we need a Docker daemon in order to communicate with the private registry? If so - we could just run the Docker daemon inside the staging container and just use it to pull, tag, and then push (!). That way we can avoid provisioning Docker daemons, etc.. In any event, I don't want to add anything to Garden-Linux if at all possible.
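Point 2 above can be sketched as follows. This is a hedged illustration only: the registry host and the URL layout are assumptions for the example, not Diego's actual naming scheme.

```python
import uuid

def cached_image_reference(registry_host: str) -> str:
    """Build the private-registry URL Diego would hand back to CC.

    The cache guid is generated by Diego itself, so it stays stable even
    when the ProcessGuid changes (e.g. after a memory-quota update).
    """
    cache_guid = uuid.uuid4().hex  # independent of the (mutable) ProcessGuid
    return f"{registry_host}/cached/{cache_guid}"

# Hypothetical registry address, for illustration only.
ref = cached_image_reference("docker-registry.service.cf.internal:8080")
```

When CC later asks Diego to run the application, it provides this URL back unchanged, so the pull resolves against the private registry rather than the public hub.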
georgethebeatle commented 9 years ago
  1. Pull Location -> Private Registry means that we can use the docker daemon running in the docker registry job, instructing it to pull, tag and push the image to the registry. To do this we need to either expose the daemon's own API or provide a new endpoint that knows how to call it locally.

    Unfortunately we did not find a way to instruct the registry itself to pull the image. We have to either use the daemon or mimic what it's doing.

  2. Agree
  3. We did not worry about multiple private registries for MVP0. For later versions I believe we can go with some shared storage (a volume or Riak CS)
  4. The caching process is the one that does pull, tag and push operations to the registry. The registry can't do these by itself, so we either use the daemon or mimic the daemon - these are the two approaches we mention in the proposal.

    Running the docker daemon inside the staging container is exactly what we are proposing - by provisioning the daemon we mean somehow bringing it to the staging container. We thought about including it in the docker lifecycle archive, much like the builder.
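The pull/tag/push flow described above can be sketched as the CLI invocations the staging task would issue against the daemon. This is only an illustration: the registry address is an assumption, and the commands are built but not executed here.

```python
def caching_commands(image: str, registry: str) -> list[list[str]]:
    """Return the docker CLI invocations for the pull/tag/push caching flow.

    The image is pulled from its original location, re-tagged to point at
    the private registry, and pushed there so later runs pull the cached copy.
    """
    cached = f"{registry}/{image}"
    return [
        ["docker", "pull", image],
        ["docker", "tag", image, cached],
        ["docker", "push", cached],
    ]

# Hypothetical image and registry, for illustration only.
cmds = caching_commands("library/nginx", "docker-registry.service.cf.internal:8080")
```

Keeping the three steps as data rather than shelling out directly makes it easy for the staging task to log, retry, or dry-run the flow.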

hsiliev commented 9 years ago

Updated the proposal with info about:

onsi commented 9 years ago

Got it - great thanks. I didn't fully appreciate that there are three things we'll need: builder + docker daemon + registry. Makes sense to me now.

onsi commented 9 years ago

This is much clearer now, thanks for updating the proposal. It sounds, to me, like running the Docker daemon in the container and shipping it with the builder is going to be the most expedient way forward. We can run the Docker daemon as root if we need to (we should try running it in a namespaced container (i.e. unprivileged) if possible -- but even that can change). The good news here is we never run untrusted user code during this staging process.

hsiliev commented 9 years ago

We tried running it as a privileged process today, but the docker daemon complained about root access. A number of issues in docker's repo suggest this is currently not possible: https://github.com/docker/docker/issues/1034 https://github.com/docker/docker/issues/2919 https://github.com/docker/docker/issues/2918

I think the only option left, if caching is to stay part of the current staging process, is to run docker as trusted code within the container. This will not limit docker, since it will run as root, but it can help enforce disk space quotas.

hsiliev commented 9 years ago

The docker daemon requires root access inside the container to mount its graph root. A privileged container is required to access files owned by the real root user, which has a different user id (uid) from the container's root. The daemon fails while trying to create /dev/mapper:

```
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(5): libdm-file.c:27 (0) Creating directory \"/dev/mapper\""
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(3): libdm-file.c:52 (-1) /dev/mapper: mkdir failed: Permission denied"
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(6): ioctl/libdm-iface.c:399 (0) <backtrace>"
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(3): ioctl/libdm-iface.c:415 (-1) Failure to communicate with kernel device-mapper driver."
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(3): ioctl/libdm-iface.c:417 (-1) Check that device-mapper is available in the kernel."
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(6): ioctl/libdm-iface.c:1849 (0) <backtrace>"
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(6): ioctl/libdm-iface.c:506 (0) <backtrace>"
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(6): ioctl/libdm-iface.c:531 (0) <backtrace>"
2015-02-27T12:59:43.06+0200 [STG/0]      ERR time="2015-02-27T10:59:43Z" level="debug" msg="libdevmapper(6): libdm-common.c:257 (0) <backtrace>"
```

iptables modifications also require a privileged container. This, however, should not be an issue for caching, and currently we just disable them with the --iptables=false switch.

To enable privileged tasks to be executed, the executor should be started with -allowPrivileged=true. Since this is not a good idea from a security standpoint, we might want to use placement pools (once implemented) or try to work around the /etc/docker creation with a setup step before launching the docker daemon.
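The setup-step workaround could look something like the sketch below. The directory layout, environment variable, and daemon flags are assumptions for illustration, not the actual Diego setup step; the daemon launch itself is shown only as a comment.

```shell
# Hypothetical setup step, run before the docker daemon is launched, that
# pre-creates the directories the daemon needs so the daemon itself does
# not have to create them as root (paths illustrative).
DOCKER_ROOT="${DOCKER_ROOT:-/tmp/docker-root}"
mkdir -p "$DOCKER_ROOT/etc-docker" "$DOCKER_ROOT/graph"
# The daemon would then be started against the pre-created graph directory
# (not executed here; flags match the 2015-era docker CLI discussed above):
#   docker -d --graph="$DOCKER_ROOT/graph" --iptables=false
echo "prepared $DOCKER_ROOT"
```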

hsiliev commented 9 years ago

Described implementation details about: