Support for caching Docker build data on VM host

Yserz commented 9 years ago

I'm building my Docker containers on my VM host and observed that the docker build data is not cached by the vagrant-cachier plugin. I would really like to see such a feature in the future if this is possible somehow :)

Here's my shortened Vagrantfile of my virtual host machine:

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  if Vagrant.has_plugin?('vagrant-cachier')
    config.cache.scope = :box
  else
    puts 'WARN:  Vagrant-cachier plugin not detected. Continuing unoptimized.'
  end
  ...
  config.vm.define "dev-host" do |host|
    host.vm.hostname = "dev-host"
    host.vm.box = "chef/centos-6.6"
    host.vm.provision "docker" do |d|
      d.build_image "/vagrant/image", args: "-t name/image"
    end
    ...
  end
  ...
end

And the Vagrantfile of a Docker container:

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  if Vagrant.has_plugin?('vagrant-cachier')
    puts 'INFO:  Vagrant-cachier plugin detected. Optimizing caches.'
    config.cache.scope = :box
  else
    puts 'WARN:  Vagrant-cachier plugin not detected. Continuing unoptimized.'
  end
  ...
  config.vm.define 'local-workstation' do |ws|
    ...
    ws.vm.provider "docker" do |d|
      d.image = "name/image"
      d.vagrant_machine = "dev-host"
      d.vagrant_vagrantfile = "../Vagrantfile"
      d.has_ssh = true
    end
    ...
  end
end

Just to clarify things, here is some example output:

==> dev-host: Sending build context to Docker daemon 
DEBUG ssh: stdout: Step 0 : FROM centos:centos7

 INFO interface: info: Step 0 : FROM centos:centos7
 INFO interface: info: ==> dev-host: Step 0 : FROM centos:centos7
==> dev-host: Step 0 : FROM centos:centos7
DEBUG ssh: stdout: centos:centos7: The image you are pulling has been verified

 INFO interface: info: centos:centos7: The image you are pulling has been verified
 INFO interface: info: ==> dev-host: centos:centos7: The image you are pulling has been verified
==> dev-host: centos:centos7: The image you are pulling has been verified
DEBUG ssh: Sending SSH keep-alive...
....Downloading....

fgrehm commented 9 years ago

I'm not sure how well Docker's stuff will behave on a VBox shared folder. Do you mind giving the generic bucket a spin before we try implementing something in vagrant-cachier's core? Thanks in advance!

Yserz commented 9 years ago

I naively tried setting the /var/lib/docker folder as a synced folder and failed totally. :D It's eating up my hard drive space on Mac OS X rapidly! If I find a bit of time, I'll try setting the folders in /var/lib/docker as synced folders one by one.

~~PS: Maybe I'm wrong but I think the generic buckets are not helping in this case.~~

EDIT: Just found this page (http://fgrehm.viewdocs.io/vagrant-cachier/development) and got a clue now :D

Yserz commented 9 years ago

Okay, that's the current status:

I tried the generic bucket: config.cache.enable :generic, { :cache_dir => "/var/lib/docker/devicemapper/devicemapper" }
Unfortunately this does not work as expected, since the Docker data is effectively its own filesystem (http://jpetazzo.github.io/2014/01/29/docker-device-mapper-resize/).
The Docker data file reserves 100GB (!!) but does not use the space right away, just like a VM volume (it is a sparse file). Unfortunately, after copying/syncing, the file actually occupies that space on the hard drive. That was the cause of the full hard drive I mentioned before.
Since I couldn't handle 100GB, I tried to reduce the size of the Docker data file to 20GB with dd if=/dev/zero of=data bs=1g count=0 seek=20
I started the VM and the following error occurred:

DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
DEBUG virtualbox_4_3:   - [1, "ssh", 2222, 22]
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: id -Gn | grep docker (sudo=false)
DEBUG ssh: stdout: vagrant docker

DEBUG ssh: Exit status: 0
DEBUG ssh: Checking whether SSH is ready...
DEBUG ssh: Re-using SSH connection.
 INFO ssh: SSH is ready!
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute:  (sudo=false)
DEBUG ssh: Exit status: 0
DEBUG guest: Searching for cap: docker_daemon_running
DEBUG guest: Checking in: redhat
DEBUG guest: Checking in: linux
DEBUG guest: Found cap: docker_daemon_running in linux
 INFO guest: Execute capability: docker_daemon_running [#<Vagrant::Machine: dev-host (VagrantPlugins::ProviderVirtualBox::Provider)>] (redhat)
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: test -f /var/run/docker.pid (sudo=false)
DEBUG ssh: Exit status: 0
DEBUG ssh: Checking whether SSH is ready...
DEBUG ssh: Re-using SSH connection.
 INFO ssh: SSH is ready!
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute:  (sudo=false)
DEBUG ssh: Exit status: 0
DEBUG guest: Searching for cap: docker_daemon_running
DEBUG guest: Checking in: redhat
DEBUG guest: Checking in: linux
DEBUG guest: Found cap: docker_daemon_running in linux
 INFO guest: Execute capability: docker_daemon_running [#<Vagrant::Machine: dev-host (VagrantPlugins::ProviderVirtualBox::Provider)>] (redhat)
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: test -f /var/run/docker.pid (sudo=false)
DEBUG ssh: Exit status: 0
 INFO interface: info: Building Docker images...
 INFO interface: info: ==> dev-host: Building Docker images...
==> dev-host: Building Docker images...
 INFO interface: info: -- Path: /vagrant/minivm
 INFO interface: info: ==> dev-host: -- Path: /vagrant/minivm
==> dev-host: -- Path: /vagrant/minivm
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: docker build -t yserz/docker-vagrant-centos-7-minivm /vagrant/minivm (sudo=true)
DEBUG ssh: stderr: Sending build context to Docker daemon 

 INFO interface: info: Sending build context to Docker daemon 
 INFO interface: info: ==> dev-host: Sending build context to Docker daemon 
==> dev-host: Sending build context to Docker daemon 
DEBUG ssh: stderr: 2014/11/27 08:18:55 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?

 INFO interface: info: 2014/11/27 08:18:55 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?
 INFO interface: info: ==> dev-host: 2014/11/27 08:18:55 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?
==> dev-host: 2014/11/27 08:18:55 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?
DEBUG ssh: Exit status: 1
ERROR warden: Error occurred: The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

docker build -t yserz/docker-vagrant-centos-7-minivm /vagrant/minivm

Stdout from the command:

Stderr from the command:

Sending build context to Docker daemon 
2014/11/27 08:18:55 Cannot connect to the Docker daemon. Is 'docker -d' running on this host?

So docker is not starting correctly anymore :/

Any thoughts? Permission problem?
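
For anyone who wants to reproduce the disk-space effect described above, here is a minimal sketch. It assumes GNU coreutils inside a Linux guest (dd accepting bs=1M, du supporting --apparent-size). The sizes_kib helper and the file names are made up for illustration, and the file is scaled down to 20 MiB.

```shell
#!/usr/bin/env bash
# Sketch: apparent size vs. allocated size of a sparse file.
# Assumes GNU dd and GNU du; helper and file names are hypothetical.

# Print "<apparent KiB> <allocated KiB>" for a file.
sizes_kib() {
    local apparent allocated
    apparent=$(du -k --apparent-size "$1" | cut -f1)
    allocated=$(du -k "$1" | cut -f1)
    echo "$apparent $allocated"
}

demo_dir=$(mktemp -d)

# Same trick as the dd command above, scaled down to 20 MiB:
# count=0 writes no data, seek=20 only sets the file length.
dd if=/dev/zero of="$demo_dir/data" bs=1M count=0 seek=20 2>/dev/null

# Reading the file yields real zero bytes, so a plain byte-for-byte
# copy writes all of them out on the destination.
cat "$demo_dir/data" > "$demo_dir/data.copy"

echo "sparse: $(sizes_kib "$demo_dir/data")"
echo "copy:   $(sizes_kib "$demo_dir/data.copy")"
```

On a typical Linux filesystem the first line should report an allocated size far below the apparent 20480 KiB, while the copy is expected to be (nearly) fully allocated. That would match what happened to the 100GB data file once it was synced to the host.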

Yserz commented 9 years ago

Okay, here's my bashy solution.

Basically this approach uses docker save and docker load to save/load images to/from a shared folder.

    # ensure docker is installed because we are using docker commands
    host.vm.provision "docker"
    # load images from the cache
    host.vm.provision "shell", path: "scripts/docker/docker_load_images"
    # build your images. Docker will skip all existing images.
    host.vm.provision "docker" do |d|
      d.build_image "/data", args: "-t name/image"
      d.version = :latest
    end
    # save all built images into cache
    host.vm.provision "shell", path: "scripts/docker/docker_save_images"

It's possible to save images on vagrant destroy if you are working with docker on the vm:

    if Vagrant.has_plugin?('vagrant-triggers')
      config.trigger.before :destroy, :stdout => true, :force => true do
        info "Updating Docker cache..."
        run_remote "sudo chmod 755 /vagrant/docker_save_images && /vagrant/docker_save_images"
      end
    end

Here's the script to save docker images:

#!/usr/bin/env bash
# Save every local Docker image that is not in the cache yet.
CACHE_DIR=/tmp/vagrant-cache/docker/images/
echo "cache dir=$CACHE_DIR"
[ -d "$CACHE_DIR" ] || mkdir -p "$CACHE_DIR"
echo "#####################"

# IDs of all images known to the Docker daemon
IMAGES=$(docker images -q)
echo -e "All Images: \n$IMAGES"
IMAGES_ARRAY=(${IMAGES//$'\n'/ })

# Tarballs already present in the cache
FILES=$(find "$CACHE_DIR" -maxdepth 1 -type f -name '*.tar')
echo -e "All Files: \n$FILES"
FILES_ARRAY=(${FILES//$'\n'/ })

for IMAGE in "${IMAGES_ARRAY[@]}"
do
    # CACHED becomes 1 when a tarball named <image-id>.tar already exists
    CACHED=0
    for FILE in "${FILES_ARRAY[@]}"
    do
        if [[ "$(basename "$FILE")" == "$IMAGE.tar" ]]; then
            echo "Skipping $IMAGE"
            CACHED=1
        fi
    done
    if [[ $CACHED -eq 0 ]]; then
        echo "Saving image $IMAGE into cache..."
        docker save -o "$CACHE_DIR$IMAGE.tar" "$IMAGE"
    fi
done

And here's the script for loading docker images:

#!/usr/bin/env bash
# Load every cached tarball whose image is not known to Docker yet.
CACHE_DIR=/tmp/vagrant-cache/docker/images/
echo "cache dir=$CACHE_DIR"
[ -d "$CACHE_DIR" ] || mkdir -p "$CACHE_DIR"
echo "#####################"

# IDs of all images known to the Docker daemon
IMAGES=$(docker images -q)
echo -e "All Images: \n$IMAGES"
IMAGES_ARRAY=(${IMAGES//$'\n'/ })

# Tarballs present in the cache
FILES=$(find "$CACHE_DIR" -maxdepth 1 -type f -name '*.tar')
echo -e "All Files: \n$FILES"
FILES_ARRAY=(${FILES//$'\n'/ })

for FILE in "${FILES_ARRAY[@]}"
do
    # LOADED becomes 1 when Docker already has the image for this tarball
    LOADED=0
    for IMAGE in "${IMAGES_ARRAY[@]}"
    do
        if [[ "$(basename "$FILE")" == "$IMAGE.tar" ]]; then
            echo "Skipping $IMAGE"
            LOADED=1
        fi
    done
    if [[ $LOADED -eq 0 ]]; then
        echo "Loading image $FILE from cache..."
        docker load -i "$FILE"
    fi
done

This solution is a bit slow: we can't hook into the docker build process, so every image has to be saved with an extra write operation (docker save <image-id>). That operation copies the whole image, which can easily be around 300 MB, and the same 300 MB has to be read back from disk when the cached image is loaded. In summary, this approach pays off if you are working with Docker images that take many build commands or that would have to be downloaded over the network anyway. What do you think about this solution? Is it a good starting point for further improvements?
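
The cache check at the heart of the save script can also be expressed as a single file test per image. The following is only a sketch of that idea, not a drop-in replacement: images_missing_from_cache is a hypothetical helper name, and it assumes the same <image-id>.tar cache layout as the scripts above.

```shell
#!/usr/bin/env bash
# Sketch: decide which image IDs still need a `docker save`.
# The helper name is hypothetical; the <id>.tar layout follows the scripts above.

# Usage: images_missing_from_cache CACHE_DIR ID...
# Prints every ID that has no CACHE_DIR/ID.tar yet, one per line.
images_missing_from_cache() {
    local cache_dir=$1 id
    shift
    for id in "$@"; do
        [[ -f "$cache_dir/$id.tar" ]] || echo "$id"
    done
}

# Possible wiring (needs a running Docker daemon, so it is left commented out):
# for id in $(images_missing_from_cache "$CACHE_DIR" $(docker images -q | sort -u)); do
#     docker save -o "$CACHE_DIR/$id.tar" "$id"
# done
```

Testing for the file directly avoids the nested loop over all cached tarballs, and piping the IDs through sort -u keeps an image with several tags from being saved twice in one run.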

fgrehm commented 9 years ago

I guess that this is too much for us to handle from this plugin. I wonder if it would make sense to have that as a separate plugin that builds on top of cachier and trigger plugins... Going to think a bit more on it.

Thanks for all the info so far!

ColCh commented 9 years ago

@Yserz :+1: It's working flawlessly on my Windows host. @fgrehm, the main problem is catching when docker pull is executed.

ashb commented 9 years ago

I'm not quite sure how we'd tie it in to cachier but the registry image can actually be used as a caching-mirror:

https://github.com/docker/docker/blob/master/docs/sources/articles/registry_mirror.md:

How does it work?

The first time you request an image from your local registry mirror, it pulls the image from the public Docker registry and stores it locally before handing it back to you. On subsequent requests, the local registry mirror is able to serve the image from its own storage.

So maybe the plugin should just configure this registry image to save the images to a cached disk, and then optionally try to configure the docker daemon in the VM?

Thoughts?
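
A rough sketch of what the daemon side of that could look like follows. It rests on assumptions that go beyond this thread: a Docker release that reads /etc/docker/daemon.json and its registry-mirrors key (newer than the versions discussed here), and a mirror listening on the hypothetical address http://localhost:5000. The write_mirror_config helper is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: point a Docker daemon at a pull-through registry mirror.
# Assumptions: the daemon reads /etc/docker/daemon.json, and the mirror URL
# passed in is reachable from the VM. Helper name is hypothetical.

# Usage: write_mirror_config FILE MIRROR_URL
write_mirror_config() {
    local file=$1 mirror=$2
    mkdir -p "$(dirname "$file")"
    cat > "$file" <<EOF
{
  "registry-mirrors": ["$mirror"]
}
EOF
}

# Inside the VM this would run as root, followed by a daemon restart:
# write_mirror_config /etc/docker/daemon.json http://localhost:5000
```

The mirror's storage directory would then be the thing to place on a cached disk, so pulled layers survive vagrant destroy.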

Yserz commented 9 years ago

It actually already exports every image (downloaded and locally created) on the machine to a cached disk.

CharlieC3 commented 9 years ago

I'm in a similar situation; we're looking to use Vagrant and Docker to create a local development solution. We need to cache Docker images to reduce the time it takes to run vagrant destroy; vagrant up on a project.

:+1: for Docker support in Cachier.

asmaier commented 9 years ago

Another :+1: for docker support in vagrant-cachier

majidaldo commented 9 years ago

+1

LiberQuack commented 9 years ago

Hey, based on @Yserz's approach I created a Vagrantfile to cache docker images... now I can use Fig and Docker to have a full development environment, even on Windows.

Vagrant.configure(2) do |config|

  config.vm.box = "ubuntu/trusty64"
  config.vm.hostname = 'project'
  config.vm.network "forwarded_port", guest:1521, host:1521
  config.cache.enable :generic, {"docker" => { cache_dir: "/cache-docker" }}

  config.vm.provision "shell", inline: <<-SHELL
    #installs docker
    wget -qO- https://get.docker.io/ubuntu/ | sudo bash

    #installs fig
    wget -qO- https://github.com/docker/fig/releases/download/1.0.1/fig-`uname -s`-`uname -m` > /usr/local/bin/fig
    chmod +x /usr/local/bin/fig

    #load cached images
    docker load -i /cache-docker/*.tar

    #download your images and start containers
    fig -f /vagrant/fig.yml up

    #cache your images
    IMAGES=`docker images | awk '{print $1}' | tail -n +2`
    docker save -o /cache-docker/images.tar $(echo ${IMAGES[@]})
  SHELL

  config.vm.provision "shell", run: "always", inline: "fig -f /vagrant/fig.yml up"

end

My workflow basically involves setting up containers (db, mq, whatever...) and then developing on the host machine... please tell me about your workflows

PS: Java programmer

leighmcculloch commented 9 years ago

@MartinsThiago your save and load works well and I'm using it, but it errored on images with the name <none>. I've altered the save command to exclude images with the name <none>, since they're old and I likely don't need them.

IMAGES=`docker images | awk '{print $1}' | grep -v '<none>' | tail -n +2`
docker save -o /docker-cache/images.tar $(echo ${IMAGES[@]})
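
To see what that filter does without a Docker daemon at hand, here is a small sketch that runs an equivalent awk filter over canned input. The named_repositories helper is hypothetical and the sample listing is made up for illustration only.

```shell
#!/usr/bin/env bash
# Sketch: extract repository names from `docker images` output,
# dropping the header row and untagged (<none>) entries.

# Reads `docker images` output on stdin, prints unique repository names.
named_repositories() {
    awk 'NR > 1 && $1 != "<none>" { print $1 }' | sort -u
}

# Made-up sample listing, for illustration only.
sample='REPOSITORY   TAG      IMAGE ID       CREATED       VIRTUAL SIZE
postgres     latest   aaaaaaaaaaaa   2 days ago    213 MB
<none>       <none>   bbbbbbbbbbbb   3 days ago    188 MB
mongo        latest   cccccccccccc   4 days ago    392 MB
postgres     9.3      dddddddddddd   5 days ago    212 MB'

named_repositories <<< "$sample"
```

Compared with the grep/tail pipeline, this version also collapses repositories that appear once per tag, so each name is passed to docker save only once.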

LiberQuack commented 9 years ago

Nice @leighmcculloch, I actually dropped its usage for now :(... I tried to work with some usual containers like postgres and mongo... but there's too much data to be downloaded :(

Something around 3GB... waiting for the day we can use a git-like approach, such as git clone --depth 1

PS: I changed the fig commands to:
1st -> fig -f /vagrant/fig.yml pull
2nd -> config.vm.provision "shell", run: "always", inline: "cd /vagrant; fig up -d --no-recreate"

Sooo, by adding @leighmcculloch's fix we got -->

Vagrant.configure(2) do |config|

  config.vm.box = "ubuntu/trusty64"
  config.vm.hostname = 'project_short_name'
  config.vm.network "forwarded_port", guest:8080, host:8080
  config.cache.enable :generic, { "docker" => { cache_dir: "/cache-docker" } }
  config.cache.scope = :box

  config.vm.provision "shell", inline: <<-SHELL
    #installs docker
    wget -qO- https://get.docker.io/ubuntu/ | sudo bash

    #installs fig
    wget -qO- https://github.com/docker/fig/releases/download/1.0.1/fig-`uname -s`-`uname -m` > /usr/local/bin/fig
    chmod +x /usr/local/bin/fig

    #load cached images
    docker load -i /cache-docker/*.tar

    #run your commands
    fig -f /vagrant/fig.yml pull

    #cache your images
    IMAGES=`docker images | awk '{print $1}' | grep -v '<none>' | tail -n +2`
    docker save -o /cache-docker/images.tar $(echo ${IMAGES[@]})
  SHELL

  config.vm.provision "shell", run: "always", inline: "cd /vagrant; fig up -d --no-recreate"

end
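
One caveat with docker load -i /cache-docker/*.tar: on the very first run the glob matches nothing and is passed to docker literally, and with more than one tarball it expands to extra arguments. A guarded loop is one way around that. The sketch below is an assumption-laden illustration: load_cached_images is a hypothetical helper, and the DOCKER variable exists only so the function can be dry-run without Docker installed.

```shell
#!/usr/bin/env bash
# Sketch: load every cached tarball, tolerating an empty cache.
# DOCKER is a hypothetical override for dry runs; it defaults to `docker`.

# Usage: load_cached_images CACHE_DIR
load_cached_images() {
    local cache_dir=$1 tarball
    for tarball in "$cache_dir"/*.tar; do
        # With no matches the glob stays literal, so skip anything that is not a file.
        [[ -f "$tarball" ]] || continue
        ${DOCKER:-docker} load -i "$tarball"
    done
}

# In the provisioner this would be called as:
# load_cached_images /cache-docker
```

Setting DOCKER=echo before calling the function prints the commands instead of running them, which is handy for checking the loop on a host without Docker.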

amcorreia commented 9 years ago

Maybe an option for docker is to just use another container with apt-cacher, like this: https://registry.hub.docker.com/u/sameersbn/apt-cacher-ng/

patcon commented 9 years ago

cc: @mrjcleaver

rmohr commented 7 years ago

A nice solution @fabiand came up with for our use case is to use an extra disk as the docker storage location. He put this in our deployment script:

# if there is a second disk, use it for docker
if ls /dev/vdb ; then
  # We use the loopback docker dm support, and not a VG for now
  mkdir -p /var/lib/docker/
  restorecon -r /var/lib/docker
  mount LABEL=dockerdata /var/lib/docker/ || {
    mkfs.xfs -L dockerdata -f /dev/vdb
  }
  mkdir -p /etc/systemd/system/docker.service.d/
  cat > /etc/systemd/system/docker.service.d/mount.conf <<EOT
[Service]
ExecStartPre=/usr/bin/sleep 5
ExecStartPre=-/usr/bin/mount LABEL=dockerdata /var/lib/docker
MountFlags=shared
EOT
  mount LABEL=dockerdata /var/lib/docker/
fi

Then I just had to add an additional disk to the Vagrant VM which would survive vagrant destroy.

# for vagrant-libvirt provider
if $cache_docker
  # the device name must be a string; a bare vdb would be an undefined variable
  domain.storage :file, :size => '10G', :path => 'master_docker.img', :allow_existing => true, :device => 'vdb'
end

fgrehm commented 1 year ago

Hey, sorry for the silence here but this project is looking for maintainers :sweat_smile:

As per https://github.com/fgrehm/vagrant-cachier/issues/193, I've added the ignored label and will close this issue. Thanks for the interest in the project and LMK if you want to step up and take ownership of this project on that other issue :wave:

fgrehm / vagrant-cachier

Support for caching Docker build data on VM host #131
