moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0

Pull and rebuild only if Dockerfile changed #22435

Closed ghost closed 7 years ago

ghost commented 8 years ago

Is it possible to pull an image generated from a given Dockerfile from a registry, and rebuild it if and only if the Dockerfile has changed?

I tried to do just

docker pull mytag && docker build -t mytag .

but it doesn't work: docker still tries to rebuild the image. Is it possible to do what I described in docker? If not, I think this feature would be very handy.

thaJeztah commented 8 years ago

No, this won't work. When you pull an image from a registry, you don't pull the build cache that Docker uses to speed up builds. Whether an image (layer) needs to be rebuilt depends on both the Dockerfile and the build context (the files used to build the image), so Docker needs access to both to determine whether the cache can be used.

The moment you run docker build, the Dockerfile and the build context are sent to the daemon-side builder. If no rebuild is required for a layer, and that layer is available in the local cache (i.e., you built it before on that machine), you'll see "Using cache" in the output.
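For readers on later Docker releases: version 1.13 added a `--cache-from` option to `docker build`, which lets the builder treat a pulled image's layers as cache candidates. The sketch below wraps the two commands from the question in a function; the function name `pull_then_build` is hypothetical, and `mytag` is the image name from the question. With the BuildKit builder, the pushed image must also carry inline cache metadata (for example, built with `--build-arg BUILDKIT_INLINE_CACHE=1`) for this to have any effect, so treat this as a starting point rather than a guaranteed fix.

```shell
#!/bin/sh
# Sketch only: assumes Docker 1.13 or newer, where `docker build --cache-from` exists.
# pull_then_build is a hypothetical helper name, not part of Docker.

pull_then_build() {
    image="${1:-mytag}"    # image name; defaults to the "mytag" from the question
    context="${2:-.}"      # build context directory

    # A failed pull (for example, the image was never pushed) should not abort the build.
    docker pull "$image" || echo "pull failed, building without a remote cache source" >&2

    # Ask the builder to consider the pulled image's layers as cache candidates.
    docker build --cache-from "$image" -t "$image" "$context"
}
```

Calling `pull_then_build mytag .` runs the pull first and then the build against the current directory.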

Is there a reason you're pulling the image first, before building it?

ghost commented 8 years ago

@thaJeztah thanks for good explanation!

I have a repository with Jupyter notebooks that require lots of custom libraries. I've created a Dockerfile that builds a container with the required setup of Jupyter and the libraries, and set up a run.sh script that executes docker build && docker run, so anyone can run my notebooks just by executing this script.

I like this setup because each modification of the Dockerfile or its context automatically rebuilds the container on the next execution of run.sh, so I don't have to worry about rebuilding the image manually. It simplifies development.

The only problem is that docker build takes a very long time on a new machine, so anyone else who tries to view my notebooks with this script has to wait. I want to use Docker Hub to let others run the notebooks instantly, but I don't want to lose the guarantee that run.sh always runs the container that matches the current state of the Dockerfile and context.
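One way to approximate that guarantee, offered here as an assumption rather than something proposed in this thread, is to tag each image with a fingerprint of the build context and try to pull that exact tag before building. The helper names `context_fingerprint` and `ensure_image` are hypothetical, and the fingerprint is only a rough stand-in for Docker's own cache checks: it ignores `.dockerignore` and file permissions, assumes the Dockerfile lives inside the context directory, and relies on `sha256sum` being installed.

```shell
#!/bin/sh
# Sketch only: content-addressed image tags for a run.sh-style script.
# All function names here are hypothetical helpers, not Docker features.

# Print a 12-character fingerprint of every file under the context directory.
context_fingerprint() {
    (
        cd "$1" || exit 1
        # Sort paths for a stable order, hash each file (the output includes
        # the path), then hash the combined list and keep a short prefix.
        find . -type f | LC_ALL=C sort | while IFS= read -r f; do
            sha256sum "$f"
        done | sha256sum | cut -c1-12
    )
}

# Pull the image for this exact tag if the registry has it; otherwise build it.
ensure_image() {
    tag="$1"
    context="$2"
    docker pull "$tag" || docker build -t "$tag" "$context"
}

# Possible use inside run.sh ("myuser/notebooks" is a placeholder repository):
#   tag="myuser/notebooks:$(context_fingerprint .)"
#   ensure_image "$tag" . && docker run --rm "$tag"
# The maintainer would also `docker push "$tag"` after a successful build.
```

Because the tag changes whenever any file in the context changes, a stale image is never pulled: either the registry has an image for the current state, or the script falls back to a local build.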

thaJeztah commented 8 years ago

Depending on your exact situation, a good solution could be to split your image into two parts: a "base" image that contains the parts of your image that don't change much, and an image that contains the code that depends on local changes. For example:

(Dockerfile.base)

FROM ubuntu:14.04
RUN apt-get update && apt-get install -y \
  package-a \
  package-b \
  package-c

COPY some-files /somewhere
RUN some-stuff

and a Dockerfile that needs to be rebuilt on local changes:

FROM my-base-image:latest
ADD local-files /somewhere
RUN steps-to-compile/do-whatever

The first image would be built with docker build -t my-base-image:latest -f Dockerfile.base ., and can be pulled from Docker Hub (or your local registry).

The second Dockerfile is what users have to build when they change code.
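A run.sh for this two-image layout might look like the sketch below. `my-base-image:latest` and `Dockerfile.base` come from the example above; the function names, the `my-notebooks` image name, and the published port 8888 (Jupyter's default) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: helpers for the base-image layout described above.
# Function names and "my-notebooks" are hypothetical.

# Maintainer side: rebuild and publish the slow-changing base image.
publish_base() {
    docker build -t my-base-image:latest -f Dockerfile.base . || return 1
    docker push my-base-image:latest
}

# User side: fetch the base, rebuild only the thin local layer, then run.
run_notebooks() {
    # Pull the prebuilt base from the registry instead of rebuilding it.
    docker pull my-base-image:latest || return 1

    # Build the second Dockerfile on top of the base; unchanged steps are
    # served from the local build cache on repeated runs.
    docker build -t my-notebooks . || return 1

    # 8888 is assumed to be the port the notebook server listens on.
    docker run --rm -p 8888:8888 my-notebooks
}
```

With this split, a new machine only pays for the pull plus the short second build, while local edits still trigger a rebuild of the top layers on the next run.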

LK4D4 commented 7 years ago

A base image is a good solution IMO. Closing as inactive.