moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0
68.82k stars 18.67k forks source link

Ability to filter ADD / COPY during docker build, based on DIFF #13982

Closed yaronr closed 9 years ago

yaronr commented 9 years ago

Hi This follows a short discussion on IRC with @duglin

Goal: Minimize docker images Method: Not copying files that already exist in the base image (or top layer), unless they are different

Use case: We have a base image, 'multicloud/common'. It includes a whole bunch of stuff, including a set of files under /workdir/lib (a bunch of .jar files, to be specific). We use maven to run our builds, and maven copies dependencies to ./lib, for both the 'common' project, and the projects that depend on it. One of these projects, for example, is called 'agent', and is packaged in a docker image called 'multicloud/agent', which is FROM multicloud/common. As mentioned before, when building 'agent', maven copies all of its dependencies (files) to /lib, including all of those that have been packaged under multicloud/common's workdir/lib. Of course, in the dockerfile for 'agent', I do a COPY (or ADD) ./lib /workdir/lib.

99% of the files copied, are exactly the same (CRC-wise, and probably timestamp-wise) as those that are already on the top layer of the docker image file system. However, copy-on-write adds them to a new layer, effectively increasing the docker image size (dramatically, in some cases).

It would be great, if docker build's ADD or COPY command, had something similar to 'cp -u' - or even better - something that would calculate CRC32 of the files and copy them only if changed. IMO this could potentially dramatically decrease image sizes in many other use cases as well.

duglin commented 9 years ago

ping @docker/maintainers would like to hear what other people think about this.

Note that this would not impact upload time of the build context as we'd still need to uploaded everything to the daemon, and would increase build time due to the amount of time it would take to do the diff (although we'd probably catch 95%+ of the delta cases quickly by just checking sizes).

Also, since this is about reducing layers I think we need to consider this within the context of the squash discussions.

my 2 cents: until we allow for more control during docker build for people to better control the number of layers created (e.g. via a Dockerfile squash command, or by allowing multiple Dockerfile commands to be run within a layer), I think we need to look for ways to help people in situations like this.

@yaronr question for you... since this seems to be about the number of layers, are you looking to keep your base image separate from the resulting build image so that it can be shared at deploy/runtime, or would you be happy with just a single image that merged your base image with the output of the maven build? Because, if the latter is ok, then perhaps #13929 might be part of the answer for you. If the former, then I don't think any of the squash variants discussed so far would help you and you would still need a flag on COPY/ADD.

yaronr commented 9 years ago

@duglin I'm interested in keeping the base image separate. The reason is - there are multiple child images that start from this image. Merging kinda misses the whole point in this case. I have 4-5 images that start from this image. If the base image is shared, and only diffs are persisted on the child images, I save 4-5 X (full - diff) per server that runs these 4-5 images.

cpuguy83 commented 9 years ago

I think this makes sense, especially since the focus is on not needlessly copying files that already exist in the fs, rather than anything specifically related to layers.

thaJeztah commented 9 years ago

Interesting proposal. If this can be agreed on, I think this should be used as default, not an "option". +1

duglin commented 9 years ago

If we did add support for --update would it be like cp and just look at timestamps? or would it also take into account the contents? Or just contents ignoring timestamps (like we did for the cache stuff) ?

I can see this one causing lots of discussions.

I'm leaning towards playing it safe and checking both (contents and timestamps (a/c/m-time)).

cpuguy83 commented 9 years ago

I would just use the same comparison as checking the context against the image cache.

duglin commented 9 years ago

so just content and not mtime. Note, that cache would play no role here, just to be clear.

duglin commented 9 years ago

by that I mean, caching would be be used when appropriate, but we can't use the cache as part of this feature.

cpuguy83 commented 9 years ago

I should clarify... if content differs, then check time, if content doesn't differ, don't bother.

duglin commented 9 years ago

@cpuguy83 seems backwards, I would check the quick things first, like timestamps, size, etc.. then if all of those matched I would do the time consuming one, diff the contents.

cpuguy83 commented 9 years ago

Yeah, ok makes sense. I was thinking about it backwards.

thaJeztah commented 9 years ago

I wonder if time should be taken into account. Chances are that with the use-case described here, all files would still be added if the "copy" step changes timestamps;

when building 'agent', maven copies all of its dependencies (files) to /lib

It's an important decision though; should docker preserve timestamps when COPY/ADD-ing files to a container, or is it okay if a file is not updated if the checksum is the same and only the timestamp differs?

duglin commented 9 years ago

I assumed preserving timestamps was a given, and its the "checking timestamps" question that's the harder one.

jlhawn commented 9 years ago

maven copies dependencies to ./lib, for both the 'common' project, and the projects that depend on it. ... when building 'agent', maven copies all of its dependencies (files) to /lib, including all of those that have been packaged under multicloud/common's workdir/lib.

Just so that I understand correctly, the build for multicloud/agent adds files to ./lib that were already present in the multicloud/common base image? In the same location or a different location? In either case, why can't your build script be more knowledgeable about which entries of the context to copy into the container and just not add the ones that you know to already be in the base image?

Perhaps an example Dockerfile would make the problem more clear to me. Also, if the new layers were squashed together then the new layer wouldn't include files that were already present in the same location in the base image rootfs.

duglin commented 9 years ago

Also, if the new layers were squashed together then the new layer wouldn't include files that were already present in the same location in the base image rootfs.

really? If we have /a in a base image, and then in some (or all) of the additional layers a new /a was created, if we then squashed just those new layers (and not the base image), wouldn't the newly squashed layers contain the new /a ?

jlhawn commented 9 years ago

Also, if the new layers were squashed together then the new layer wouldn't include files that were already present in the same location in the base image rootfs.

really? If we have /a in a base image, and then in some (or all) of the additional layers a new /a was created, if we then squashed just those new layers (and not the base image), wouldn't the newly squashed layers contain the new /a ?

I think that if the file content hasn't changed, and the name and metadata haven't either, then it won't be detected as a change. I'll check just to be certain.

duglin commented 9 years ago

ok that may make sense, I thought you meant it wouldn't be there even if some of those things were changed. Phew :-)

stevvooe commented 9 years ago

This sounds like an optimization. This should not be optional. A differential tree, like rsync, would be optimal, but it can cause file io (reads) on either side.

My only suggestion is that this not be implemented with CRC32. The hash should be sha256.

yaronr commented 9 years ago

@jlhawn

Just so that I understand correctly, the build for multicloud/agent adds files to ./lib that were already present in the multicloud/common base image?

Yes. Since multicloud/agent, as well as a number of other projects inherit from multicloud/common, I would like to have a layer that has most of the libraries in it, and then have inheriting projects just add /change what they need.

why can't your build script be more knowledgeable about which entries of the context to copy into the container and just not add the ones that you know to already be in the base image

I'm afraid Maven doesn't know how to look inside docker images and diff. The same goes for many other use cases, btw.

BTW. Just out of curiosity. If I 'touch' a file (just change it's timestamp) in a base image, does it re-copy the file in the created layer? If the file is 100MB, will 'touch' add 100MB to the image?

jessfraz commented 9 years ago

Hello! We are no longer accepting patches to the Dockerfile syntax as you can read about here: https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax

Mainly:

Allowing the Builder to be implemented as a separate utility consuming the Engine's API will open the door for many possibilities, such as offering alternate syntaxes or DSL for existing languages without cluttering the Engine's codebase

Then from there, patches/features like this can be re-thought. Hope you can understand.

yaronr commented 9 years ago

Oh well.. On Fri, Jul 10, 2015 at 10:04 PM Jessie Frazelle notifications@github.com wrote:

Hello! We are no longer accepting patches to the Dockerfile syntax as you can read about here: https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax

Mainly:

Allowing the Builder to be implemented as a separate utility consuming the Engine's API will open the door for many possibilities, such as offering alternate syntaxes or DSL for existing languages without cluttering the Engine's codebase

Then from there, patches/features like this can be re-thought. Hope you can understand.

— Reply to this email directly or view it on GitHub https://github.com/docker/docker/issues/13982#issuecomment-120495300.