goldmann / docker-squash

Docker image squashing tool
MIT License

Squashing from layer n to m? #165

Open moritzp opened 6 years ago

moritzp commented 6 years ago

Hello, I recently saw some projects that take relatively large base images, such as wnameless/oracle-xe-11g, and add layers on top, e.g. to insert test data or some specific configuration. Because these images are frequently used to run tests in CI environments, or as local development environments that mirror production more closely, they are pushed to a registry. In these cases docker-squash compresses everything nicely, but the result is one huge layer that has to be pushed to and pulled from the registry in full, even if only a small configuration change was made in layer X (X>1). In situations like this it could make sense to squash only from layer n to layer m, and thereby avoid downloading the layer that contains the base image over and over again even though it did not change. Cheers, Moritz
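To make the transfer cost concrete, here is a rough sketch. It assumes a simplified model in which a registry client only re-transfers layers whose content changed. The function name `transfer_mb` and all sizes are hypothetical, chosen for illustration; none of this is part of docker-squash.

```python
def transfer_mb(layers, changed):
    """Sum the sizes (MB) of the layers that must be re-transferred.

    layers: list of (name, size_mb) tuples, base layer first.
    changed: set of layer names whose content differs from the cached copy.
    """
    return sum(size for name, size in layers if name in changed)


# Hypothetical sizes: a large base image plus a small configuration layer.
base_mb, config_mb = 2000.0, 5.0

# Fully squashed: one layer holds everything, so any change re-sends it all.
fully_squashed = [("squashed", base_mb + config_mb)]
print(transfer_mb(fully_squashed, {"squashed"}))  # 2005.0

# Base kept as its own layer: only the small layer changes.
partially_squashed = [("base", base_mb), ("config", config_mb)]
print(transfer_mb(partially_squashed, {"config"}))  # 5.0
```

Under these assumptions, keeping the unchanged base as a separate layer reduces each push or pull after a configuration change from the full image size to the size of the changed layer.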

richard-scott commented 6 years ago

Isn't that what the "--from-layer" arg is for?

moritzp commented 6 years ago

No. The --from-layer argument defines the layer from which everything up to the newest layer (the first one in the docker history output) is squashed; you can also see that in the example. My point is the following (using the data from the example):

$ docker history jboss/wildfly:latest
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
25954e6d2300        3 weeks ago         /bin/sh -c #(nop) CMD ["/opt/jboss/wildfly/bi   0 B
5ae69cb454a5        3 weeks ago         /bin/sh -c #(nop) EXPOSE 8080/tcp               0 B
dc24712f35c4        3 weeks ago         /bin/sh -c #(nop) ENV LAUNCH_JBOSS_IN_BACKGRO   0 B
d929129d4c8e        3 weeks ago         /bin/sh -c cd $HOME     && curl -O https://do   160.8 MB
b8fa3caf7d6d        3 weeks ago         /bin/sh -c #(nop) ENV JBOSS_HOME=/opt/jboss/w   0 B
38b8f85e74bf        3 weeks ago         /bin/sh -c #(nop) ENV WILDFLY_SHA1=c0dd7552c5   0 B
ae79b646b9a9        3 weeks ago         /bin/sh -c #(nop) ENV WILDFLY_VERSION=10.0.0.   0 B
2b4606dc9dc7        3 weeks ago         /bin/sh -c #(nop) ENV JAVA_HOME=/usr/lib/jvm/   0 B
118fa9e33576        3 weeks ago         /bin/sh -c #(nop) USER [jboss]                  0 B
5f7e8f36c3bb        3 weeks ago         /bin/sh -c yum -y install java-1.8.0-openjdk-   197.4 MB
3d4d0228f161        3 weeks ago         /bin/sh -c #(nop) USER [root]                   0 B
f7ab4ea19708        3 weeks ago         /bin/sh -c #(nop) MAINTAINER Marek Goldmann <   0 B

Now suppose I want to squash only the layers from 5ae69cb454a5 to 118fa9e33576, using a proposed --to-layer argument:

$ docker-squash --from-layer 5ae69cb454a5 --to-layer 118fa9e33576 -t jboss/wildfly:squashed jboss/wildfly:latest

The result would be something like the following:

$ docker history jboss/wildfly:squashed
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
25954e6d2300        3 weeks ago         /bin/sh -c #(nop) CMD ["/opt/jboss/wildfly/bi   0 B
fde7edd2e568        32 seconds ago                                                      160.8 MB
5f7e8f36c3bb        3 weeks ago         /bin/sh -c yum -y install java-1.8.0-openjdk-   197.4 MB
3d4d0228f161        3 weeks ago         /bin/sh -c #(nop) USER [root]                   0 B
f7ab4ea19708        3 weeks ago         /bin/sh -c #(nop) MAINTAINER Marek Goldmann <   0 B

As you can see, the newest layer 25954e6d2300 is kept, the layers from 5ae69cb454a5 through 118fa9e33576 are squashed into fde7edd2e568, and all older layers below that range are kept as well.
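The selection logic behind this proposal can be sketched as follows. This is only an illustration of the requested behavior, not docker-squash code: the helper `split_layers` is hypothetical, and it assumes both ends of the range are inclusive and that layer IDs are ordered newest first, as docker history prints them.

```python
def split_layers(history, from_layer, to_layer):
    """Partition layer IDs into (kept_above, squashed, kept_below).

    history: layer IDs ordered newest first, as `docker history` prints them.
    from_layer: newest layer of the range to squash (assumed inclusive).
    to_layer: oldest layer of the range to squash (assumed inclusive).
    """
    start = history.index(from_layer)
    end = history.index(to_layer)
    if start > end:
        raise ValueError("from_layer must be newer than to_layer")
    return history[:start], history[start:end + 1], history[end + 1:]


# Layer IDs taken from the `docker history jboss/wildfly:latest` output above.
history = [
    "25954e6d2300", "5ae69cb454a5", "dc24712f35c4", "d929129d4c8e",
    "b8fa3caf7d6d", "38b8f85e74bf", "ae79b646b9a9", "2b4606dc9dc7",
    "118fa9e33576", "5f7e8f36c3bb", "3d4d0228f161", "f7ab4ea19708",
]
above, squashed, below = split_layers(history, "5ae69cb454a5", "118fa9e33576")
print(above)          # ['25954e6d2300']
print(len(squashed))  # 8
print(below)          # ['5f7e8f36c3bb', '3d4d0228f161', 'f7ab4ea19708']
```

The layers in `above` and `below` would stay untouched, so a registry could keep serving the cached copies of the older ones, while only the eight layers in `squashed` are merged into one.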

goldmann commented 3 years ago

This is the type of feature that was not covered originally. The idea was that images are structured by Dockerfile and you squash the content of a particular Dockerfile. If there is a need for the behavior described, I'm happy to review a contribution!