uber-archive / makisu

Fast and flexible Docker image building tool, works in unprivileged containerized environments like Mesos and Kubernetes.
Apache License 2.0

With some special base images, image size drastically larger than when built using docker #243

Open cyrus-mc opened 5 years ago

cyrus-mc commented 5 years ago

I'm new to makisu and just built my first image. After a successful build, the image is 1.49G (as reported by docker images). The same image built with docker is 811M.

FROM runatlantis/atlantis:v0.8.2

# install terragrunt binary

RUN curl -LOks https://github.com/gruntwork-io/terragrunt/releases/download/v${TERRAGRUNT_VERSION}/terragrunt_linux_amd64  && \
      mv terragrunt_linux_amd64 /usr/local/bin/terragrunt && \
      chmod +x /usr/local/bin/terragrunt && ln -s /usr/local/bin/terragrunt /usr/local/bin/terragrunt0.18

RUN curl -LOks https://github.com/gruntwork-io/terragrunt/releases/download/v${TERRAGRUNT_VERSION}/terragrunt_linux_amd64  && \
      mv terragrunt_linux_amd64 /usr/local/bin/terragrunt0.19 && \
      chmod +x /usr/local/bin/terragrunt0.19

# Versions: https://pypi.python.org/pypi/awscli#downloads
ENV AWS_CLI_VERSION 1.16.157

RUN apk update && \
    apk add python \
            py-pip py-setuptools \
            ca-certificates \
            ansible \
            jq && \
    pip --no-cache-dir install awscli==${AWS_CLI_VERSION} && \
    rm -rf /var/cache/apk/*

COPY config/aws /home/atlantis/.aws

# copy our custom plugins/providers
#COPY plugins /home/atlantis/.terraform.d/plugins

# copy updated providers
COPY bin /usr/local/bin/

# link terraform to version 0.11.14
RUN rm -f /usr/local/bin/terraform && \
    ln -s /usr/local/bin/terraform0.11.14 /usr/local/bin/terraform

Inspecting the resulting image and comparing that to the one that docker build produces shows different layers (both in hash and number).

Inspect of makisu image

        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:bcf2f368fe234217249e00ad9d762d8f1a3156d60c442ed92079fa5b120634a1",
                "sha256:32717ade1adbfd23bf4489ac1fdb862888c916eaa787322419ce87e95ed83bbf",
                "sha256:4851334180e692b67a53046bfbb1337d9e899d151f0476fbd9cdb21316fc455d",
                "sha256:f82e81ba32a7b279efcc14baf720a5a4f41e21faf5543d53ca378c1538f10b6e",
                "sha256:a723cda4dfd776bda07ccc96cca49b0bef5647586fc3c161393f3181ed3bc359",
                "sha256:0c3fb05691d830512a35ba4fec5bc96bcd104b48dc3fb649a5e6a4379b660698",
                "sha256:9b6cb2a46269f90fce16234610a28f2dee5a30f6050b0972057e29568d95f098",
                "sha256:321c00e3b0abba69b12da8e73e427e01a371c725d64eec2ffa9827dfd0c5ad85"
            ]
        }

Inspect of docker image

        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:bcf2f368fe234217249e00ad9d762d8f1a3156d60c442ed92079fa5b120634a1",
                "sha256:32717ade1adbfd23bf4489ac1fdb862888c916eaa787322419ce87e95ed83bbf",
                "sha256:4851334180e692b67a53046bfbb1337d9e899d151f0476fbd9cdb21316fc455d",
                "sha256:f82e81ba32a7b279efcc14baf720a5a4f41e21faf5543d53ca378c1538f10b6e",
                "sha256:a723cda4dfd776bda07ccc96cca49b0bef5647586fc3c161393f3181ed3bc359",
                "sha256:0c3fb05691d830512a35ba4fec5bc96bcd104b48dc3fb649a5e6a4379b660698",
                "sha256:9b6cb2a46269f90fce16234610a28f2dee5a30f6050b0972057e29568d95f098",
                "sha256:4450521046838f5252bc6115485283c2796ac07000be72b38857103e209263a0",
                "sha256:9ed78fb9f246e4663814e31971db3f3968cfc37dabf2f02f99d2d6b4ff6a9d74",
                "sha256:90234a343ee2efd14fa48ac52b3aa79e0a38456d3e3ee575842c34c034a30ca9",
                "sha256:3bb3295ac4b5f5eb1adc809ebb668ab9460d57bb51a5802e1ab1120fe7263d62",
                "sha256:354c38982758323f3785a541d9f957279c4454ad67a0670ccfc412ea68ca60d5",
                "sha256:27fdcb5138781980d388d3be3363cd8964901f2e855e57abeedb3c9cb4a4ed64"
            ]
        }

From the looks of it, the first 7 layers are common (and actually come from the base image runatlantis/atlantis:v0.8.2). In makisu it appears that all of the layers generated from my Dockerfile are squashed into one (which would make caching difficult, no?), whereas in the docker build version they are not.
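
The shared-prefix observation can be checked mechanically. Below is a minimal Go sketch, not part of makisu: `commonPrefixLen` is a hypothetical helper, and the digests are the ones from the two inspect outputs above, truncated to their first 12 hex characters for brevity.

```go
package main

import "fmt"

// commonPrefixLen is a hypothetical helper: it returns how many leading
// entries the two layer lists have in common.
func commonPrefixLen(a, b []string) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

func main() {
	// Layer digests from the makisu-built image (truncated to 12 hex chars).
	makisu := []string{
		"bcf2f368fe23", "32717ade1adb", "4851334180e6", "f82e81ba32a7",
		"a723cda4dfd7", "0c3fb05691d8", "9b6cb2a46269", "321c00e3b0ab",
	}
	// Layer digests from the docker-built image (truncated to 12 hex chars).
	docker := []string{
		"bcf2f368fe23", "32717ade1adb", "4851334180e6", "f82e81ba32a7",
		"a723cda4dfd7", "0c3fb05691d8", "9b6cb2a46269", "445052104683",
		"9ed78fb9f246", "90234a343ee2", "3bb3295ac4b5", "354c38982758",
		"27fdcb513878",
	}

	shared := commonPrefixLen(makisu, docker)
	// Prints the shared layer count, then the layers unique to each build.
	fmt.Println(shared, len(makisu)-shared, len(docker)-shared) // 7 1 6
}
```

The 6 layers unique to the docker build match the 6 RUN and COPY instructions in the Dockerfile, while the makisu build has a single layer beyond the shared base.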

yiranwang52 commented 5 years ago

Thanks for the analysis. We found this issue in our environment recently as well, and fixed that in #239. We will create a new release soon.

cyrus-mc commented 5 years ago

I built master (v0.1.11-10-g1b1102a) and re-ran. The image size is better: the result is only 1.06G now (vs 1.54G before). However, docker build still produces an 811M image.

The image layers still show as indicated above. No difference.

yiranwang52 commented 5 years ago

Did you set commit=explicit? If you did, then it's expected that you get fewer layers, see https://github.com/uber/makisu#explicit-commit-and-cache

About the size, I need a little time to debug.

cyrus-mc commented 5 years ago

Here is some more info

makisu image:

› docker history my_image
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
174570a58198        2 minutes ago       makisu: RUN rm -f /usr/local/bin/terraform &…   111MB
<missing>           2 minutes ago       makisu: COPY bin /usr/local/bin/  (c0a9630e)    111MB
<missing>           2 minutes ago       makisu: COPY config/aws /home/atlantis/.aws …   29B
<missing>           2 minutes ago       makisu: RUN apk update &&     apk add python…   217MB
<missing>           3 minutes ago       makisu: RUN curl -LOks https://github.com/gr…   9B
<missing>           3 minutes ago       makisu: RUN curl -LOks https://github.com/gr…   294MB

docker image:

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
13ecb764a55b        7 days ago          /bin/sh -c rm -f /usr/local/bin/terraform &&…   31B
507e0de60879        7 days ago          /bin/sh -c #(nop) COPY dir:f139a048ea560915c…   111MB
639ff8952e0a        7 days ago          /bin/sh -c #(nop) COPY dir:71fc8ec224dc3156c…   29B
7e589924db1c        7 days ago          /bin/sh -c apk update &&     apk add python …   217MB
d67b2cbeb26c        7 days ago          /bin/sh -c #(nop)  ENV AWS_CLI_VERSION=1.16.…   0B
13c98269605f        7 days ago          /bin/sh -c curl -LOks https://github.com/gru…   28.2MB
216f6f7e82cf        7 days ago          /bin/sh -c curl -LOks https://github.com/gru…   18.6MB

cyrus-mc commented 5 years ago

The first layer listed there is interesting. It comes from the Dockerfile step that simply removes a file and creates a symlink. Docker reports it as a 31B layer, whereas makisu produces a 111MB layer, the same size as the previous layer.

In addition, the curl commands that install terragrunt seem off in the makisu version: one of those layers is 294MB, while the other is only 9B.

yiranwang52 commented 5 years ago

I think it's caused by changes we made to https://github.com/uber/makisu/blob/master/lib/tario/compare.go

yiranwang52 commented 5 years ago

I did more experiments and found that this only happens for some special base images like runatlantis/atlantis. Other base images like debian:9 don't have this issue.

The reason is that makisu includes in the current layer every file it considers different from previous layers. In this particular case with runatlantis/atlantis, most files are considered different because this condition is hit: https://github.com/uber/makisu/blob/5a0da448b6315f098e0cb26823c6986c202b8ef7/lib/tario/compare.go#L25

The base image's layer tars somehow record these files as TypeRegA, while after running the build steps makisu considers them TypeReg. Technically the two are the same thing, and TypeRegA is a deprecated type, so I am not sure how the base image was built to produce such a result. Maybe it was built with an old version of docker or on some special filesystem?

Regardless, this is still a minor bug we should fix.
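
One way such a comparison could tolerate the mismatch is to normalize the type flag before comparing. This is a minimal sketch and not necessarily the change makisu made: `normalizeTypeflag` and `sameFileType` are hypothetical names, while `tar.TypeReg` and the deprecated `tar.TypeRegA` are the constants from Go's standard `archive/tar` package.

```go
package main

import (
	"archive/tar"
	"fmt"
)

// normalizeTypeflag is a hypothetical helper (not makisu code): it maps the
// deprecated tar.TypeRegA ('\x00') onto tar.TypeReg ('0'), because both flags
// denote a regular file. All other flags are returned unchanged.
func normalizeTypeflag(flag byte) byte {
	if flag == tar.TypeRegA {
		return tar.TypeReg
	}
	return flag
}

// sameFileType reports whether two headers describe the same kind of entry,
// ignoring the TypeReg/TypeRegA distinction.
func sameFileType(a, b *tar.Header) bool {
	return normalizeTypeflag(a.Typeflag) == normalizeTypeflag(b.Typeflag)
}

func main() {
	// The same regular file as it might appear in the base image layer
	// (TypeRegA) and after a build step (TypeReg). The name is illustrative.
	base := &tar.Header{Name: "usr/bin/tool", Typeflag: tar.TypeRegA}
	built := &tar.Header{Name: "usr/bin/tool", Typeflag: tar.TypeReg}

	fmt.Println(base.Typeflag == built.Typeflag) // false: a raw comparison sees a difference
	fmt.Println(sameFileType(base, built))       // true: normalized flags match
}
```

With a check like this, a file whose only difference is the legacy type flag would no longer be treated as changed, so it would not be copied into the new layer.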

yiranwang52 commented 5 years ago

I think there is also a size problem with hard links.