caddyserver / caddy-docker

Source for the official Caddy v2 Docker Image
https://hub.docker.com/_/caddy
Apache License 2.0

Caddy builder does not build smaller images as described in the documentation #217

Closed idontusenumbers closed 2 years ago

idontusenumbers commented 2 years ago

The documentation at https://hub.docker.com/_/caddy indicates:

Note the second FROM instruction - this produces a much smaller image by simply overlaying the newly-built binary on top of the regular caddy image.

This does not appear to be the case.

When I follow the instructions with this dockerfile:

FROM caddy:2.4.6-builder AS builder

RUN xcaddy build \
    --with github.com/gamalan/caddy-tlsredis

FROM caddy:2.4.6-alpine

RUN apk add --no-cache \
     curl \
  && rm -rf /tmp/* \
  && rm -rf /var/cache/apk/*

COPY --from=builder /usr/bin/caddy /usr/bin/caddy

COPY ./build/configs/bootstrap.sh /bootstrap.sh
CMD sh /bootstrap.sh

I get the following output from docker image history 0097cbf7c649:

IMAGE          CREATED        CREATED BY                                      SIZE      COMMENT
0097cbf7c649   8 weeks ago    CMD ["/bin/sh" "-c" "sh /bootstrap.sh"]         0B        buildkit.dockerfile.v0
<missing>      8 weeks ago    COPY ./build/configs/bootstrap.sh /bootstrap…   4.53kB    buildkit.dockerfile.v0
<missing>      8 weeks ago    COPY /usr/bin/caddy /usr/bin/caddy # buildkit   32.9MB    buildkit.dockerfile.v0
<missing>      8 weeks ago    RUN /bin/sh -c apk add --no-cache      ca-ce…   1.62MB    buildkit.dockerfile.v0
...
<missing>      4 months ago   /bin/sh -c [ ! -e /etc/nsswitch.conf ] && ec…   17B
<missing>      4 months ago   /bin/sh -c set -eux;  apkArch="$(apk --print…   31.1MB
<missing>      4 months ago   /bin/sh -c #(nop)  ENV CADDY_VERSION=v2.4.3     0B
<missing>      4 months ago   /bin/sh -c set -eux;  mkdir -p   /config/cad…   13kB
<missing>      4 months ago   /bin/sh -c apk add --no-cache ca-certificate…   580kB
<missing>      4 months ago   /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>      4 months ago   /bin/sh -c #(nop) ADD file:924de68748d5d7107…   5.36MB

The final image is 71.6MB, so almost entirely the two caddy binaries. Although the final image contains a single accessible binary, the layers combined contain two.

You can see two copies of the caddy binary being added: the set -eux; apkArch line toward the bottom (31.1MB) and the COPY /usr/bin/caddy line near the top (32.9MB).
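As a rough check of that breakdown, the layer sizes from the history output can be summed. This is only a sketch: the numbers are copied from the listing above, and it assumes the rows hidden behind "..." are metadata-only layers that add 0B.

```shell
#!/bin/sh
# Layer sizes in MB, copied from the `docker image history` output above.
# Assumption: the rows elided by "..." contribute 0B.
layers="
0.00453   COPY bootstrap.sh
32.9      COPY custom caddy binary from the builder stage
1.62      apk add in the final stage
0.000017  nsswitch.conf
31.1      stock caddy binary downloaded by the base image
0.013     mkdir for config and data directories
0.58      apk add ca-certificates in the base image
5.36      alpine base layer
"

# Sum the first column of every non-empty row.
total=$(printf '%s\n' "$layers" | awk 'NF { sum += $1 } END { printf "%.1f", sum }')

# The two layers that each hold a full caddy binary.
binaries=$(awk 'BEGIN { printf "%.1f", 32.9 + 31.1 }')

echo "all layers: ${total}MB, caddy binaries: ${binaries}MB"
```

Under those assumptions the two binary layers account for roughly 64MB of the total, which is consistent with the 71.6MB reported for the image.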

I rebuilt the image using the contents of the Dockerfile in this repo as the base, minus the step that installs the caddy binary, and it's now 42.8MB (down from 71.6MB).

It might be more valuable to produce a plain image without the actual caddy binary for use in builder scenarios. It would cut the final image in half.
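For reference, the saving implied by the two reported sizes can be worked out directly. This sketch uses only the 71.6MB and 42.8MB figures quoted above; nothing here is measured independently.

```shell
#!/bin/sh
# Image sizes in MB as reported in this comment.
before=71.6
after=42.8

# Absolute and relative saving from dropping the stock caddy binary layer.
saved=$(awk -v b="$before" -v a="$after" 'BEGIN { printf "%.1f", b - a }')
pct=$(awk -v b="$before" -v a="$after" 'BEGIN { printf "%.0f", (b - a) / b * 100 }')

echo "saved ${saved}MB (${pct}% smaller)"
```

That works out to a saving of about 28.8MB, or roughly 40% of the original image.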

francislavoie commented 2 years ago

The size comparison is with caddy:2.4.6-builder which includes the whole Go build toolchain.

But that's a good point, we could probably produce an image without the binary, that just has the support files.

hairyhenderson commented 2 years ago

It might be more valuable to produce a plain image without the actual caddy binary for use in builder scenarios. It would cut the final image in half.

But that's a good point, we could probably produce an image without the binary, that just has the support files.

That'd basically just be an alpine image with ca-certificates installed... I'm not sure it's worthwhile to produce that as a separate image (and definitely doesn't belong in the official images repo).

One approach is to build with --squash, which will squash it all down to a single layer, without keeping the overwritten binary - that'll make it smaller.

Perhaps the best approach here is to update the docs to refer to --squash, and clarify the "smaller" comment...

idontusenumbers commented 2 years ago

But that's a good point, we could probably produce an image without the binary, that just has the support files.

I don't believe that the builder is included in the final total for the resulting image as evidenced by the 30MB smaller image despite both using the builder.

That'd basically just be an alpine image with ca-certificates installed...

It's close. I think there are a few more bits in there that, although they don't meaningfully increase the size of the image, do set things up in a way caddy expects.

RUN apk add --no-cache \
    ca-certificates mailcap \
  && rm -rf /tmp/* \
  && rm -rf /var/cache/apk/*

RUN set -eux; \
    mkdir -p \
        /config/caddy \
        /data/caddy \
        /etc/caddy \
        /usr/share/caddy \
    ;

# https://github.com/caddyserver/caddy/releases
ENV CADDY_VERSION v2.4.6

# set up nsswitch.conf for Go's "netgo" implementation
# - https://github.com/docker-library/golang/blob/1eb096131592bcbc90aa3b97471811c798a93573/1.14/alpine3.12/Dockerfile#L9
RUN [ ! -e /etc/nsswitch.conf ] && echo 'hosts: files dns' > /etc/nsswitch.conf

# See https://caddyserver.com/docs/conventions#file-locations for details
ENV XDG_CONFIG_HOME /config
ENV XDG_DATA_HOME /data

VOLUME /config
VOLUME /data

EXPOSE 80
EXPOSE 443
EXPOSE 2019

WORKDIR /srv

I haven't attempted to eliminate any of these steps to see what happens.
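A minimal sketch of how those support steps could be combined with the builder stage so that the final image holds only one caddy binary. The output file name Dockerfile.caddy-custom, the alpine:3 base tag, and the my-caddy image tag are assumptions for illustration, not taken from the official image; the nsswitch.conf step is rewritten as an if statement so the RUN does not fail when the file already exists.

```shell
#!/bin/sh
# Write a two-stage Dockerfile whose final stage starts from plain Alpine
# instead of caddy:2.4.6-alpine, so the stock binary layer is never added.
# File name and the alpine:3 tag are hypothetical choices for this sketch.
cat > Dockerfile.caddy-custom <<'EOF'
FROM caddy:2.4.6-builder AS builder
RUN xcaddy build \
    --with github.com/gamalan/caddy-tlsredis

# Assumption: any recent Alpine tag works as the runtime base.
FROM alpine:3

RUN apk add --no-cache ca-certificates mailcap curl

RUN mkdir -p /config/caddy /data/caddy /etc/caddy /usr/share/caddy

# Only create nsswitch.conf when the base image does not ship one.
RUN if [ ! -e /etc/nsswitch.conf ]; then \
      echo 'hosts: files dns' > /etc/nsswitch.conf; \
    fi

ENV XDG_CONFIG_HOME=/config
ENV XDG_DATA_HOME=/data

VOLUME /config
VOLUME /data

EXPOSE 80
EXPOSE 443
EXPOSE 2019

WORKDIR /srv

# The only caddy binary in the final image is the custom build.
COPY --from=builder /usr/bin/caddy /usr/bin/caddy
EOF

# To build it (requires a Docker daemon):
#   docker build -f Dockerfile.caddy-custom -t my-caddy .
```

Unlike the official image, this sketch ships no default Caddyfile or welcome page, so a config file and a CMD (such as the bootstrap.sh used earlier in this thread) still have to be added.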

francislavoie commented 2 years ago

I don't believe that the builder is included in the final total for the resulting image as evidenced by the 30MB smaller image despite both using the builder.

Yes, of course not, because of multi stage builds. What I'm saying is that the reason the builder image exists is because the Go build toolchain is quite big, so if we tried to provide an image that you use to both build and run Caddy, then it would be pretty big.

FWIW I still think it would probably make sense to have a caddy:base image which is basically what you quoted @idontusenumbers, basically everything except the downloading of the Caddy binary from github. Then the actual final caddy:2.5.0 image could be that base plus the downloaded binary.

I'm not sure if Docker official images allow for those kinds of self-dependent images though (i.e. one tag depending on another tag in the same repo). We'd have to ask the Docker Official Images team if that's something they allow or not, because any changes we make need their approval.

hairyhenderson commented 2 years ago

At the risk of repeating myself, I still think the best way of saving space on a custom-built Caddy image (or, almost any Docker image, for that matter) is to build with --squash.

@idontusenumbers I'd be interested to hear if you've tried squashing it, and what kind of space savings you see with that.

mholt commented 2 years ago

(Pardon my chuckling at the irony of @idontusenumbers being asked to use numbers to calculate space savings 😆 )

idontusenumbers commented 2 years ago

(Pardon my chuckling at the irony of @idontusenumbers being asked to use numbers to calculate space savings 😆 )

Different numbers; it's an old name from the AIM and ICQ days; I preferred ICQ because I could change my name at will and there was no worry of a conflict with another user; all my friends were on AIM with numbers appended to their name to avoid the collisions, so to make fun of AIM I used idontusenumbers ;)

Adding a squash step to the build would work for caddy but it would be harmful for many of the other images as I would lose the savings of pushing up only the last few layers of an image that was largely shared with other images. I would prefer not to special case images that would benefit from the squash.

From what I can tell, the caddy dockerfiles are generated from a template. This would let you share code for the base image for a builder target with the non-builder images without relying on docker letting you use sibling images in the FROM directive.

hairyhenderson commented 2 years ago

[...] it would be harmful for many of the other images as I would lose the savings of pushing up only the last few layers of an image that was largely shared with other images.

Only the base alpine is shared, which is ~3-5MBs (off the top of my head), so squashing would get you 90% of the way there with no effort.

I'm open to a separately-templated base image, but it's unlikely that a tag containing no actual Caddy binary would be accepted in the official repo, so it would need to remain only in the unofficial caddy organization. I'm not sure if it's OK to reference non-official images in the official docs, either.

idontusenumbers commented 2 years ago

Only the base alpine is shared, which is ~3-5MBs (off the top of my head), so squashing would get you 90% of the way there with no effort.

I was only suggesting that unless I special-cased caddy, I'd be squashing all images in the project, which wouldn't be a great idea.

hairyhenderson commented 2 years ago

Fair enough - I'm obviously not familiar with your particular build process 😅

idontusenumbers commented 2 years ago

I run docker-compose build, which the caddy image is part of, and then have a Java program that loads the docker-compose file, enumerates the services, and pushes them to Google Container Registry.

hairyhenderson commented 2 years ago

Closing as I'm not sure there's anything further to be done here. If you believe there's something concrete to be done, feel free to re-open!

idontusenumbers commented 2 years ago

Concretely, I believe the builder process should be based on a caddy-less base image so the final image doesn't contain two copies of caddy. The suggestion to use squash, though it admittedly would reduce the final image size, would also reduce the benefit of layering.