docker-library / httpd

Docker Official Image packaging for Apache HTTP Server
https://httpd.apache.org
Apache License 2.0

Save ~30MB and one layer in the Debian image by re-arranging apt packages #167

Closed cedricroijakkers closed 4 years ago

cedricroijakkers commented 4 years ago

The Alpine-based image is pretty much optimised when it comes to space usage. But there is still ~30MB to save in the Debian-based image.

In this pull request, I've split up the download of apt packages. The packages libapr1, libaprutil1, libnghttp2-14, libssl1.1, and libbrotli1 have been added to the list of run-time dependencies, and libapr1-dev and libaprutil1-dev have been moved to the compile-time dependencies, which will be automatically removed with apt-get purge -y --auto-remove at the end of the build.

Also, the steps that add the run-time dependencies and the build-time dependencies have been combined into one RUN command, which saves one layer in the Docker image. Because of this, some environment variables had to be moved towards the top of the Dockerfile.

Together, these two changes shrink the Docker image by ~30MB. The current image in the registry, once pulled and decompressed:

httpd                    2.4.46              6d82971d37d0        30 hours ago        166MB

Image after the modifications in this pull request:

httpd                    2.4.46              7adbaaacc87b        4 minutes ago       136MB
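
As a quick check of the numbers above, the saving can be worked out from the two reported sizes. This is only a sketch of the arithmetic; the variable names are mine and the sizes are the rounded values printed by docker images:

```shell
#!/bin/sh
# Image sizes in MB, taken from the two `docker images` lines above.
before=166
after=136

saved=$((before - after))
percent=$((saved * 100 / before))   # integer division, rounds down

echo "saved ${saved}MB (${percent}% of the original image)"
```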

yosifkit commented 4 years ago

Related to #160 and #163. But I am uncertain if the libapr-1.so and libaprutil-1.so included in libapr1-dev and libaprutil1-dev, respectively, are important for runtime.

I also don't want to break the images of anyone who builds their own httpd modules and might expect these headers. In the Alpine-based images, the goal is to be as small as possible, but the Debian-based images have more flexibility to include useful packages.

cedricroijakkers commented 4 years ago

Well, libaprutil-1.so and libapr-1.so are very much required by the httpd binary: they are the Apache Portable Runtime, so without them httpd would not even start, let alone load any modules. They are not installed by libapr1-dev or libaprutil1-dev but by libapr1 and libaprutil1 (without the -dev). You don't notice this because, for example, libaprutil1 is automatically installed as a dependency of libaprutil1-dev. Making libaprutil1-dev a build-only dependency causes a problem: at the end of the build, apt-get purge -y --auto-remove is run, which would remove libaprutil1 because it was an automatically installed dependency and the package manager therefore considers it no longer needed, while it very much is (the same holds for libapr1, of course):

$ docker run -it httpd:latest bash
root@ec5dfbca9bd9:/usr/local/apache2# ldd bin/httpd 
    linux-vdso.so.1 (0x00007ffd44ff0000)
    libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f305532c000)
    libaprutil-1.so.0 => /usr/lib/x86_64-linux-gnu/libaprutil-1.so.0 (0x00007f30552fe000)
    libapr-1.so.0 => /usr/lib/x86_64-linux-gnu/libapr-1.so.0 (0x00007f30552c5000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f30552a4000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f30550e3000)
    libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f30550da000)
    librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f30550ce000)
    libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f3055094000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f305508f000)
    libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f3055052000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f305544a000)
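
Whether apt regards a package as automatically installed, and therefore as a candidate for --auto-remove, can be checked with apt-mark showauto. A minimal sketch, assuming a Debian-based image where apt-mark is available; the helper names is_auto_installed and report_package are hypothetical:

```shell
#!/bin/sh
# Sketch: report whether apt considers a package automatically installed.
# Assumes a Debian-based image; the function names are illustrative only.

is_auto_installed() {
    # `apt-mark showauto PKG` prints PKG only if it is marked "auto".
    [ -n "$(apt-mark showauto "$1" 2>/dev/null)" ]
}

report_package() {
    if is_auto_installed "$1"; then
        echo "$1: auto (purge --auto-remove may delete it)"
    else
        echo "$1: not marked auto"
    fi
}

# Example use inside a container (not executed here):
#   report_package libapr1
#   report_package libaprutil1
```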

As for the possibility of people using these images as a base image and installing their own modules: that is correct, and they will need the -dev packages in order to build their modules inside the container, so I can understand your point. However, they would need to install other apt packages anyway, since gcc, make, and the rest of the build toolchain are not part of the current Debian-based image (keeping the toolchain inside the image would make it quite a lot larger, too):

$ docker run -it httpd:latest sh
# gcc
sh: 1: gcc: not found
# make
sh: 2: make: not found

So if they are doing this, they will need to at least install build-essential, and they might as well install libapr1-dev and libaprutil1-dev at the same time.
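
A rough sketch of what such a derived image could run during its build, assuming apxs from the httpd installation is on the PATH. The helper name build_module, the BUILD_DEPS list, and the module file name in the usage comment are hypothetical:

```shell
#!/bin/sh
# Sketch for a derived image: install the toolchain together with the APR
# headers, build one module with apxs, then purge the build-only packages.
# BUILD_DEPS and build_module are illustrative names, not part of the image.

BUILD_DEPS="build-essential libapr1-dev libaprutil1-dev"

build_module() {
    module_src="$1"
    # $BUILD_DEPS is left unquoted on purpose so it splits into package names.
    apt-get update &&
    apt-get install -y --no-install-recommends $BUILD_DEPS &&
    # apxs: -c compile, -i install, -a add a LoadModule line to httpd.conf
    apxs -i -a -c "$module_src" &&
    apt-get purge -y --auto-remove $BUILD_DEPS
}

# Example use in a Dockerfile RUN step (not executed here):
#   build_module mod_example.c
```

Purging the build dependencies in the same step keeps the toolchain out of the final layer, which is the same idea the official Dockerfile applies to its own build.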

I actually build a custom image on top of the public httpd image myself, and I have to install some extra packages anyway to build the modules I'm adding, so I don't think this will be a problem.

yosifkit commented 4 years ago

> They are not installed with libapr1-dev or libaprutil1-dev but with libapr1 and libaprutil1

I saw that there were .so files in each of the dev packages; apparently those are just the same symlink as the regular package (why ship it twice :confused: :angry: :-1: :hankey:).


So this change should just be removing the two lines. Moving things around to save the extra layer is unnecessary. libaprutil1-ldap is the only library not automatically kept by the apt-mark / find / ldd machinations.
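
For readers unfamiliar with the apt-mark / find / ldd approach, the idea can be sketched roughly as follows. This is an illustration of the technique, not the Dockerfile's exact text; the function names resolved_libs and owning_packages are hypothetical:

```shell
#!/bin/sh
# Sketch: mark every package that owns a shared library needed by the
# compiled binaries as "manual", so --auto-remove leaves it installed.

# Read `ldd` output on stdin and print the resolved library paths.
resolved_libs() {
    # Keep only "name => /path (addr)" lines; skip "not found" entries.
    awk '/=>/ && $(NF-1) ~ /^\// { print $(NF-1) }' | sort -u
}

# Read library paths on stdin and print the Debian packages that own them.
owning_packages() {
    # dpkg-query prints "package[:arch]: /path"; keep the package name.
    xargs -r dpkg-query --search | cut -d: -f1 | sort -u
}

# Intended use during the image build (not executed here):
#   find /usr/local -type f -executable -exec ldd '{}' ';' \
#     | resolved_libs | owning_packages | xargs -r apt-mark manual
#   apt-get purge -y --auto-remove
```

A library that is only loaded at run time and never linked directly does not show up in ldd output, so a pipeline like this cannot see it and it has to be listed explicitly.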

cedricroijakkers commented 4 years ago

I've updated the pull request according to your comments. Simply removing the two lines with the -dev packages breaks the build, because the packages are then not installed at compile time. I have therefore removed them from the run-time packages and added them to the build-time packages so that the build succeeds. The apt-mark / find / ldd machinery indeed makes sure that libapr1 and libaprutil1 are not removed, so there is no need to install them explicitly.

I've undone the shuffling of the commands to save a layer as requested.

cedricroijakkers commented 4 years ago

I also re-added the trailing newline at the end of the Dockerfile, so that the diff between the source and this PR is only the removal and addition of the apt packages.