Open officialpatterson opened 3 years ago
You can hack this using a multi-stage Docker build. It requires manually figuring out what Debian packages to include though. Here is an example I have that installs graphviz and its dependencies:
https://github.com/evanj/pprofweb/blob/aa0a1a2e87be02f1527f334620c4d45f6fb6ffd4/Dockerfile#L3-L7
... It is possible I should add something like this to the examples in distroless, because I've seen this asked a few times.
@evanj Does this also include the files in var/lib/dpkg/status or var/lib/dpkg/status.d/ that vulnerability scanners use to find which packages are installed in the resulting image?
The approach in the Dockerfile linked above extracts the contents of the listed .deb packages into the container image using dpkg --extract. A quick test on a package locally suggests that it does not include a status file. A bit of shell scripting could probably fix that? E.g. the following extracts the control file, which I believe is what gets appended to create /var/lib/dpkg/status:
dpkg --ctrl-tarfile (path to deb) | tar xvf - ./control
@evanj thank you, didn't know that possibility even exists, I will definitely try that one out and report back.
@evanj this worked like a charm, thank you again!
For others who may be looking at this issue, this is how I did it:
RUN cd /tmp && \
    apt-get update && \
    apt-get download \
        # .NET Core dependencies
        libc6 \
        libgcc1 \
        libgssapi-krb5-2 \
        libicu63 \
        libssl1.1 \
        libstdc++6 \
        zlib1g \
    && \
    mkdir -p /dpkg/var/lib/dpkg/status.d/ && \
    for deb in *.deb; do \
        package_name=$(dpkg-deb -I ${deb} | awk '/^ Package: .*$/ {print $2}'); \
        echo "Process: ${package_name}"; \
        dpkg --ctrl-tarfile $deb | tar -Oxvf - ./control > /dpkg/var/lib/dpkg/status.d/${package_name}; \
        dpkg --extract $deb /dpkg || exit 10; \
    done
Would it make sense to include @616b2f's example in the repo itself? At least for now, until distroless itself provides an extensible way of shipping additional libraries.
Currently the only example that includes a build stage is this Dockerfile, which does require a compiler to build some code but doesn't need to retain any of the installed libraries in the final distroless image.
This would also be helpful for other languages that require some system dependency (even Java has a few libraries that should be available for JNI).
@Sineaggi would make sense in my opinion. You can also copy a bigger example that may also solve another issue (that we don't have any dotnet distroless container anymore here). That's what I use to build a dotnet distroless container at the moment:
https://github.com/616b2f/distroless-dotnet/blob/main/Dockerfile
Slight note for anyone building off the base images (e.g. base-debian11) instead of cc or core: the libssl1.1 package control file is provided as status.d/libssl1 rather than under the correct package name, libssl1.1. The subtle gotcha is that vulnerability scanners will see the older file and any vulnerabilities there unless it is removed in favor of a manually deployed version.
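One way to spot this kind of mismatch is to compare each status.d file name with the Package: field inside it. The following is only a sketch: check_status_names is a hypothetical helper, and it assumes each entry is a plain control stanza like the ones written by the loop earlier in this thread. Files without a Package: field are skipped.

```shell
#!/bin/sh
# Hypothetical helper: list status.d entries whose file name differs from
# the "Package:" field stored inside the file.
check_status_names() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Take the first "Package:" line of the control stanza.
        pkg=$(awk '/^Package: / { print $2; exit }' "$f")
        # Skip files that carry no Package field at all.
        [ -n "$pkg" ] || continue
        name=$(basename "$f")
        if [ "$name" != "$pkg" ]; then
            printf '%s contains Package: %s\n' "$name" "$pkg"
        fi
    done
}
```

Running check_status_names against a copy of the image's var/lib/dpkg/status.d directory in a builder stage would flag an entry such as libssl1 before a second libssl1.1 file is added next to it.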
From what I understood, the problem faced here is installing a .deb package into the distroless image. I have a few questions @evanj @616b2f
dpkg --install --recursive /tmp to install all the packages in the target directory (where the .deb files were downloaded)?
COPY --from. The docs of dpkg describe the --root flag, which might be usable here (this is beyond my expertise, though). I think the idea is to "let the system think that the target directory is the root directory, e.g. '/', and run the dpkg utility inside it". Later we could just do something like COPY --from=deb_extractor /dpkg / to copy the installation into the distroless image.
How do you figure out the dependency tree?
Right now, I start with a clean image, run apt list --installed before and after installing, and diff the two lists, but this process is cumbersome.
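The before/after comparison can be scripted. This is a minimal sketch, assuming the two snapshots were saved to files and that each line looks like "name/suite,now version arch [installed]", as apt list --installed typically prints. The function names installed_names and new_packages are made up for illustration.

```shell
#!/bin/sh
# Strip saved "apt list --installed" output down to bare package names.
installed_names() {
    # Package lines contain a slash; the "Listing..." header does not.
    grep '/' "$1" | cut -d/ -f1 | sort -u
}

# Print packages present in the "after" snapshot but not in the "before" one.
new_packages() {
    before=$(mktemp)
    after=$(mktemp)
    installed_names "$1" > "$before"
    installed_names "$2" > "$after"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    comm -13 "$before" "$after"
    rm -f "$before" "$after"
}
```

Usage would be along the lines of apt list --installed > before.txt, installing the package, apt list --installed > after.txt, and then new_packages before.txt after.txt. Since apt warns that its command-line output is not meant for scripts, dpkg-query -W -f='${Package}\n' may be a sturdier source for the two snapshots.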
@tuananh I wrote up a small example here but the tl;dr is you need to first generate the list of dependencies:
$ apt-cache depends libpq5
libpq5
  Depends: libc6
  Depends: libgssapi-krb5-2
  Depends: libldap-2.4-2
  Depends: libssl1.1
Then pass those into apt-get download.
It's a bit cumbersome because you'd have to subtract whatever the base image already contains from the apt-cache depends output.
In the future I imagine distroless itself could be made a dependency of other Bazel builds, or some parts could become a CLI used to build images from dependency lists.
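The subtraction step can be sketched with grep, assuming two plain-text files with one package name per line: the required packages, and the packages the base image already ships. The helper name missing_packages and both file names below are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: drop packages the base image already contains.
# $1: file listing required packages, one name per line
# $2: file listing packages already in the base image, one name per line
missing_packages() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # grep exits 1 when nothing is left, which is not an error here.
    grep -Fxv -f "$2" "$1" || true
}
```

The result could then be fed to the download step, e.g. missing_packages deps.txt base.txt | xargs apt-get download. The whole-line match (-x) matters: without it, a base entry such as libssl1 would also filter out libssl1.1.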
@Sineaggi I think we need to use the --recurse flag as well
https://gist.github.com/tuananh/1e8e0f921410a830a7cd1161ff8bb189
usage: ./aptdeps.sh bash krb5-user etc...
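#!/bin/bash
set -eu
declare -a all_deps=()
for pkg_name in "$@"
do
  # Read the recursive "Depends:" entries for this package, one per array element.
  mapfile -t deps < <(apt-cache depends -i --recurse "$pkg_name" | awk -F 'Depends: ' 'NF>1{ sub(/ .*/,"",$NF); print $NF }' | sort -u)
  # Append as array elements; a plain += would glue the strings together.
  all_deps+=("${deps[@]}")
done
# Print the combined list without duplicates.
printf '%s\n' "${all_deps[@]}" | sort -u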
@tuananh, apt-cache depends accepts several package names, so there's no need for a loop
Any idea how to resolve the virtual packages instead of showing them?
Hi all,
I'm looking into using these images in my own development workflow, but I'm finding it hard to see how to add specific shared libraries where they're needed for a single app image without polluting the base image.
As a more concrete example, one of the apps has a dependency on libparquet-dev, so I want to include it in that app's image. However, I don't want to go down the route of including it in the base image, as I feel the base image should be as general as possible (that's the correct way of thinking, right?)
Any help appreciated!
AP