GoogleContainerTools / distroless

🥑 Language focused docker images, minus the operating system.
Apache License 2.0

[Question] What is the best practice to add a custom Debian package to the distroless image? #1321

Open LucaMaurelliC opened 1 year ago

LucaMaurelliC commented 1 year ago

I wish to install some database clients, e.g. mssql or postgres. How can I add them to one of the distroless images?

loosebazooka commented 1 year ago

What distroless does with debian packages is extract data.tar.xz from the .deb and add it to the container:

  1. If you're comfortable using Bazel, you can use rules_oci to append a new layer (data.tar) extracted from the .deb.
  2. If not, you can do the same in a Dockerfile (ADD data.tar).
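
For the Dockerfile route, a minimal shell sketch of pulling the data archive out of a .deb might look like the following. It assumes `ar` (from binutils) is available on the build host; the function names `pick_data_member` and `extract_data_tar` are hypothetical and not part of distroless. The data member is looked up by name because its compression suffix varies between packages (for example .xz or .gz).

```shell
# Print the name of the data archive from an `ar t` listing read on stdin.
pick_data_member() {
    grep -E '^data\.tar(\..+)?$' | head -n 1
}

# Extract the data archive of the .deb given as $1 into directory $2
# and print the path of the extracted archive.
extract_data_tar() {
    deb=$1
    out=$2
    member=$(ar t "$deb" | pick_data_member)
    [ -n "$member" ] || { echo "no data.tar member in $deb" >&2; return 1; }
    mkdir -p "$out" || return 1
    # ar extracts into the current directory, so resolve the .deb path
    # first and then run ar from inside $out.
    abs_deb=$(cd "$(dirname "$deb")" && pwd)/$(basename "$deb")
    (cd "$out" && ar x "$abs_deb" "$member") || return 1
    printf '%s/%s\n' "$out" "$member"
}
```

The extracted archive can then be referenced from a Dockerfile `ADD` instruction, which unpacks local tar archives (including gzip- and xz-compressed ones) into the image.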

LucaMaurelliC commented 1 year ago

I'm educating myself about Debian from here: https://www.debian.org/doc/manuals/debian-faq/pkg-basics.en.html

This is what I'm thinking, please let me know if my idea is right:

  1. Extract the .deb archive: ar x package.deb
  2. Extract the control archive. This should contain the package metadata and installation scripts: tar -xf control.tar.xz
  3. Extract the data archive. This should contain the actual files of the package: tar -xf data.tar.xz
  4. Copy the files to their appropriate locations. How? Should I use the control file? Something like: cp -R usr/* /
  5. Should I run the database update? sudo updatedb
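
One way to approach step 4 is to unpack the data archive into an empty staging directory instead of copying into the live root. The paths inside the archive are already relative to /, so the control file is not needed to place them, and `updatedb` only refreshes the index used by `locate`, so it plays no part in installing a package. A minimal sketch, assuming GNU tar (which detects the compression when extracting); `stage_data_tar` and `rootfs` are hypothetical names:

```shell
# Unpack a data archive ($1) into a staging directory ($2).
stage_data_tar() {
    archive=$1
    rootfs=$2
    mkdir -p "$rootfs" || return 1
    # Paths in the archive look like ./usr/bin/foo, so extracting under
    # $rootfs reproduces the layout the package expects under /.
    tar -xf "$archive" -C "$rootfs"
}

# Example (file names are placeholders):
#   ar x package.deb
#   stage_data_tar data.tar.xz rootfs
#   find rootfs -type f    # review what would land in the image
```

Maintainer scripts from the control archive (for example postinst) are not run by this approach, so a package that relies on them may need extra manual steps.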

chris-harness commented 1 year ago

@LucaMaurelliC What you're attempting doesn't really fit the intent of Distroless images, which best support use cases where a single app is built and the resulting files are then copied directly on top of the Distroless base image. You'd probably have a much easier time if you simply used a lightweight and well-maintained base image, like Alpine or Ubuntu.

You could theoretically do what you're asking on top of Distroless, but it would be a lot of work. You'd probably want to install the necessary packages and all of their dependencies in the standard Debian image as part of an intermediate stage in the Dockerfile. You'd then need to figure out EVERY file that was created by all those package installs and COPY each one of them into the final Distroless stage of your Dockerfile, in exactly the same places. Even after all of that, there might still be some issues that would need to be figured out.
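
The "figure out every file" step can be partly automated inside the intermediate Debian stage, since `dpkg -L` lists the paths a package installed. A rough sketch, assuming GNU tar; `files_only` and `archive_package_files` are hypothetical helper names, and each dependency would need the same treatment:

```shell
# Read paths on stdin and print those that are regular files or symlinks,
# dropping directories and anything that does not exist (dpkg -L lists
# directories as well as files).
files_only() {
    while IFS= read -r path; do
        if [ -L "$path" ] || [ -f "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Archive the files owned by package $1 into tarball $2, with paths
# stored relative to / so the tarball can be unpacked at the image root.
# Use an absolute path for $2.
archive_package_files() {
    pkg=$1
    out=$2
    dpkg -L "$pkg" | files_only | sed 's|^/||' | tar -cf "$out" -C / -T -
}
```

The tarball built in the Debian stage could then be unpacked into the final Distroless stage. Files created by maintainer scripts and shared libraries owned by other packages are not captured this way and still have to be checked by hand.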

LucaMaurelliC commented 1 year ago

Thanks @chris-harness for the insight. At the moment, I am using the official Python slim image based on Debian to run my "single application", which is basically Python code (with dependencies managed by pip) plus the mssql driver that the app uses to talk to the database. Copying the app files is already handled: I install the Python packages in a virtual environment. The SQL database driver, however, is installed separately as a system-level package using Debian utilities (e.g. apt-get). Do you think this configuration is not suited (and should not be suited) to distroless images?

53845714nF commented 9 months ago

Very interesting discussion. I would like to install libmagic1 in an image, and I imagine that Python libraries often depend on OS packages.

ptrba commented 9 months ago

https://github.com/bazel-contrib/rules_debian_packages does the job. In my case I added the mysql client to the distroless python base image as follows:

oci_image(
    name = ...,
    base = "@distroless_python",
    entrypoint = ...,
    tars = [debian_package_layer("default-mysql-client"),...],
)