RFE: container-native rpmdb format

cgwalters commented 2 years ago

OCI/Docker container images are built as a series of layers. In overlayfs, modified files are "copied up" in their entirety, even if only a few bytes changed.

There are two issues:

- rpms have a lot of metadata, so the rpmdb can be of nontrivial size
- The current main rpmdb format (sqlite) being a single big file means that it gets duplicated in each derived image

Consider a common case such as:

FROM registry.fedoraproject.org/fedora:35
RUN dnf -y install cowsay && dnf clean all

The resulting tar layer from the RUN command has an entirely duplicated copy of the rpm database, just with cowsay and its dependencies added.

It's quite common for builds like this to actually form the base image for further images - and this duplication gets compounded.
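
To make the compounding concrete, here is a rough back-of-the-envelope sketch in Python. The sizes are made-up assumptions for illustration (a 20 MiB base rpmdb that grows by 0.5 MiB per derived layer), not measurements, and the function names are hypothetical.

```python
MIB = 1024 * 1024


def single_file_total(base_size, growth_per_layer, derived_layers):
    """Bytes of rpmdb stored across all layers when the db is one big file.

    Every layer that touches the database carries a full copy of it.
    """
    total = base_size  # the copy shipped in the base image
    size = base_size
    for _ in range(derived_layers):
        size += growth_per_layer
        total += size  # the whole file is copied up into this layer
    return total


def fragment_total(base_size, growth_per_layer, derived_layers):
    """Bytes stored when each layer only adds a new fragment file."""
    return base_size + growth_per_layer * derived_layers


# Hypothetical sizes, chosen only to show the trend.
base = 20 * MIB
growth = MIB // 2
for layers in (1, 3, 5):
    print(
        layers,
        single_file_total(base, growth, layers) / MIB,
        fragment_total(base, growth, layers) / MIB,
    )
```

With a single file the total grows roughly as (layers + 1) times the database size, while with per-layer fragments it grows only by what each layer actually adds.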

A strawman proposal here is something like /usr/lib/sysimage/rpmdb.d with something simple like zstd-compressed JSON files storing data instead. The files could be named something simple like 0000.json.zstd and then later changes which add packages add just a new 0001.json.zstd file or so. Or, we could keep sqlite and union those; I don't have a really strong opinion. (Well, not really "union" literally but more "merge", since we should support removing or upgrading packages from prior layers)
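
A minimal sketch of the "merge" semantics, in Python. Everything about the fragment layout here is an assumption for illustration: each fragment is a JSON object with optional `install` and `remove` keys, packages are keyed by name only (which ignores multi-version installs such as kernels), and compression is left out so the example needs only the standard library.

```python
import json
from pathlib import Path


def merge_fragments(db_dir):
    """Replay fragments in filename order and return {name: package record}.

    Hypothetical fragment layout:
      {"install": [{"name": ..., "version": ...}, ...], "remove": ["name", ...]}
    """
    packages = {}
    # Zero-padded names (0000.json, 0001.json, ...) sort in layer order.
    for path in sorted(Path(db_dir).glob("*.json")):
        fragment = json.loads(path.read_text())
        for name in fragment.get("remove", []):
            packages.pop(name, None)  # removal of a package from a prior layer
        for pkg in fragment.get("install", []):
            packages[pkg["name"]] = pkg  # an upgrade overwrites the older record
    return packages
```

A derived layer that installs cowsay would then only need to ship a small `0001.json` listing cowsay and its dependencies under `install`; the base layer's `0000.json` stays untouched and is not copied up.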

voxik commented 2 years ago

This is either related to or a duplicate of #1885.

cgwalters commented 2 years ago

You're right, this overlaps a lot with previous threads; however, this one is much more about OCI/Docker containers than the previous ones.

DemiMarie commented 2 years ago

There are two approaches to fix this:

1. Use a storage engine that provides block-level copy-on-write, rather than file-level copy-on-write. BTRFS, ZFS, and device-mapper satisfy this requirement, as does overlay2 on a filesystem supporting reflinks. This does not help image layer sizes, however.
2. Use a one-file-per-entry approach.
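
To illustrate why one file per entry helps layer sizes, here is a small Python sketch that computes what a derived layer's tarball would have to contain for the database directory. The `.hdr` file naming is an assumption made up for this example; the `.wh.` prefix is the whiteout marker used in OCI image layers to record a deleted file.

```python
def layer_diff(lower, upper):
    """Return (added_files, whiteouts) for a one-file-per-package store.

    `lower` and `upper` are sets of NEVRA strings describing the installed
    packages before and after the layer's transaction. Hypothetical layout:
    each package's header lives in "<NEVRA>.hdr".
    """
    added = sorted(f"{nevra}.hdr" for nevra in upper - lower)
    # Removed (or upgraded-away) packages become whiteout entries.
    whiteouts = sorted(f".wh.{nevra}.hdr" for nevra in lower - upper)
    return added, whiteouts
```

Installing one package on top of a base image yields one new small file and no whiteouts; an upgrade yields one new file plus one whiteout. In neither case are the untouched headers from lower layers duplicated.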

lnussel commented 2 years ago

The ability to overlay a tree with extra packages is also part of the motivation for #1959. Meanwhile, a quick-and-dirty solution for containers would be to add e.g. a plugin to dnf that dumps the database after the transaction and then deletes the database. On startup it could import those headers again if no db exists yet (cat * | rpmdb --importdb).
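
The import half of that idea could look roughly like the following Python sketch. It is only an illustration of the `cat * | rpmdb --importdb` one-liner, not an existing dnf plugin: the function name and the directory of dumped headers are hypothetical, and the command is a parameter so the piping logic can be exercised without touching a real rpmdb.

```python
import subprocess
from pathlib import Path


def import_headers(header_dir, cmd=("rpmdb", "--importdb")):
    """Concatenate every dumped header file and pipe the stream to `cmd`.

    Equivalent in spirit to: cat header_dir/* | rpmdb --importdb
    """
    blob = b"".join(
        path.read_bytes()
        for path in sorted(Path(header_dir).iterdir())
        if path.is_file()
    )
    # check=True raises CalledProcessError if the import command fails.
    return subprocess.run(list(cmd), input=blob, check=True, capture_output=True)
```

Because each layer would contribute only its own header files to the directory, the database itself never needs to be stored in the image; it is rebuilt on startup when no db exists yet.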

https://github.com/lnussel/toy/blob/master/dumpheaders.c stores rpm headers in separate files, in contrast to rpmdb --exportdb. #1959 also contains commits that allow using packages (with or without payload) for that purpose.

rpm-software-management / rpm

RFE: container-native rpmdb format #2005