Closed dmnks closed 1 year ago
So, to summarize, the goal of this ticket is to merge `mktree.fedora` and `mktree.podman` such that a host-specific `Dockerfile.<OSNAME>` is used to build the image, instead of a `mktree.<OSNAME>` script that does that from scratch. It also has to support reusing the local build artifacts for local development. This way, we'll delegate the image creation process to the appropriate tooling where it really belongs. We'll only need to keep a (bunch of) `Dockerfile`s which are well-defined and easy to maintain.
:+1: to this from a high-level viewpoint; the Dockerfile is widely understood, let's not reinvent a wheel we don't have to. The other important point here is to avoid tests behaving differently locally and in CI (AIUI CI didn't need the dbus-libs tweak in 4712c9c37bbf28f56f1e386df788dac440cc4cb8 but local usage does, and that's unwanted double-maintenance if nothing else)
Yup, just the fact that we currently have two distinct ways to create a test image is awkward and just attracts all kinds of bugs :laughing:
Yep, to me that'd be among the top issues to address.
Ack, it's a confirmed bug now and will be dealt with like any other.
I've come to realize that, by losing host-specific mktree scripts (e.g. `mktree.fedora`), we would also lose a couple of nice features along with it, namely:
The recently added `make env` target runs a shell on the host which mounts a test root at `$RPMTEST` that can be used in the same way as in a normal test, i.e. to run RPM in it with `runroot rpm` and to inspect/modify the filesystem with the host tooling.

This, quite conveniently, also applies to software installation in the test tree. One can now use the host's native package manager to extend the tree when doing interactive testing with `make env`. In fact, on Fedora, `make env` defines an alias which basically amounts to `dnf --installroot=$RPMTEST` to make this process easier. Naturally, the full command is identical to the one needed to bootstrap the tree in the first place.
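For illustration, such a wrapper could be sketched roughly like this (the function name, the `RPMTEST` default, and the absence of extra dnf options are my assumptions, not the actual alias `make env` defines; the sketch only prints the command, since really running dnf needs a proper tree and privileges/namespaces):

```shell
# Hypothetical sketch of a Fedora-style wrapper around the test tree.
# The RPMTEST default and the function name are illustrative assumptions.
RPMTEST="${RPMTEST:-$HOME/rpmtest-root}"

rpmtest_dnf() {
    # Same command shape as the one used to bootstrap the tree, so
    # interactive installs stay consistent with the initial build.
    # (echo only: this sketch prints the command instead of running it)
    echo dnf --installroot="$RPMTEST" "$@"
}

rpmtest_dnf install vim-minimal
```

Running it prints the dnf command line it would execute against `$RPMTEST`, which is the same shape as the bootstrap command.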
By contrast, the Dockerfile-native way is flipped inside out. It assumes that software is installed from within the base image, which ships with the native package manager from the start. However, that's not compatible with our needs, since we're actually developing a component in the very same packaging stack.
Therefore, separating the management of the test tree from the actual testing makes more sense. It's also why the test logic has always run outside of the test tree, not inside of it.
So, by going full Podman/Docker for image management, we would lose the host-specific wrappers (`dnf --installroot=$RPMTEST ...` on Fedora) in `make env`. Or, we could keep the wrappers, but at that point, we might as well reuse those wrappers to actually set up the tree and avoid the extra Podman step, putting us back to where we are now :smile:

By bootstrapping the tree from scratch, we can control what software goes into it without resorting to a `dnf remove` or `rpm -e --nodeps` in an equivalent Dockerfile. Most notably, as mentioned in the description above, this pertains to the RPM installation itself, which we need to purge before installing the local one.
Furthermore, if we were to go even further and use Podman/Docker to also spawn the wrapping containers (i.e. `mktree.podman` being the default), we would also lose this in local development:

This is currently very useful in `make shell` and `make env` as it avoids the need to throw away your changes to the container each time you make a source change in RPM.

By using Podman/Docker to spawn the container, we would have to either `make install` the new artifacts over the ones already present in the container, or rebuild the image, thus having to recreate the container too.

While the former (redo `make install`) may not be a problem in most cases, it certainly isn't as clean as simply recreating the installation directory from scratch, which is what we currently do in `mktree.fedora`.

The latter (recreating the container on each change) would seriously hamper any useful container-based development.
> The other important point here is to avoid tests behaving differently locally and in CI (AIUI CI didn't need the dbus-libs tweak in 4712c9c but local usage does, and that's unwanted double-maintenance if nothing else)
Actually, adding `dbus-libs` to the Dockerfile wasn't needed because it already contains it :smile: It gets pulled in as a dependency of `dbus-devel`, which we install there in order to build RPM (it's not needed in `mktree.fedora`, obviously).
And yes, that's still double-maintenance in a sense, but it's due to the fact that we do not need build deps of RPM in the local testing scenario, as those are installed on the host (RPM is built on the host). So, not really much of a problem.
One way to fix that would be to simply build and test with the `mktree.fedora` backend even in the CI, but from a Fedora container, due to the build deps. The downside of that would be increased execution time in CI, since we would have to 1) build the Podman image for building RPM and then 2) set up the test tree from scratch with DNF, just like on the host.

That's basically the reason for the existence of `mktree.podman`, which reuses the build image for running the tests too. Basically, it's an optimization for CI, nothing else. It's not suitable for local development, though, due to the reasons outlined in my previous comments.
To summarize again, these are our options to resolve the ticket:

**Option 1**
- Image building:
- Container layering:
- Pros:
  - `make shell` development, without needing to throw the container away on each rebuild
  - `make env` that includes a DNF/Zypper/... wrapper for managing software in `$RPMTEST` (same wrapper as used to build the image)
- Cons:

**Option 2**
- Image building:
- Container layering:
- Pros:
- Cons:
  - `make env` (no host-specific package manager wrapper, needs manual use; e.g. `dnf --installroot=$RPMTEST` works but is not optimal, needs more options for local cache reuse or to suppress warnings, all of which is part of `mktree.fedora` already)

**Option 3**
- Image building:
- Container layering: (`podman run`)
- Pros:
- Cons:
  - `make env` (mounting `$RPMTEST` would require a different containerization stack than what's used in the test-suite)
  - `make shell` (user changes to the container are dropped on each RPM rebuild); `podman image mount` or `podman unshare`
In terms of usefulness/viability, the worst is option 3, followed by option 2 and then 1.
Basically, the biggest drawbacks of 2) and 3) are in the area of local development with `make shell` and `make env`. These are new features that weren't there before the recent test-suite rework, so making them worse wouldn't be the end of the world.
However, these features also are some of the niceties that the test-suite rework has allowed for, and throwing them away would be a pity.
> Problem
>
> It turns out that, while this is quite easy to do on Fedora with DNF and `unshare(1)`, it is not as easy on other distros that I've tried, namely OpenSUSE where Zypper doesn't seem to like being run through `unshare(1)`, and would likely require `sudo` instead
Hmm, not sure what I did wrong when trying this out originally, but now I've just tried again by doing an `unshare -rm --mapauto zypper --root ...` and it seemed to work just fine and installed the filesystem into the specified directory.
Ah, OK, I think this is the error I saw initially (and now too):
```
error: unpacking of archive failed on file /dev: cpio: chown failed - Device or resource busy
error: filesystem-84.87-12.1.i586: install failed
( 4/27) Installing: filesystem-84.87-12.1.i586 .........................................................................[error]
Installation of filesystem-84.87-12.1.i586 failed:
Error: Subprocess failed. Error: RPM failed: Command exited with status 1.
```
It's still possible to select "ignore" in the prompt and continue without any further errors, though.
Anyway. Having slept on this a couple of times, it's become clear that we just don't want to deal with all this bootstrapping business and basically re-implement mkosi (which we can't use for other reasons, as outlined above).
So, circling back to where we started, all we really need is outsource the base image creation to OCI and be done with it. That means, option 2 from the comment above is what I'm going to implement.
Thanks for your attention :laughing:
> Actually, adding `dbus-libs` to the Dockerfile wasn't needed because it already contains it :smile: It gets pulled in as a dependency of `dbus-devel` which we install there in order to build RPM (it's not needed in `mktree.fedora` obviously).
Oh but it wasn't added to the Dockerfile but mktree.fedora, where on that third (?) level of inception dbus-devel is not installed because that's just rpm's own build-dependency whereas mktree.fedora prepares an environment for just the test-suite. :smile:
One could also look at that from the point of separating rpm's build-requires and test-requires, but it doesn't seem meaningful. We'll want to be able to do stuff like use cmake to compile something against rpm in the test-suite (but that's yet another story). The test-environment equaling the build-environment seems the path of least maintenance. As with the current CI environment, AIUI. OTOH I could also be utterly confused :zany_face: :smile:
Edit: I remember now (after re-reading some of the commentary in this ticket): the mktree.fedora case was needed to allow testing the locally built binary against that distro version, whereas the CI case is different. That was how and why it was so special. I'll just stock up on some popcorn and sit back to see what happens in this space :grin:
> Oh but it wasn't added to the Dockerfile but mktree.fedora, where on that third (?) level of inception dbus-devel is not installed because that's just rpm's own build-dependency whereas mktree.fedora prepares an environment for just the test-suite. :smile:
Yup, I can see how this is confusing, and that's one of the reasons I want to simplify this, by only having one way of creating the image.
Nevertheless, the idea of a mktree.distro script was to bootstrap an image with only the runtime and test deps, since the build deps are irrelevant to the test-suite. That's why you needed to add dbus-libs explicitly to the DNF command line there.
The Dockerfile (and mktree.podman) only existed as a universally portable variant of the above which also builds RPM in a container. And, as an optimization, it reuses the same image for the testing too, instead of setting up a separate tree with mktree.fedora afterwards, because the Dockerfile is based on Fedora and thus already contains the runtime deps for RPM :smile:
> One could also look at that from the point of separating rpm's build-requires and test-requires, but it doesn't seem meaningful. We'll want to be able to do stuff like use cmake to compile something against rpm in the test-suite (but that's yet another story). The test-environment equaling the build-environment seems the path of least maintenance. As with the current CI environment, AIUI. OTOH I could also be utterly confused :zany_face: :smile:
Confused perhaps (can't blame you, lol), but also on point. Indeed, our goal is to equal the build and test environment in terms of the underlying libraries being ABI compatible (i.e. you build for library 1.X.Y and you run with library 1.X.Y), but not necessarily in terms of the actual root filesystem.
The simplest way to do this is `sudo make install` and then `sudo ./rpmtests`. But we don't want to force the poor developer to dump their RPM snapshot into their workstation system (and to risk getting their files purged by a forgotten `rm -rf` somewhere in the test-suite), so we separate the test environment through containerization :smile:
Slightly off-topic:
One way to avoid the image management altogether would be to simply reuse the root filesystem on the host as the lowerdir in OverlayFS terms (and layer an RPM install on top). However, for that, one needs real root privileges. Plus, we wouldn't have full control over that test environment since there could be random stuff installed on the host and possibly interfere with the tests and/or cause weird failures.
Another way would be to reflink the necessary libraries from the host. That would be limited by the filesystem in use; e.g. btrfs and XFS support reflinks but the ext family doesn't. Also, we would have to somehow obtain the files to reflink, which could possibly be done with something like `rpm -ql libfoo`; however, that still would not guarantee a 1:1 copy of the given package, since there are scriptlets which wouldn't be run. If we added a feature to RPM to "clone" or "rebase" a package to a different root filesystem, that could be used :smile: But that's really not something to worry about now.
EDIT: Of course, we could always fall back to a plain `cp` on non-reflinkable filesystems, which in my earlier experiments is still faster than DNF/Dockerfile builds. But again, it's not as straightforward, so I didn't explore that further :smile:
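As a tiny illustration of that fallback (throwaway paths; `--reflink=auto` is the GNU `cp` option that does a CoW clone on btrfs/XFS and silently degrades to a regular byte copy on e.g. ext4):

```shell
# cp --reflink=auto: CoW clone where the filesystem supports it,
# plain copy elsewhere -- either way an independent, identical file.
src=$(mktemp)
dst=$(mktemp -u)                 # path only; cp creates the file
echo "pretend this is libfoo.so" > "$src"
cp --reflink=auto "$src" "$dst"
cmp -s "$src" "$dst" && echo "copies match"
```

Either way the result is a full copy, so the calling code doesn't need to care which strategy the filesystem picked.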
> We'll want to be able to do stuff like use cmake to compile something against rpm in the test-suite (but that's yet another story).
Oh, this is actually a very good point. It's not something I had in mind previously, but it's yet another reason to just reuse the same Dockerfile image for both building and testing, indeed :smile: (which is in line with the planned solution of this ticket).
> Edit: I remember now (after re-reading some of the commentary in this ticket): the mktree.fedora case was needed to allow testing the locally built binary against that distro version, whereas the CI case is different. That was how and why it was so special. I'll just stock up on some popcorn and sit back to see what happens in this space :grin:
Ack, yes, that's the thing. We want both portability and integration with a local build for native development.
And yup, popcorn is in order, indeed :laughing:
Oh, Ubuntu 22.04 LTS (our lowest common denominator) ships Podman in the official repos. Nice. We can then drop Docker support altogether and just use the same backend for local and CI purposes. Most likely (still need to verify that the Podman version in Ubuntu works the same way).
Oh, that's a nice bonus. Assuming of course it actually works :smile:
Keep in mind, we need a way to run it directly on the host, because all this fanciness you're talking about doesn't exist on non-Linux platforms. In particular, I would like to still be able to run the test suite for RPM on macOS.
I will also point out there are openSUSE containers that use DNF too.
> Keep in mind, we need a way to run it directly on the host, because all this fanciness you're talking about doesn't exist on non-Linux platforms. In particular, I would like to be able to run the test suite for RPM on macOS still.
In essence, what the test-suite needs is a filesystem tree that contains an installation of RPM to test and its runtime dependencies (i.e. a `make install` as root in a development VM would do just fine), and a way to quickly make writable copies (e.g. copy-on-write snapshots) of that tree. Then, a plain chroot into those snapshots could be used to isolate the individual tests.

None of this is inherently Linux-specific; however, to make it reasonably fast and efficient, you need something like OverlayFS, which is currently only available on Linux (and reportedly on some BSDs through fuse-overlayfs, but that's likely slower than a native kernel implementation).
And then, instead of chroot, we want to (and do) make use of containers in our tests, now that we've freed ourselves from fakechroot. They offer many more possibilities, including proper uid/gid mapping for file ownership testing, unsharing more than just the filesystem namespace, etc.
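To make that concrete, the per-test setup amounts to something like the following dry-run sketch (it only prints the mount and container commands, since those need privileges or user namespaces; the paths and options are illustrative, and the real logic lives in `snapshot()` in `atlocal.in`):

```shell
# Sketch: one writable, throwaway view of the pristine test tree.
TREE=/srv/rpmtest/tree           # pristine RPM install (lowerdir); illustrative
WORK=$(mktemp -d)                # per-test scratch space
mkdir -p "$WORK/upper" "$WORK/work" "$WORK/merged"

# CoW snapshot: changes land in upper/, $TREE itself stays pristine.
# (echo only: a real overlay mount needs root or a user namespace)
echo mount -t overlay overlay \
    -o "lowerdir=$TREE,upperdir=$WORK/upper,workdir=$WORK/work" \
    "$WORK/merged"

# Run the test command in a container on top of the snapshot
# (bwrap rather than plain chroot, for uid/gid mapping and more).
echo bwrap --bind "$WORK/merged" / --unshare-all --uid 0 --gid 0 rpm --version
```

Each test gets its own `upper`/`merged` pair, which is what keeps the tests from affecting each other.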
One way to make the test-suite POSIX compatible would be to identify tests that require a proper container and disable those on platforms that don't have containers. There, chroot could be used instead. But I'm not sure if we really want to go in that direction.
Even then, while chroot is available on most UNIX platforms, OverlayFS-like functionality is not, or varies greatly among the systems (APFS on macOS comes close perhaps) and would need special handling. This really is not something we (the core development team at Red Hat) have the resources and/or expertise to cover. We rely on the community to do this work.
One thing that's not mandatory, though, is rootless operation. We can always just assume that the test-suite is run in a development/throwaway VM. That's one less thing to worry about when it comes to supporting non-Linux systems.
One more thing to add to the above: This ticket is about using OCI images for the test root creation, on Linux distributions. Any kind of non-Linux work would need to be tracked in a separate ticket.
> None of this is inherently Linux specific, however to make it reasonably fast and efficient, you need something like OverlayFS which is currently only available on Linux (and https://github.com/rpm-software-management/rpm/pull/2559#issuecomment-1633335921 on some BSDs through fuse-overlayfs but that's likely slower than a native kernel implementation).

FUSE exists on macOS too, though I don't know if `fuse-overlayfs` works with it.
> I will also point out there are openSUSE containers that use DNF too.
Interesting, thanks for sharing. I don't think it changes anything discussed here so far, though :smile:
> > None of this is inherently Linux specific, however to make it reasonably fast and efficient, you need something like OverlayFS which is currently only available on Linux (and #2559 (comment) on some BSDs through fuse-overlayfs but that's likely slower than a native kernel implementation).
>
> FUSE exists on macOS too, though I don't know if `fuse-overlayfs` works with it.
Oh, nice. Also, see: https://github.com/containers/fuse-overlayfs/issues/140
But again, not something I'm going to invest time into. Anyone interested, feel free to investigate and/or submit patches ;)
Well, apparently this is now a thing: https://macoscontainers.org/
Yep, thanks. I noticed this too on Hacker News yesterday and was almost going to post the same here :smile:
@dmnks Next time shout at me about what's missing! I would have been happy to discuss improvements to mkosi to make it work for your use case. (I would have commented but had no clue you were considering it)
Oh, I'd missed your comment @DaanDeMeyer, sorry!
Yep, I indeed did consider mkosi initially and even had a branch where I played with it for a couple of weeks. However, later I realized the philosophy of mkosi wasn't completely in line with our requirements (and that's OK!), which were:
- Reuse the local, native CMake build directory and install those artifacts into the target image, instead of doing a separate build on the side (like `mkosi build` does). Basically, we want the test-suite to exercise the user's build directory (that they'd have anyway) rather than keeping an additional one on the side. Looking back, this isn't really super critical and actually has some drawbacks (mentioned in some tickets that I've opened here since then, heh), but that was the "happy path" workflow before moving away from fakechroot, so I wanted to keep it as much as possible.
- Allow for cross-building images (e.g. build a Fedora image on an Ubuntu host). This is of course the core feature of mkosi, however it requires a reasonably recent OS version (and/or mkosi) on the host, as well as ~a reasonably recent package manager of the OS you're targeting~ the right package selection for your project's dependencies. This is a hard requirement for us because we need to run the test-suite in a CI environment where we can't control the host/VM OS selection. Currently, we use GitHub Actions which only has Ubuntu 22.04 LTS and, IIRC, I had issues with the version of mkosi and the (ancient, unmaintained) RPM stack available there. And building a target image based on Ubuntu wouldn't work for us either because Ubuntu isn't RPM's main target platform and thus doesn't have the latest dependencies.
- Rootless image building. I know this has already been supported in mkosi for a while, but it wasn't when I was considering it (so it was another factor).
So it eventually turned out that using OCI images was the best solution for us. The tooling is ubiquitous and already preinstalled on the Ubuntu VMs in the CI (as well as typical developer systems).
In retrospect, though, I would've talked to you, indeed, if just to understand the whole landscape where mkosi operates better. There certainly are nice features in mkosi that we could use, OTOH right now we're happy with the OCI setup and wouldn't gain much (if anything) by switching.
> ... however it requires a reasonably recent OS version (and/or mkosi) on the host, as well as a reasonably recent package manager of the OS you're targeting.
Oops, I got this wrong:
Mkosi doesn't require the target OS's package manager. It uses the native one, whichever it is on the host you're building the image on. The issue I wanted to get across was that the package selection is still tied to the native package manager (e.g. APT on Ubuntu).
Argh... :facepalm: :smile: Re-reading mkosi's man page again, it of course supports multiple `Distribution=` values... meaning that, if that distro's package manager is available on your host, it'll be used. Either way, the point still kinda holds :smile:
There have indeed been quite a few improvements to mkosi lately so I understand that it might not have been suitable when you were working on this.
Note that when it comes to cross building, we have CI in mkosi that verifies that mkosi can do cross-distribution image builds for all supported distributions. The only combos that aren't directly supported are building Arch/Ubuntu/Debian images from OpenSUSE, as they don't package apt or pacman. However, mkosi also supports so-called tools trees, where it first builds an image with all the necessary tools and then uses that image to build the actual image. So on OpenSUSE, you can build a Fedora tools tree and use that to build all other supported distributions. This is what we use in the systemd repository to build images on the GitHub Actions CI runners. We simply configure `ToolsTree=default` and `ToolsTreeDistribution=fedora`, and mkosi will first build a Fedora image with all the latest and greatest tooling and then use that to build the actual image. Of course you still need the package manager on the host system to build the tools tree, but mkosi's CI makes sure that we're notified whenever something breaks in that area.
Re-using the local build directory might indeed be somewhat more difficult, since you would need to trick CMake into not wanting to reconfigure and rebuild when installing. But there's also no guarantee that the local build directory would actually work unless the dependencies are roughly the same as on the host.
Anyway, feel free to email me if you ever feel the itch to switch to mkosi.
Background

The `rpmtests` script, once built, is designed to be run as root and exercises the RPM installation in the root filesystem. When an individual test requires write access to the root filesystem (e.g. to install some packages), a writable snapshot is created with OverlayFS and a lightweight, fast container is spawned with Bubblewrap on top of it, running RPM or some other tool. This is to prevent the individual tests from affecting each other. All this logic is wrapped in the shell function `snapshot()` in `atlocal.in`.

When hacking on RPM, it's typically not desired to install the build artifacts into the native root filesystem, though, so `make check` wraps the `rpmtests` script in a container of its own. This container runs on top of an OS image that mirrors the host and contains only the necessary runtime and test dependencies, in the versions matching the development headers used during the build. RPM is `make install`-ed on top of that filesystem to produce the final image. The parent container with `rpmtests` is typically spawned using the snapshot functionality built into the test-suite, with the lower layer being the image directory instead of `/`.

To build such an image, a host-native `mktree` script ("backend") is invoked by `make check`. It is supposed to use whatever the native package management tooling on that platform (Linux distribution) is, e.g. `dnf --installroot` on Fedora or the `zypper` equivalent on OpenSUSE. This is ideally done as an unprivileged user, with the use of Linux `namespaces(7)`.

Currently, we only include `mktree.fedora`, but the idea was to gradually add more such backends for the platforms where RPM is typically built and developed, at least for OpenSUSE and perhaps (experimentally) for Debian/Ubuntu too.

Problem
It turns out that, while this is quite easy to do on Fedora with DNF and `unshare(1)`, it is not as easy on other distros that I've tried, namely OpenSUSE, where Zypper doesn't seem to like being run through `unshare(1)` and would likely require `sudo` instead. The same I've observed with Debian and `debootstrap`. While not a dealbreaker per se, we should really try to avoid `sudo` for something as trivial as a `make check`.

Another drawback of this approach is that we suddenly find ourselves in the business of maintaining distro-specific scripts where each needs its own set of tricks and workarounds, such as having to inject a bunch of RPM macros into DNF to make it behave properly for our purposes. This does not scale well and just makes our life harder in the long run as the packaging stacks evolve.
In fact, mkosi does almost what we need as it abstracts all these distro-specific details away from the user. In the latest version, it even works without root privileges and apparently can now run the build script natively, meaning that the local build directory produced by CMake could possibly be used.
However, there are still some other limitations preventing us from considering `mkosi`, such as the inability to run within a container, thus making it less portable. That is something we need for our CI environment, where we typically have a limited choice of operating systems (Ubuntu 22.04 LTS in GitHub Actions currently) and thus may or may not have all the build and runtime dependencies for the latest development snapshot of RPM available in the official distribution repositories.

Also, the core philosophy of `mkosi` is more like "wrap the build system in a container and act as the primary interface for building/testing", whereas we'd prefer it the other way around, i.e. "seamlessly integrate into our existing build system and reuse the local build artifacts".

Hence, `mktree` was born, as a poor man's version of `mkosi` tailored to our use case, almost dependency-free, with some shamelessly stolen ideas and naming conventions from `mkosi`, but having the issues mentioned.

Solution
As it happens, there already is a standardized way of distributing container images, namely OCI, or more commonly known as Podman and Docker images. The public registries contain a lot of different Linux distributions, certainly those that we care about. And most developer workstations likely have Podman or Docker installed already.
In fact, we already use these through `mktree.podman` and our `Dockerfile`. This backend currently acts as a fallback for non-Fedora distros and is our go-to backend in the CI. As opposed to `mktree.fedora`, `mktree.podman` uses Podman/Docker to also spawn the parent container from the image. This is obviously the most natural way once you're already using Podman/Docker to build the image.

The only downside of these premade images is that, in the case of Fedora or OpenSUSE, they, well, contain a stock installation of RPM already, and we would like to get rid of it before planting our own. Which is what we do in our `Dockerfile`, by simply self-destructing RPM with `rpm --nodeps -e rpm ...`. This is ugly as it creates an additional layer internally in Podman/Docker, but hey, it works. In the future, we could always publish our own, pristine OCI image for RPM testing (basically the one that we currently build with `mktree.fedora`), but that's really out of scope here :laughing:

So, why not just make `mktree.podman` the default backend everywhere and drop `mktree.fedora`? Well, it's currently not optimized for iterative `make check` use since it doesn't reuse the local build directory (instead, it builds RPM itself as part of the `Dockerfile` and thus in a container, much like `mkosi`). That means you get a new CMake build from scratch on each `make check`, and the previous layer is not even cleaned up properly, which results in the layer cache growing in size over time (yikes). It also prevents us from "hot swapping" the RPM installation due to the nature of OCI layering, something that's useful in `make shell` where you don't want your changes in the container to be dropped whenever rebuilding RPM (something that `mktree.fedora` supports because it manages the layers as plain directories).

Conveniently, Podman allows for mounting an image into a directory on the host, which means we can simply use that directory instead of bootstrapping our own with DNF/Zypper/etc. Thus, we could reuse the layering in `mktree.fedora` for the rest.

Docker sadly doesn't support mounting images, but it does support extracting them into a destination directory, so we can fall back to that for Docker, with the small CPU and disk space penalty involved (it's still much faster than using a package manager to download the metadata, then the packages, and finally install them).
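The two code paths could be sketched roughly like this (dry-run only, with made-up image and destination names; the commands are printed rather than executed, since they require Podman or Docker to be present):

```shell
# Sketch: materialize an OCI image as a plain directory tree.
IMAGE="localhost/rpm-test"       # illustrative image name
DEST="./tree"                    # illustrative destination directory

podman_tree() {
    # Podman mounts the image read-only and prints the mountpoint;
    # rootless Podman needs this wrapped in 'podman unshare'.
    echo podman image mount "$IMAGE"
}

docker_tree() {
    # Docker can't mount images, but a stopped container created from
    # one can be exported as a tar stream and unpacked into $DEST.
    echo "cid=\$(docker create $IMAGE)"
    echo "docker export \$cid | tar -C $DEST -x"
    echo "docker rm \$cid"
}

podman_tree
docker_tree
```

The Docker path pays for the extraction in CPU and disk space, but both paths end with an ordinary directory that the existing layering logic can use as a lower layer.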
Summary
So, to summarize, the goal of this ticket is to merge `mktree.fedora` and `mktree.podman` such that a host-specific `Dockerfile.<OSNAME>` is used to build the image, instead of a `mktree.<OSNAME>` script that does that from scratch. It also has to support reusing the local build artifacts for local development.

This way, we'll delegate the image creation process to the appropriate tooling where it really belongs. We'll only need to keep a (bunch of) `Dockerfile`s which are well-defined and easy to maintain.

This change isn't targeting 4.19 since it's too late for that, and it's also not something that's relevant in the distributed tarball; it's more of a feature/optimization for developers working off of a git checkout, as well as a simplification of the whole test-suite architecture.