coreos / rpm-ostree

⚛📦 Hybrid image/package system with atomic upgrades and package layering
https://coreos.github.io/rpm-ostree

RFE: persistent akmod cache #4500

Open akvadrako opened 1 year ago

akvadrako commented 1 year ago

I would like to propose a persistent akmod cache for rpm-ostree. Without it, after installing akmod-nvidia, every modification to a deployment takes a few minutes.

On a normal Fedora install, the kmod packages built by akmod are cached in /var/cache/akmods. As long as the kernel and package versions match, reinstalls are quick. But on Silverblue, every modification to a deployment starts from scratch, so the kmod has to be rebuilt each time.
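
To illustrate the kind of lookup that makes reinstalls quick when such a cache exists, here is a minimal sketch. It assumes cached packages live under `<cache_dir>/<module>/` and carry the kernel release in their file name (roughly `kmod-<module>-<kernel>-<version>.rpm`); that layout is an assumption for illustration and should be checked against the real akmods behaviour. `find_cached_kmod` is a hypothetical helper, not part of akmods or rpm-ostree.

```shell
# Hypothetical helper: print cached kmod RPMs for a module that were built
# for a given kernel release. Directory layout and file naming are
# assumptions, not taken from the akmods sources.
find_cached_kmod() {
    fck_cache_dir=$1   # e.g. /var/cache/akmods
    fck_module=$2      # e.g. nvidia
    fck_kernel=$3      # e.g. the output of `uname -r`
    fck_found=1
    for fck_rpm in "$fck_cache_dir/$fck_module"/kmod-"$fck_module"-"$fck_kernel"-*.rpm; do
        # An unmatched glob stays literal, so only report files that exist.
        if [ -e "$fck_rpm" ]; then
            printf '%s\n' "$fck_rpm"
            fck_found=0
        fi
    done
    # Exit status 0 means a cached build was found, 1 means a rebuild is needed.
    return "$fck_found"
}
```

A call such as `find_cached_kmod /var/cache/akmods nvidia "$(uname -r)"` would then tell whether a rebuild could be skipped for the running kernel.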

Host system details

State: idle
Deployments:
  fedora:fedora/38/x86_64/silverblue
                  Version: 38.20230714.0 (2023-07-14T01:48:19Z)
               BaseCommit: d49f502d3d7e74bebdd0832f64192ae4ad71965d0ba17ff5cf6ca24deee87940
             GPGSignature: Valid signature by 6A51BBABBA3D5467B6171221809A8D7CEB10B464
                     Diff: 73 upgraded, 36 removed, 83 added
      RemovedBasePackages: firefox firefox-langpacks 115.0-2.fc38 gnome-tour 44.0-1.fc38
          LayeredPackages: akmod-v4l2loopback alacritty distrobox foomatic gpm intel-media-driver iwd langpacks-en libva-utils nvidia-vaapi-driver snapper
                           strace supergfxctl tlp tlp-rdw v4l2loopback vdpauinfo xorg-x11-drv-nvidia-cuda xorg-x11-drv-nvidia-power
            LocalPackages: rpmfusion-free-release-38-1.noarch rpmfusion-nonfree-release-38-1.noarch

Expected vs actual behavior

In the system log:

Jul 12 12:55:20 orac rpm-ostree[627658]: Executed %post for akmod-nvidia in 143420 ms

Expected:

Jul 12 12:55:20 orac rpm-ostree[627658]: Executed %post for akmod-nvidia in 420 ms
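
To compare these timings across transactions, the `%post` durations can be pulled out of the journal. This is a minimal sketch that only relies on the message format shown above; `post_durations` is a made-up helper name.

```shell
# Read journal lines on stdin and print "<package> <milliseconds>" for every
# "Executed %post for ... in ... ms" message, slowest first.
post_durations() {
    sed -n 's/.*Executed %post for \([^ ]*\) in \([0-9][0-9]*\) ms.*/\1 \2/p' |
        sort -k2,2 -n -r
}
```

Assuming the messages carry the `rpm-ostree` syslog identifier, as in the excerpt above, `journalctl -t rpm-ostree | post_durations` should list the slowest scriptlets first.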

Steps to reproduce it

  1. rpm-ostree install akmod-nvidia – this takes a few minutes, which is normal
  2. rpm-ostree install gpm (or any other package) – this again takes a few minutes, which is not normal

Would you like to work on the issue?

I would be willing to if the PR would be accepted.

Related discussion: https://discussion.fedoraproject.org/t/what-about-an-akmod-cache-for-silverblue/85664

cgwalters commented 1 year ago

> On a normal Fedora install

s/normal/dnf-based/ e.g.

Saying "normal Fedora" implies what we're doing is not-normal, which is slightly pejorative. So please avoid that term.

> the kmod packages built by akmod are cached in /var/cache/akmods

Part of the big investment we've done in rpm-ostree is to avoid "hysteresis"; silent hidden state. Every single change starts by constructing a new filesystem tree.

We do cache the package filesystem trees.

Certainly I could imagine just exposing /var/cache/akmod persistently. But the thing is that rpm-ostree's caching is heavily integrated with its "transactional deployment" model. If we just fork off akmod, we don't have that integration.

Bigger picture, I do think actually what people want for situations like yours (looking at how you've removed some packages and layered a not small set of others) is to do the container-native flow instead.

Anyways, I could imagine exposing some sort of config option like

echo PersistentPostMounts=/var/cache/akmod >> /etc/rpm-ostreed.conf

if someone showed up to do a patch, but again IMO the major focus for the development team now on this project is the container flow.
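
For anyone picking this up, here is a minimal sketch of setting such a key. `PersistentPostMounts` is only the hypothetical option floated above and is not implemented in rpm-ostree; `set_daemon_key` is likewise a made-up helper. The sketch assumes the option would belong in the `[Daemon]` section of /etc/rpm-ostreed.conf, alongside existing keys such as `AutomaticUpdatePolicy`, which a bare `>>` append does not guarantee if another section comes last in the file.

```shell
# Hypothetical helper: set KEY=VALUE inside the [Daemon] section of an
# INI-style file, replacing an existing assignment or adding a new one.
set_daemon_key() {
    sdk_conf=$1 sdk_key=$2 sdk_value=$3
    sdk_tmp=$(mktemp) || return 1
    if awk -v key="$sdk_key" -v value="$sdk_value" '
        # Append the key when leaving [Daemon] without having written it.
        function emit_key() {
            if (in_daemon && !done) { print key "=" value; done = 1 }
        }
        /^\[/ { emit_key(); in_daemon = ($0 == "[Daemon]") }
        # Replace an existing assignment of the same key inside [Daemon].
        in_daemon && index($0, key "=") == 1 { print key "=" value; done = 1; next }
        { print }
        # Create the section if the file never had one.
        END { emit_key(); if (!done) { print "[Daemon]"; print key "=" value } }
    ' "$sdk_conf" > "$sdk_tmp"; then
        cat "$sdk_tmp" > "$sdk_conf"
        sdk_status=$?
    else
        sdk_status=1
    fi
    rm -f "$sdk_tmp"
    return "$sdk_status"
}
```

Run as root, `set_daemon_key /etc/rpm-ostreed.conf PersistentPostMounts /var/cache/akmod` would write the key once and leave the file unchanged on repeated runs.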

akvadrako commented 1 year ago

> We do cache the package filesystem trees.
>
> Certainly I could imagine just exposing /var/cache/akmod persistently. But the thing is that rpm-ostree's caching is heavily integrated with its "transactional deployment" model. If we just fork off akmod, we don't have that integration.

In this case, I do just want to cache the filesystem tree, but after the post-install script. That would be a more general solution that isn't akmod specific and doesn't introduce hidden state.

> Bigger picture, I do think actually what people want for situations like yours (looking at how you've removed some packages and layered a not small set of others) is to do the container-native flow instead.

As I understand it, the advantage of the container flow is that one can introduce intermediate steps between rpm-ostree compose tree and deploy, where I could mount /var/cache/akmod during an rpm-ostree install akmod-nvidia. That makes sense.

> echo PersistentPostMounts=/var/cache/akmod >> /etc/rpm-ostreed.conf
>
> if someone showed up to do a patch, but again IMO the major focus for the development team now on this project is the container flow.

So does that mean the container flow will replace the non-container flow, or is there a reason to stick with the current method? I wouldn't mind creating a patch if the non-container flow will stay as the default option.