ublue-os / bluefin

The next generation Linux workstation, designed for reliability, performance, and sustainability.
https://projectbluefin.io
Apache License 2.0
838 stars, 129 forks

Bluefin with Surface kernel fails to update due to OSTree error "Multiple kernels found" #1357

Open SvdB-nonp opened 4 weeks ago

SvdB-nonp commented 4 weeks ago

Describe the bug

When I start the "System Update" application, the update fails with the error message: error: Committing: Multiple kernels found in /usr/lib/modules

Terminal output during the failing update:

── 13:10:27 - System update ────────────────────────────────────────────────────
note: automatic updates (stage) are enabled
Pulling manifest: ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-surface:gts
Checking out tree 052041c... done
Writing OSTree commit... done
error: Committing: Multiple kernels found in /usr/lib/modules
System update failed: 
   0: Command failed: `/usr/bin/rpm-ostree upgrade`
   1: `/usr/bin/rpm-ostree` failed: exit status: 1

Location:
   src/steps/os/linux.rs:273
Retry? (y)es/(N)o/(s)hell/(q)uit

The directory named in the error message, /usr/lib/modules, contains the following:

ls -lart /usr/lib/modules
total 0
drwxr-xr-x. 1 root root 770  1 jan  1970 6.8.8-1.surface.fc39.x86_64
drwxr-xr-x. 1 root root 834  1 jan  1970 ..
drwxr-xr-x. 1 root root  54  1 jan  1970 .
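
Note that this listing comes from the currently booted deployment, whereas the error is raised while committing the newly pulled image, so a single directory here does not rule out a second kernel in the new image. A minimal sketch for counting kernels in any modules directory follows. It assumes that a kernel is identified by a vmlinuz file inside an immediate subdirectory; count_kernels is a hypothetical helper name, and the exact check that rpm-ostree performs may differ.

```shell
# Count the immediate subdirectories of a modules directory that hold a
# kernel image. Assumption: "kernel" means a subdirectory with a vmlinuz file.
# The function body runs in a subshell so its variables stay local.
count_kernels() (
    modules_dir="${1:-/usr/lib/modules}"
    count=0
    for dir in "$modules_dir"/*/; do
        # An unmatched glob stays literal, so test for the file itself.
        if [ -f "${dir}vmlinuz" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
)
```

Running count_kernels with no argument inspects /usr/lib/modules of the booted system; passing the path of an unpacked image tree inspects that tree instead. Any result above 1 would be consistent with the "Multiple kernels found" error.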

What did you expect to happen?

The update finishes successfully without errors.

Output of rpm-ostree status

State: idle
AutomaticUpdates: stage; rpm-ostreed-automatic.timer: no runs since boot
Deployments:
● ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-surface:gts
                   Digest: sha256:1306f0d05c5ce1a6b8be69793a22493cc89f3c285597a94b6fd9643ec5eb0639
                  Version: 39.20240525.0 (2024-05-25T16:49:38Z)
             InitramfsEtc: /etc/crypttab /etc/modules-load.d/ublue-surface.conf

  ostree-image-signed:docker://ghcr.io/ublue-os/bluefin-surface:gts
                   Digest: sha256:3eb6f5ff8e913046544f8f6fdf78b2ec4f0d80ebedf99ef019e1229e18a24d50
                  Version: 39.20240524.0 (2024-05-24T16:49:28Z)
             InitramfsEtc: /etc/crypttab /etc/modules-load.d/ublue-surface.conf

Output of groups

stefanvandenberg wheel

anchiornis commented 4 weeks ago

This issue also happens with the Surface version of Aurora using the Surface Pro 7+. It has the same error message and also only one folder in /usr/lib/modules.

When I rebase to a specific version (e.g. rpm-ostree rebase ostree-image-signed:docker://ghcr.io/ublue-os/aurora-surface:40-20240525), the error does not appear for any version I tested up to version 40-20240525. For versions after that, the error appears again.
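
For anyone who wants to try the same kind of pinning, here is a small sketch that builds the image reference for a given image name and dated tag. image_ref is a hypothetical helper; the reference format is copied from the rebase command quoted in this comment, and which tags actually exist has to be checked in the registry.

```shell
# Build a signed image reference for rpm-ostree rebase.
# $1: image name (e.g. aurora-surface), $2: tag (e.g. 40-20240525)
image_ref() {
    printf 'ostree-image-signed:docker://ghcr.io/ublue-os/%s:%s\n' "$1" "$2"
}
```

With that defined, rpm-ostree rebase "$(image_ref aurora-surface 40-20240525)" is equivalent to the command quoted here.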

vibuz commented 4 weeks ago

The Surface images are broken at the moment. The build process, for reasons unknown to me, fails to remove the stock kernel. That leaves multiple kernels in /usr/lib/modules, and rpm-ostree then refuses to use the image. This is probably due to some upstream change.

anchiornis commented 4 weeks ago

Thank you for your reply. Do you by any chance know (or can you make an educated guess) how long this problem might persist, and whether there is anything Surface users could do locally to be able to update? I'm just wondering whether I should temporarily switch to something I can update, try to fix this on my computer, or simply wait a week or two.

castrojo commented 4 weeks ago

Not sure what's going on here. It's not something obvious, and we haven't encountered it in the entire time we've been making this, so we're hoping to reach out to as many people as we can to take a look.

SvdB-nonp commented 4 weeks ago

For some reason the update succeeded just now, so it may indeed have been an upstream bug that was fixed in the meantime. The kernel version listed in /usr/lib/modules looks the same as before.

I did not save the actual update log, but this is the new ujust device-info output: https://paste.centos.org/view/37415865

vibuz commented 4 weeks ago

I've gone through (aurora|bluefin)-surface:(latest|gts) and can confirm that the problem is no longer present on any of these images.
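
A rough sketch of how such a check could be repeated is below. It expands the pattern above into the four image names and lists /usr/lib/modules inside one image. The helper names and the CONTAINER_RUNTIME override are hypothetical, and the sketch assumes podman is installed and that the images run a plain ls without extra setup; it is not necessarily how the images were checked here.

```shell
# Expand (aurora|bluefin)-surface:(latest|gts) into four image names.
# The subshell body keeps the loop variables local.
surface_images() (
    for name in aurora bluefin; do
        for tag in latest gts; do
            printf 'ghcr.io/ublue-os/%s-surface:%s\n' "$name" "$tag"
        done
    done
)

# List the kernel directories shipped in one image.
# CONTAINER_RUNTIME is a hypothetical override; it defaults to podman.
list_image_kernels() {
    "${CONTAINER_RUNTIME:-podman}" run --rm "$1" ls -1 /usr/lib/modules
}
```

A loop such as for image in $(surface_images); do list_image_kernels "$image"; done would then print the kernel directories per image. Each image is a large download, so this is slow on a first run.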

castrojo commented 4 weeks ago

Whew. 😄

Let's keep this open as a signpost. As we grow, it'd be great to have technical folks more involved with linux-surface so we can help out.