fedora-silverblue / issue-tracker

Fedora Silverblue issue tracker
https://fedoraproject.org/atomic-desktops/silverblue/

Boot fails with "vmlinuz has invalid signature" or "bad shim signature, you need to load the kernel first" #543

Open Mershl opened 3 months ago

Mershl commented 3 months ago

Current workaround

See https://github.com/fedora-silverblue/issue-tracker/issues/543#issuecomment-2048350047


Original issue text

Describe the bug

Trying to rebase an existing SB39 to SB40 fails to boot on a Thinkpad T495, showing "vmlinuz-6.8.1-300.fc40.x86_64 has invalid signature. you need to load the kernel first". I was not able to reproduce the issue on my desktop systems.

coming from deployment:

fedora:fedora/39/x86_64/silverblue
                  Version: 39.20240328.0 (2024-03-28T00:39:32Z)
               BaseCommit: a1ac8885d05ca13728d92c2bcf1ded67a7ddb409d657e446a808397366a463b1
             GPGSignature: Valid signature by E8F23996F23218640CB44CBE75CF5AC418B8E74C

going for:

fedora:fedora/40/x86_64/silverblue
                  Version: 40.20240328.n.0 (2024-03-28T08:09:28Z)
               BaseCommit: f726d0be3361a42a8ac175b08851de67a2d97c9c01ee130a3abcd32720120f9c
             GPGSignature: Valid signature by 115DF9AEF857853EE8445D0A0727707EA15B79CC

What I've tried so far

To Reproduce

  1. rpm-ostree rebase fedora:fedora/40/x86_64/silverblue
  2. systemctl reboot

travier commented 3 months ago

I've hit the same issue and I've disabled Secure Boot for now.

It's likely due to the bootloader being too old. bootupd should land in the image after the release and will then let people update their bootloader.

Or maybe we should include bootupd in Fedora 39 now so that people can update their bootloader before upgrading to Fedora 40.

Mershl commented 2 months ago

Oh Oh.

SB39 (stable) just updated (no rebase) to kernel 6.8.4 showing the very same issue.

  kernel 6.7.11-200.fc39.x86_64 -> 6.8.4-200.fc39.x86_64

breaks boot, showing

vmlinuz-6.8.4-200.fc39.x86_64 has invalid signature.
you need to load the kernel first.

(Thinkpad T495 using fedora:fedora/39/x86_64/silverblue)

HassoSigbjoernson commented 2 months ago

Or maybe we should include bootupd in Fedora 39 now so that people can update their bootloader before upgrading to Fedora 40.

And people have to be told what to do. I got a "bad shim signature. you need to load the kernel first" error while updating my SB39 system (no rebase to 40) and was able to fix it by copying files from /sysroot/ostree/deploy/fedora/deploy/[...]/usr/lib/ostree-boot/efi/EFI/fedora to /boot/efi/EFI/fedora. However, I was only able to do this because I had already encountered this problem when it affected arm64 systems a year ago.

travier commented 2 months ago

Pinging Universal Blue folks as it's likely going to impact them soon: @castrojo @noelmiller @EyeCantCU @KyleGospo

travier commented 2 months ago

Unfortunately, now that the kernel has landed in an update, we cannot add bootupd to a future F39 build, as you would not be able to boot that build to get the fix.

Overlaying bootupd via rpm-ostree install is not enough as it needs to run a command during the commit build process.

So we have https://github.com/coreos/bootupd/issues/635 as an option (that I have not tested yet) to create container images with bootupd based on an older commit with the previous kernel.

Or we document how to manually fix this until bootupd finally lands in Atomic Desktops. Unfortunately it will not fully be in Fedora 40 by default yet. See: https://gitlab.com/fedora/ostree/sig/-/issues/1

hferreiro commented 2 months ago

Can't the update to the kernel package be reverted in the same commit that bootupd is added?

travier commented 2 months ago

No unfortunately, for Fedora Atomic Desktops, we always take the latest RPMs from the repos.

jorgeml commented 2 months ago

@travier I'm watching this one with interest as my Silverblue image is now one week old. If I understand it correctly, the only solution at the moment is to disable Secure Boot? And not even upgrading to Fedora 40 will solve it in the near future?

travier commented 2 months ago

Unfortunately, the workaround for this issue is going to be a set of commands that manually do what bootupd does.

This is mostly a copy from /usr/lib/ostree-boot to /boot/efi on EFI systems, and a /usr/sbin/grub2-install call with the correct arguments on BIOS ones.

I have not tested any of this, so I'm not providing ready-made commands and everything is at your own risk. Please make backups and make sure that you are comfortable rescuing a broken system before trying things out.

Help with testing in VMs or on real hardware is welcomed.

I recommend disabling Secure Boot in the meantime until this is fixed.

travier commented 2 months ago

Warning: These instructions should be safe to follow, but proceed at your own risk and make backups first.

Here is the set of commands I've just used to update my (x86_64) EFI booted system successfully:

# Enter a root shell on the host (i.e. not in a toolbox)
$ sudo -i

# Make a backup of the content of the EFI partition
$ cd /boot/efi/
$ cp -a EFI EFI.bkp

# Copy updated bootloader versions
$ cp /usr/lib/ostree-boot/efi/EFI/BOOT/{BOOTIA32.EFI,BOOTX64.EFI,fbia32.efi,fbx64.efi} /boot/efi/EFI/BOOT/
$ cp /usr/lib/ostree-boot/efi/EFI/fedora/{BOOTIA32.CSV,BOOTX64.CSV,grubia32.efi,grubx64.efi,mmia32.efi,mmx64.efi,shim.efi,shimia32.efi,shimx64.efi} /boot/efi/EFI/fedora/

# Only needed if it exists already on your system
$ cp /usr/lib/ostree-boot/efi/EFI/fedora/shimx64.efi /boot/efi/EFI/fedora/shimx64-fedora.efi

# Sync changes to the disk
$ sync

# Reboot
$ systemctl reboot

Once reboot is successful, you can remove the backup copies:

# Enter a root shell on the host (i.e. not in a toolbox)
$ sudo -i

# Remove the backup of the EFI partition content
$ cd /boot/efi/
$ rm -ri ./EFI.bkp

# Sync changes to the disk
$ sync

Edit: Updated to add 32-bit EFI binaries as well.

For aarch64, update the filenames as needed.
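
Before copying anything, it can help to see which bootloader files on the EFI partition actually differ from the ones shipped in the current deployment. The following is a minimal, read-only sketch assuming a POSIX shell with `find` and `cmp`; the function name `list_outdated` is hypothetical and is not part of bootupd or any Fedora tool.

```shell
# List files under SRC whose counterpart under DST exists but differs.
# Files missing from DST are skipped, since not every system has them
# (for example the 32-bit binaries or the EFI/BOOT directory).
# Nothing is modified; this only reads and compares.
list_outdated() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        rel=${rel#./}
        if [ -f "$dst/$rel" ] && ! cmp -s "$src/$rel" "$dst/$rel"; then
            printf '%s\n' "$rel"
        fi
    done
}
```

On an affected system it could be called from a root shell as `list_outdated /usr/lib/ostree-boot/efi/EFI /boot/efi/EFI`. An empty result would suggest that the files already present on the EFI partition match the deployment.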

jorgeml commented 2 months ago

Thanks @travier ! Just did this on my machine and it worked. I was also able to reenable Secure Boot. As a positive side effect, it also looks like the UEFI dbx update was applied. That had been blocked for a while.

cgwalters commented 2 months ago

OK sorry yes, this is an embarrassing problem.

@travier and I had a chat, a few things here:

Alternatively, we can document how to run bootupd from a privileged container. (In fact, running it from the existing quay.io/fedora/fedora-coreos:stable container for example. One minor stumbling block here is that bootupd wants to act in a client/server model with systemd, but I think we can change it to not do that)

Longer term yes, the technical debt here is high and makes https://github.com/coreos/bootupd/issues/454 important to do and just turn on automatic updates, or at least automatic updates when they're needed.

Mershl commented 2 months ago

# Copy updated bootloader versions
$ cp /usr/lib/ostree-boot/efi/EFI/BOOT/{BOOTIA32.EFI,BOOTX64.EFI,fbia32.efi,fbx64.efi} /boot/efi/EFI/BOOT/
$ cp /usr/lib/ostree-boot/efi/EFI/fedora/{BOOTIA32.CSV,BOOTX64.CSV,grubia32.efi,grubx64.efi,mmia32.efi,mmx64.efi,shim.efi,shimia32.efi,shimx64.efi} /boot/efi/EFI/fedora/

# Only needed if it exists already on your system
$ cp /usr/lib/ostree-boot/efi/EFI/fedora/shimx64.efi /boot/efi/EFI/fedora/shimx64-fedora.efi

@travier I've noticed that the affected Thinkpad T495 was missing /boot/efi/EFI/BOOT completely. I skipped that line in your test steps and it worked out perfectly. SecureBoot reenabled, no regressions found so far.

Looking into other systems, I do see the mentioned /boot/efi/EFI/BOOT. Does anyone know why they differ? Does it depend on when the Fedora/Silverblue installation was created?

UPDATE: upgraded two other desktop systems successfully (SB39->SB40 with enabled SecureBoot) using the test steps.
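
The case described above, where a destination directory such as /boot/efi/EFI/BOOT does not exist, can be handled by guarding each copy. This is only a sketch under stated assumptions: `copy_into_existing` is a hypothetical helper name, and it deliberately does not create directories or files that were not already expected on the EFI partition.

```shell
# Copy the named files from SRC to DST, but only when DST already exists.
# Source files that are absent are silently skipped.
copy_into_existing() {
    src=$1
    dst=$2
    shift 2
    if [ ! -d "$dst" ]; then
        printf 'skipping %s (directory not present)\n' "$dst" >&2
        return 0
    fi
    for name in "$@"; do
        if [ -f "$src/$name" ]; then
            cp -- "$src/$name" "$dst/$name" || return 1
        fi
    done
    return 0
}
```

For example, `copy_into_existing /usr/lib/ostree-boot/efi/EFI/BOOT /boot/efi/EFI/BOOT BOOTIA32.EFI BOOTX64.EFI fbia32.efi fbx64.efi` would do nothing on a system without EFI/BOOT instead of failing.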

A6GibKm commented 2 months ago

I tested these steps on my workstation and my Thinkpad T480. I managed to get Secure Boot working on both devices.

On the desktop I had to use fwupdmgr to update the UEFI dbx.

rugk commented 2 months ago

I have the same issue, reported here (which also includes a kernel bug report, though I am unsure how helpful that is), and this question remains unanswered AFAIK:

And not even upgrading to Fedora 40 will solve it in the near future?

So does updating fix it? I mean I can boot into an old working Fedora 39 version. If I upgrade from there to Fedora 40, does that work? (Edit: tested, does not work)

Alternatively, if I need to disable Secure Boot temporarily, can I just do so, upgrade to Fedora 40, re-enable it, and boot successfully afterwards? (Edit: tested, does not work, I can only boot with SecureBoot disabled)

Also, would this possibly be a candidate for a common issue if it prevents upgrading to Fedora 40?

travier commented 2 months ago

It's a good idea to make it a common issue.

It's basically impossible to fix automatically right now for F39 users updating as the new kernel already landed in Fedora 39.

We'll try to get bootupd support to a sufficiently good shape to have a small set of commands to enter manually but in the meantime, the commands in https://github.com/fedora-silverblue/issue-tracker/issues/543#issuecomment-2048350047 are the best that we have.

rugk commented 2 months ago

Okay then, I have created a draft for this here: https://discussion.fedoraproject.org/t/booting-fails-with-vmlinuz-has-invalid-signature/114354

Unfortunately, I lack the technical details to document it properly, so please feel free to edit it.

rugk commented 2 months ago

FYI I can confirm the workaround posted worked:

# tree EFI
EFI
├── BOOT
│   ├── BOOTIA32.EFI
│   ├── BOOTX64.EFI
│   ├── fbia32.efi
│   └── fbx64.efi
└── fedora
    ├── BOOTIA32.CSV
    ├── BOOTX64.CSV
    ├── fonts
    ├── grub.cfg
    ├── grub.cfg.old
    ├── grubenv
    ├── grubenvLa7zjw
    ├── grubia32.efi
    ├── grubx64.efi
    ├── mmia32.efi
    ├── mmx64.efi
    ├── shim.efi
    ├── shimia32.efi
    ├── shimx64.efi
    └── shimx64-fedora.efi

4 directories, 18 files
# tree EFI.bkp/
EFI.bkp/
├── BOOT
│   ├── BOOTX64.EFI
│   └── fbx64.efi
└── fedora
    ├── BOOTX64.CSV
    ├── fonts
    ├── grub.cfg
    ├── grub.cfg.old
    ├── grubenv
    ├── grubenvLa7zjw
    ├── grubx64.efi
    ├── mmx64.efi
    ├── shim.efi
    ├── shimx64.efi
    └── shimx64-fedora.efi

fbruetting commented 2 months ago

Thereafter, I get the following:

After rpm-ostree update:

❯ rpm-ostree update
…
…
rpm-md repo 'updates-archive' (cached); generated: 2024-05-02T02:13:06Z solvables: 45817
Resolving dependencies... done
Applying 2 overrides and 201 overlays
Processing packages... done
Running pre scripts... done
Running post scripts... done
Running posttrans scripts... done
Writing rpmdb... done
Writing OSTree commit... done
Staging deployment... done
error: Staging deployment: Cleaning deployments: Removing ostree/deploy/fedora/deploy/3fd29d12b510aabcc32bd5ab5a6262b4d1a4da43efc7a70788c11837f1afa140.0: unlinkat(classic-topbar-network-cellular-acquiring.svg): Read-only file system

At second try:

❯ LANG=en_US.utf8; rpm-ostree update
error: Remounting /sysroot read-write: Invalid argument

rugk commented 2 months ago

@fbruetting seems like a different issue, OSTree updates should not have anything to do with the bootloader update. Maybe ask at the Fedora discussions forum for help?

mcejp commented 1 month ago

FWIW, I have run into this error on Fedora Workstation 39. The root cause was that in my /etc/fstab, the /boot/efi entry was commented out (for many years, apparently). The package manager thought it was updating the EFI binaries, but instead they were going into a subdirectory of /boot.
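
A quick way to rule out this cause is to confirm that /boot/efi is really a mount point before touching any bootloader files. The sketch below assumes a Linux system exposing /proc/self/mounts; `is_mount_point` is a hypothetical helper, and its optional second argument exists only so the function can be tried against a sample mounts table.

```shell
# Return 0 if TARGET appears as a mount point in the mounts table.
# The table defaults to /proc/self/mounts (second field = mount point).
is_mount_point() {
    target=$1
    table=${2:-/proc/self/mounts}
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
}
```

Usage could look like `is_mount_point /boot/efi || echo "/boot/efi is not mounted" >&2`. If the check fails, files copied to /boot/efi would land on the /boot filesystem rather than on the EFI System Partition.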

maksimsamt commented 2 weeks ago

My story: some time ago the upgrade from f39 to f40 (ostree) was successful, and it worked without problems for a while. Today I applied a kernel update from 6.8.11 to 6.9.4. After rebooting I got this error:

error: ../../grub-core/kern/efi/sb.c:182:bad shim signature
error: ../../grub-core/loader/i386/efi/linux.c:258:you need to load the kernel first

Secure boot enabled. Workaround https://github.com/fedora-silverblue/issue-tracker/issues/543#issuecomment-2048350047 works and now secure boot works with updated kernel 6.9.4

RobotRoss commented 2 weeks ago

Just encountered this on an existing F40 install, following the steps to manually upgrade shimx64.efi worked.

JAORMX commented 2 weeks ago

Just stumbled upon this on Fedora Linux 40.20240617.0 (Silverblue).

Can confirm, the workaround did the trick and am now booting back with SecureBoot enabled.

awarda-rh commented 2 weeks ago

Thanks, this also fixed the issue for me.

For context, see: https://bugzilla.redhat.com/show_bug.cgi?id=2150982 https://bugzilla.redhat.com/show_bug.cgi?id=2127995

benkei-kuruma commented 2 weeks ago

Same issue here with my five automatically updating Silverblue 40 machines. @travier's workaround fixed it, and I didn't have to disable Secure Boot. Thank you!

For what it's worth, this is actually the first time in ~2.5 years using Silverblue exclusively that I haven't been able to boot into my OS.

TXort commented 2 weeks ago

Same issue, broke on both laptop and PC.

The script mentioned above worked. Thanks!

@benkei-kuruma Interesting, I have been using Silverblue for a few months now and after this I am thinking of going back to Workstation since it never failed me.

benkei-kuruma commented 2 weeks ago

I hear you.

Honestly, this isn't a showstopper for me by any means. It sounds like they've already identified the boot updater issue as "severe" and are working hard to fix it. I was more giving them a compliment that an "experimental" OS like Silverblue has otherwise been so reliable for me. These devs are awesome.

To gush a bit, I absolutely adore Silverblue, it's hands down the best OS I've ever used. After years upon years of distrohopping, this is the longest I've been on one distro exclusively, to the point where I don't even read about Linux much anymore these days. To me, so many distros are basically the same thing in different wrapping paper, while Silverblue feels totally fresh and revolutionary. I love it. 🙂

aidzm commented 2 weeks ago

Seems like this issue has finally caught up to me with the update to Linux Kernel 6.9.4. Thanks to the ability to roll back, I was able to boot into my system. While the fact that I couldn't boot at all initially was a major showstopper, the ability to roll back saved the day 👍 Since I see a bright future for Atomic Desktops, I'm still going to stick with it (after all, I was half-expecting this issue to appear sooner or later). Hopefully, as a not-so-technical user, the workaround works (going to pin my current deployment first though, another major plus for Atomic Desktops haha).

Edit: It works! Thanks @travier :)

JeanLuX commented 2 weeks ago

Same issue with Fedora Kinoite 40.20240619.0 on my Asus G14 2022 with Secure Boot enabled.

No issue with 40.20240614.0

boydkelly commented 1 week ago

Updated this am to 40.20240621 and issue still persists.

awarda-rh commented 1 week ago

I don't expect this to be resolved with any 40.2024* update. Use the workaround from here to manually update the bootloader files.

boydkelly commented 1 week ago

I don't expect this to be resolved with any 40.2024* update. Use the workaround from here to manually update the bootloader files.

Really? An update that causes an unbootable (albeit you can roll back) system? I kinda thought that would be the highest priority?

travier commented 1 week ago

It's "impossible" to fix via an update.

aidzm commented 1 week ago

I don't expect this to be resolved with any 40.2024* update. Use the workaround from here to manually update the bootloader files.

Really? An update that causes an unbootable (albeit you can roll back) system? I kinda thought that would be the highest priority?

The long term solution is bootupd but it is not ready yet to be used in the Atomic Desktops. It would've been included in F40 but it was deferred because unfortunately iirc travier was swamped with work at that time.

bnordgren commented 1 week ago

Workaround above fixed my issue this morning. Going from 39.20240602.0 to 39.20240621.0.

Laserology commented 1 week ago

Workaround also fixed my issue.

travier commented 4 days ago

For folks interested in helping us with this issue, I've written a draft for an article to be published on the Fedora Magazine: https://fedoramagazine.org/?p=40664&preview=1&_ppp=2f0e5b87ab

It includes a "simpler" workaround that would benefit from a little bit more testing. I have also not yet looked at the history to find which versions were impacted first, so if folks can find that it would be helpful.

Thanks!

TimonLukas commented 4 days ago

@travier Just tried it on Kinoite (through Ublue), everything worked as expected! Thank you!

miabbott commented 2 days ago

It includes a "simpler" workaround that would benefit from a little bit more testing. I have also not yet looked at the history to find which versions were impacted first, so if folks can find that it would be helpful.

I did some testing this weekend using a VM and the problem seems to be introduced with the 6.9 kernel in Fedora. I found this affects both Fedora Silverblue 39 (kernel-6.9.4-100.fc39) and Fedora Silverblue 40 (kernel-6.9.4-200.fc40).

The problematic kernel is introduced in the following versions:

The preview link to the Fed Mag post has expired, but I used the workaround information in this issue successfully.
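
Based only on the observation above that the 6.9 kernel appears to be the first affected one in these tests, a rough check for whether a given kernel release falls in that range could look like the sketch below. It assumes GNU `sort -V` is available; `kernel_at_least` is a hypothetical helper, and the 6.9 threshold is taken from this comment rather than from any official statement. A system whose bootloader has already been updated would not be affected regardless of the result.

```shell
# Return 0 if kernel release HAVE (e.g. from `uname -r`) is >= WANT.
# The package suffix such as "-200.fc40.x86_64" is stripped first.
kernel_at_least() {
    have=${1%%-*}
    want=$2
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

For instance, `kernel_at_least "$(uname -r)" 6.9 && echo "kernel is 6.9 or newer"` could be run on a booted deployment to see whether it already carries a kernel in the range reported here.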

travier commented 2 days ago

Thanks @miabbott and @TimonLukas !

travier commented 2 days ago

Updated preview: https://fedoramagazine.org/?p=40664&preview=1&_ppp=8e94781824

miabbott commented 1 day ago

@travier if you want to update the article with IoT information, it looks like the fedora/stable/x86_64/iot ref received the affected kernel as part of 40.20240617.0 (62c8ff246886838c8b5df7ca5ff060fccee8705fa7114f3ec47dad0103ac3ba9)

miabbott commented 1 day ago

@travier See https://discussion.fedoraproject.org/t/update-originated-bad-shim-error/124397

Should we include instructions about bringing Kinoite/Silverblue 39 up-to-date before performing the workaround?

travier commented 1 day ago

Ah, that's a good point, we indeed need that :/

travier commented 1 day ago

Hum, but why did this not happen when I tested on F39? I'll test again.

juhp commented 9 hours ago

Do you want to update the preview link? (maybe it is only valid until updated or times out?)