fedora-iot / iot-distro

Issue tracking for the Fedora IoT Edition
BSD 3-Clause "New" or "Revised" License

Upgrade from f40 to f41 failed, new system cannot boot. #56

Closed mxj4 closed 1 week ago

mxj4 commented 3 weeks ago

Describe the bug

I rebased from stable (f40) to devel (f41). On reboot, a systemd service failed because the root filesystem was not properly mounted during startup. The error was:

    Aug 22 00:52:29 localhost (ab-check)[1483]: initrd-parse-etc.service: Failed at step EXEC spawning /usr/lib/systemd/systemd-sysroot-fstab-check: No such file or directory

The full systemd log is in init.txt. The file /usr/lib/systemd/systemd-sysroot-fstab-check was indeed not there, but it was available under /sysroot. I'm not sure whether initrd-parse-etc.service was running at the right time.

To Reproduce

I may not know the exact steps to reproduce this, because I don't know ostree/rpm-ostree well. I made many tweaks to the initramfs and then reverted some of them. I'm not 100% sure the tweaks were done correctly, or that the reverts actually took effect.

  1. Install IoT 40 (version around 20240801).
  2. Install dracut-sshd, which I use to type in the LUKS passphrase remotely:
    sudo curl https://copr.fedorainfracloud.org/coprs/gsauthof/dracut-sshd/repo/fedora-40/gsauthof-dracut-sshd-fedora-40.repo -o /etc/yum.repos.d/gsauthof-dracut-sshd.repo
    sudo rpm-ostree install dracut-sshd systemd-networkd
    sudo reboot
  3. Make many changes in /etc/dracut-sshd and /etc/systemd/network.
  4. Try tweaking the LUKS flags by changing /etc/crypttab. I found that this file is not read during system startup, so I tried some rpm-ostree commands to make files visible to the initramfs. I ran the following commands multiple times and have forgotten the order:
    rpm-ostree initramfs --enable --arg=-I --arg=/etc/crypttab
    sudo rpm-ostree initramfs-etc --track /etc/crypttab
  5. I then found I don't really need this, because I can just use rpm-ostree kargs --append, so I tried to revert the initramfs-related changes. I forgot which commands I used for the revert, I'm not sure the changes were actually reverted, and I'm not sure whether any of them matter to the startup error I saw later, after the upgrade to the devel branch.
  6. Rebase to the IoT devel branch (version around 20240821).
  7. Reboot, then ssh into the initramfs to unlock the LUKS2-encrypted root partition. At this point I noticed that startup had failed and I was in emergency mode.
  8. Check journalctl, which shows the failure.
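
For anyone retracing steps 4 and 5, the following is a minimal sketch of how the initramfs customizations could be inspected and undone before a rebase. It is not the sequence that was actually run here (that sequence is unknown). The helper names `run` and `revert_initramfs_changes` and the `DRY_RUN` variable are hypothetical. The script assumes the changes were made with `rpm-ostree initramfs --enable` and `rpm-ostree initramfs-etc --track`, and by default it only prints the commands.

```shell
#!/bin/sh
# Sketch: inspect and undo initramfs customizations before a rebase.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the three steps assumed to undo steps 4 and 5.
revert_initramfs_changes() {
    # Show the booted deployment, including layered packages and initramfs state.
    run rpm-ostree status -b
    # Stop tracking every /etc file that was added with --track.
    run sudo rpm-ostree initramfs-etc --untrack-all
    # Turn off client-side initramfs regeneration and use the image's initramfs.
    run sudo rpm-ostree initramfs --disable
}

revert_initramfs_changes
```

The commands may report an error if the corresponding customization is not active, and the new deployment they create only takes effect after a reboot. Comparing `rpm-ostree status -b` before and after is one way to confirm the revert.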

Expected behavior

Reboot after rebase should work, regardless of my tweaks around initramfs.

OS version:

Please replace this line with output of `rpm-ostree status -b`

nullr0ute commented 3 weeks ago

Does it work if you upgrade to f41 before you install dracut-sshd and make the "lots of changes"? We need a defined reproducer here: describing the steps as "lots of changes" makes it extremely hard for us to reproduce the problem so that we can fix it.
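
The ordering asked about above could be sketched as a two-phase script, run on a fresh f40 install. This is only an illustration under assumptions. The ref `fedora/devel/x86_64/iot` assumes an x86_64 system on the default remote. The names `step`, `reproducer_phase`, `REF`, and `DRY_RUN` are hypothetical. Phase 2 assumes the dracut-sshd repo file is already in /etc/yum.repos.d. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of a defined reproducer: rebase a clean install first, layer packages afterwards.
# REF is an assumption (x86_64 devel ref); adjust it for your architecture and remote.
REF="${REF:-fedora/devel/x86_64/iot}"
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Phase 1 runs on the fresh f40 install; phase 2 runs after the reboot into f41.
reproducer_phase() {
    case "$1" in
        1)
            step rpm-ostree status -b          # record the starting deployment
            step sudo rpm-ostree rebase "$REF"
            step sudo systemctl reboot
            ;;
        2)
            step rpm-ostree status -b          # record the rebased deployment
            step journalctl -b -u initrd-parse-etc.service
            step sudo rpm-ostree install dracut-sshd systemd-networkd
            ;;
        *)
            echo "usage: reproducer_phase 1|2" >&2
            return 1
            ;;
    esac
}

reproducer_phase "${1:-1}"
```

Attaching the `rpm-ostree status -b` and `journalctl` output from each phase to the issue would show exactly which step, if any, breaks the boot.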

pcdubs commented 1 week ago

How was this system installed originally? I think you may be hitting https://bugzilla.redhat.com/show_bug.cgi?id=2305291, which affects installations from anaconda/ISO but not the pre-generated disk images.

pcdubs commented 1 week ago

PRs with the fix: f41, rawhide

Verified that upgrades from F40 to F41 are working. The OpenQA rebase tests are now passing: https://openqa.fedoraproject.org/tests/2853335

pcdubs commented 1 week ago

This should now be fixed. If your issue persists, please reopen this with a reproducer for us to look at.