procount / pinn

An enhanced Operating System installer for the Raspberry Pi

RPi5 w/PCIE NVME doesn't boot Ubuntu 24.04 or Bookworm64 #827

Closed RanchoHam closed 2 months ago

RanchoHam commented 3 months ago

Describe the bug: Ubuntu 24.04 repeatedly panics on boot. Ubuntu 23.10 will install and boot; but upon reboot after distribution upgrade, the kernel panics. Bookworm64 gets stuck on the splash screen. Bookworm32 will install and operate.

To reproduce: The following presumes that a bootable OS is already present on the NVME drive. If starting with a bare drive, skip to step 3.

  1. Boot RPi5 without NVME installed into network RPi Imager.
  2. Install Bookworm64 on USB drive and boot from USB drive. Use sudo rpi-eeprom-config --edit to change the BOOT_ORDER from 0xf416 to 0xf414. Shut down.
  3. Install NVME drive, and reboot without USB or SD card. Use the network RPi Imager to install Ubuntu 23.10, Ubuntu 24.04, Bookworm64, and Bookworm32. Shut down.
  4. Boot from USB drive created in step 2. Use sudo rpi-eeprom-config --edit to change the BOOT_ORDER to 0xf416. Shut down.
  5. Boot and select one of the 4 available OSs. Two will fail and two will succeed.

Expected behaviour: All four OSs will boot without issue.

Actual behaviour: Ubuntu 24.04 will kernel panic. Bookworm64 will hang at the splash screen. Bookworm32 will operate normally. Ubuntu 23.10 will work until a distribution upgrade is done, at which point it is effectively Ubuntu 24.04 and it kernel panics on boot.

System Add answers to the following questions:

Logs: None recovered.

Additional context: None.

RanchoHam commented 3 months ago

Additional error message showed up upon next retry of installing the above 4 OSs: OS: 'ubutu2404' needs a partition label of 'system-boot' which is not available

procount commented 3 months ago

There have been documented issues with Ubuntu 24.04, even without multi-boot, so we may have to wait for them to be fixed. What do you mean by Bookworm64 and Bookworm32? 'Bookworm' is a version name for debian based systems, including Debian itself, Raspberry Pi OS and Ubuntu. It is not a Linux Distribution. Should I assume you mean "Raspberry Pi OS" for 32 or 64-bit systems? But there are 3 separate distributions provided: full, standard and Lite. Which do you mean?

I would advise using a BOOT_ORDER of f461 rather than f416, so that you can always recover from any issues with your NVME by inserting an SD card.
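
For readers following along, here is a minimal sketch of how a BOOT_ORDER value can be decoded. It assumes the bootloader reads the value one hex digit at a time starting from the rightmost digit, and the mode names in the table are my recollection of the Raspberry Pi bootloader documentation, so treat them as an assumption and check the official docs. `decode_boot_order` is a hypothetical helper, not part of PINN or rpi-eeprom.

```python
# Boot mode names as recalled from the Raspberry Pi bootloader docs (assumption).
BOOT_MODES = {
    0x0: "SD-CARD-DETECT",
    0x1: "SD-CARD",
    0x2: "NETWORK",
    0x3: "RPIBOOT",
    0x4: "USB-MSD",
    0x5: "BCM-USB-MSD",
    0x6: "NVME",
    0x7: "HTTP",
    0xE: "STOP",
    0xF: "RESTART",
}


def decode_boot_order(value: int) -> list[str]:
    """Return boot modes in the order they are tried (rightmost hex digit first)."""
    modes = []
    while value:
        nibble = value & 0xF  # lowest hex digit is tried first
        modes.append(BOOT_MODES.get(nibble, f"UNKNOWN(0x{nibble:x})"))
        value >>= 4  # move on to the next digit
    return modes


if __name__ == "__main__":
    for order in (0xF416, 0xF461):
        print(hex(order), "->", ", ".join(decode_boot_order(order)))
```

Under these assumptions, 0xf416 tries NVME before the SD card, while 0xf461 tries the SD card first, which is why an inserted SD card can act as a recovery path.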

I assume you have added the necessary eeprom_config options to enable NVME?

RanchoHam commented 2 months ago

  1. Bookworm(64 or 32) was short-hand for the standard desktop (64-bit or 32-bit) Raspberry Pi OS Bookworm version distributed by the RPi Network Imager.
  2. Thanks for the f461 tip. I forgot that swapping the digits swaps the order.
  3. Yes, I have enabled the PCIE NVME options; and PINN is installed and booting from the NVME drive. Hardware is Argon One V3 case with built-in PCIE NVME adapter and a Kingston NV2 NVME stick. Bookworm64 installed, booted, and ran quite well on bare metal using the RPi Network Imager.

Forgive me, but I have forgotten the method to extract and forward the PINN installation logs.

procount commented 2 months ago

> Bookworm(64 or 32) was short-hand for the standard desktop (64-bit or 32-bit) Raspberry Pi OS Bookworm version distributed by the RPi Network Imager.

ok, but full, std or lite version? Probably won't help much anyway, but will help me to duplicate your problem.

> Hardware is Argon One V3 case with built-in PCIE NVME adapter

That's what I have but with a Pinedrive 256GB NVMe SSD (2242).

> Forgive me, but I have forgotten the method to extract and forward the PINN installation logs.

See https://github.com/procount/pinn/wiki/Troubleshooting

Currently I have used PINN to install Raspios_arm64_full & raspios_arm64_lite, Ubuntu23.10 and kde3lee64. No problems with any of these. Given there are known problems with Ubuntu24.04 on NVME, the only one I would be concerned about is "bookworm64". Raspios is the OS I would least expect to fail. Please describe "Hang at the Splash screen" - Is the splash screen continuously displayed, or does the screen go black?

RanchoHam commented 2 months ago

> > Bookworm(64 or 32) was short-hand for the standard desktop (64-bit or 32-bit) Raspberry Pi OS Bookworm version distributed by the RPi Network Imager.

> ok, but full, std or lite version? Probably won't help much anyway, but will help me to duplicate your problem.

std version

I have included a photo of the splash screen that Bookworm64 hangs at. The screen does not blank (at least for the 15 min or so that I have been writing this). [photo: Bookworm64freeze]

procount commented 2 months ago

Sorry, I missed the standard part.

No idea why bookworm64 has frozen for you. I guess you have already tried reinstalling it? I have made a hybrid version of Ubuntu 24.04 that seems to work on NVME though. I will push up a conversion soon.

RanchoHam commented 2 months ago

Yes, I have tried reinstalling multiple times. I will try installing the full Bookworm version.

At the moment I'm trying to off-load the debug and dmesg files to a USB-attached thumb drive. It seems to get mounted to the /mnt directory as ro. I tried the mount -o remount,rw /mnt command, but the write failed. I will try with a new thumb drive SD card.

procount commented 2 months ago

The debug and dmesg files will help debug the installation of Raspios, which looks like it worked. However, it will provide information about the layout of your disks, so it might still help, although probably not much.

The drive you are booting PINN from is mounted on /mnt, so your USB stick will only be on /mnt if you booted from it. If you have booted from SD card, you will have to manually mount the USB stick, unless it has been automounted at /tmp/media/sda1 if it looked like a PINN drive with a /os folder.

I'm not sure how to debug hanging on the splash screen. You might have some logs on your raspios root drive in /var/log which you could review (maybe messages.log)

RanchoHam commented 2 months ago

Since I can't easily get to the SD card in the Argon One cases, I changed the EEPROM boot order to f64; and I can now boot from a thumb drive or NVME drive.

Something else has changed: I can't get Bookworm32 or 64 to install as both now hang at the splash screen.

When you do a Bookworm install, does yours go from the splash screen to the white background configuration or does it do the configuration first before first boot? I'm speculating here that there is some cruft left-over after partition wipe which is trying to bypass the normal (annoying as it is) configuration screen.

RanchoHam commented 2 months ago

OK, more information. I tried replacing raspios_arm64_full with raspios_arm64_lite. It got to the point of expanding the filesystem to fill the partition and popped the following error:

    Begin: Running /scripts/local-premount , , ,
    Begin: Resizing root filesystem...\n\nDepending on storage size and speed, this may take a while. ...
    Error: Can't have a partition outside the disk!
    Ignore/Cancel?

procount commented 2 months ago

Hmm, interesting. What size is your NVME? It could be that the Raspios64 std version is failing at the same point of partition expansion, but we can't see it because the log messages are suppressed. (You could try enabling the screen log messages. I think you'd have to remove 'quiet' from cmdline.txt on Raspios64, after installing it but before booting it.) Actually, the partition expansion script is pointless under PINN because the partition size is pre-determined and there is no room for expansion. Sometimes I remove it, but it doesn't normally cause a problem. It sounds like there may be something wrong with your partitioning. If you can manage to get the logs from the troubleshooting, it might help.
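
As an illustration of the 'quiet' suggestion above, here is a small sketch that strips named parameters from a kernel command line. It assumes cmdline.txt is a single line of space-separated parameters; the sample line is illustrative only and is not taken from any log in this thread, and `remove_cmdline_params` is a hypothetical helper.

```python
def remove_cmdline_params(cmdline: str, names: set[str]) -> str:
    """Drop every parameter whose name (the part before '=') is in `names`."""
    kept = [
        token
        for token in cmdline.split()
        if token.split("=", 1)[0] not in names
    ]
    return " ".join(kept)


if __name__ == "__main__":
    # Illustrative command line (assumption), not copied from a real install.
    sample = "console=tty1 root=/dev/nvme0n1p7 rootwait quiet splash"
    print(remove_cmdline_params(sample, {"quiet"}))
```

The result would then be written back to cmdline.txt on the OS's boot partition as one line. Whether 'splash' also needs removing to see the messages is not something this thread confirms.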

> When you do a Bookworm install, does yours go from the splash screen to the white background configuration or does it do the configuration first before first boot?

The install should be the same under PINN as a standard install. IIRC, Raspios shows the splash screen whilst it tries to resize the root partition. Then it will reboot (actually Raspios is clever enough to reboot into itself, so it doesn't go through the PINN recovery screen at that point.) Next it will show the white configuration screen where you enter your configuration, and then I think it may reboot fully again through PINN.

> I'm speculating here that there is some cruft left-over after partition wipe which is trying to bypass the normal (annoying as it is) configuration screen.

Sounds like the partition resize script is failing, so it's not bypassing the configuration, it's just not actually getting to it. Hence I feel something is wrong with your NVME partitioning.

If all else fails, and you haven't installed much, you could set the runinstaller option in PINN to wipe everything and try installing your OSes again. Don't bother with Ubuntu24.04, even my hybrid version didn't run on the 2nd reboot on NVME.

RanchoHam commented 2 months ago

OK, I agree that it is never getting past the resize script. The NVME drive is a Kingston NV2 1TB M.2 2280 NVMe from Amazon.

OK, I'll skip the Ubuntu 24.04. Have you tried Ubuntu 22 LTS lately?

procount commented 2 months ago

I think Ubuntu 23.10 was the first version to run on a PI5, and I even had to copy the initrd from 24.04 to get that to run on the NVME.

RanchoHam commented 2 months ago

I received the necessary parts to put together another RPi5 in an Argon One w/PCIE NVME (2TB) so here are the results with a bare metal install: std Bookworm64 still hangs (log file included as log std bookworm 64), Ubuntu 23 installed from backup just fine, Bookworm64 lite hangs (log file included as log B64lite). log std bookworm 64.zip log B64lite.zip

procount commented 2 months ago

What settings have you got in rpi_eeprom_config? Which power supply are you using - the official 27W pi5 version?

RanchoHam commented 2 months ago

Sorry for the delay.

The contents of rpi-eeprom-config:

    [all]
    PSU_MAX_CURRENT=5000
    WAKE_ON_GPIO=0
    PCIE_PROBE=1
    BOOT_UART=1
    POWER_OFF_ON_HALT=1
    BOOT_ORDER=0xf416

I have used both the official 27W and an Argon One 27W.

procount commented 2 months ago

I will be busy for a while, but when I can I'll retry installing everything from scratch with a setup like yours and see what happens.

procount commented 2 months ago

I found a bit of spare time so I tried installing everything from scratch. This is the contents of my rpi-eeprom-config:

    [all]
    PSU_MAX_CURRENT=5000
    BOOT_UART=1
    POWER_OFF_ON_HALT=0
    BOOT_ORDER=0xf461
    PCIE_PROBE=1

  1. I wiped the SSD by setting runinstaller and rebooting to wipe all the extended partitions, then dropped into the PINN shell to remove the p1, p2 and p5 partitions. I had no SD card or USB drives attached.
  2. Rebooting took me to the netinstall screen. I held shift to download the rpi-imager.
  3. I installed PINN 3.9.2 to the NVME/SSD drive and rebooted.
  4. I selected raspios_arm64_full, raspios_arm64, raspios_arm64_lite and ubuntu23.10 and installed them.
  5. I booted each one in turn to configure them, rebooting into them each twice more to make sure they were stable.
  6. I had no issues with any of them.

RanchoHam commented 2 months ago

  1. I first set the rpi-eeprom-config to your exact settings & rebooted.
  2. Deselected all OSs, clicked Install & rebooted.
  3. Dropped into PINN shell, unmounted /mnt, ran parted, removed all partitions, quit parted, and switched back to graphical interface.
  4. Rebooted. Was presented with opportunity to run network Imager, so held shift down.
  5. Installed PINN from network Imager & rebooted.
  6. Selected raspios_arm64_full, raspios_arm64, raspios_arm64_lite and ubuntu23.10 and installed them.
  7. raspios_arm64_full, raspios_arm64 and ubuntu23.10 installed and tested with no issue. Hoorah!!!!
  8. However, raspios_arm64_lite failed first boot with the attached screen (see jpg file) with 4 long and 5 short flashes.
  9. Checked current rpi-eeprom-config (with std Bookworm64), which now reads: [all] PSU_MAX_CURRENT=5000 WAKE_ON_GPIO=0 PCIE_PROBE=1 BOOT_UART=1 POWER_OFF_ON_HALT=1 BOOT_ORDER=0xf416. I did not make these changes!
  10. Tried rebooting with the same result.
  11. Changed to your settings and tried rebooting raspios_arm64_lite with the same result.
  12. Tried reinstalling just the raspios_arm64_lite. Booted it with the same result.
  13. Tried replacing raspios_arm64_lite with raspios_armhf_lite. Booted it with the same result.
  14. Backed up 3 working OSs to 1TB SD card. (Note: the SD card USB adapter does make a difference! One would not complete backups, but a different holder with the same SD card completed just fine.) Replaced raspios_armhf_lite with backup of raspios_arm64 with the same bad result.
  15. Tried restoring the raspios_arm64 backup over itself. Did not work. Beginning to think the backups are all bad, in spite of the self-report of success. Falling back to regroup and rethink the path forward.
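
To make the settings change noted in item 9 easier to see, here is a rough sketch that compares two rpi-eeprom-config dumps key by key. The two inputs are the settings quoted in this thread; `parse_config` and `diff_configs` are hypothetical helpers, and the parser deliberately ignores section headers such as [all], which is a simplification.

```python
def parse_config(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments and [section] headers."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("[", "#")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_configs(a: dict[str, str], b: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (value_in_a, value_in_b)} for keys that differ; None means absent."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


REVERTED = """[all]
PSU_MAX_CURRENT=5000
WAKE_ON_GPIO=0
PCIE_PROBE=1
BOOT_UART=1
POWER_OFF_ON_HALT=1
BOOT_ORDER=0xf416
"""

SUGGESTED = """[all]
PSU_MAX_CURRENT=5000
BOOT_UART=1
POWER_OFF_ON_HALT=0
BOOT_ORDER=0xf461
PCIE_PROBE=1
"""

if __name__ == "__main__":
    for key, (old, new) in diff_configs(parse_config(REVERTED), parse_config(SUGGESTED)).items():
        print(f"{key}: {old} -> {new}")
```

For these two dumps the differing keys are BOOT_ORDER, POWER_OFF_ON_HALT and WAKE_ON_GPIO. The sketch only shows what differs, not what caused the settings to revert.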

RanchoHam commented 2 months ago

OK, now I'm really at a loss. I set aside one RPi5/Argon One/2TB NVME and grabbed the one I started this with: RPi5/Argon One/1TB NVME. This one I wiped before I set it aside.

  1. Network installed PINN.
  2. Chose the usual 4 OSs, but added 4 Project Spaces, Data Space, and Swap Space expanded to 8GB.
  3. Installed all. Successfully booted raspios_arm64, checked rpi-eeprom-update, and saw the version was from late 2023. Somewhere in the process of getting the raspios_arm64 upgraded to latest, the eeprom was also updated. Just not sure when or how. Rebooted a couple of times.
  4. Booted ubuntu 23, raspios_arm64_full, and raspios_arm64_lite. All successfully installed and booted multiple times.
  5. Will try backing up all OSs tomorrow.

If I can do backup and restore, I will try the 2TB system again.

procount commented 2 months ago

Using the most up to date boot loader will often help as it is being improved on all the time. You can also look at https://forums.raspberrypi.com/viewtopic.php?t=369760 for the latest news on installing Ubuntu 24.04.

RanchoHam commented 2 months ago

One last comment: I replaced my Crucial P3 Plus 2TB PCIe Gen4 3D NAND NVMe M.2 SSD with a Kingston NV2 1TB M.2 2280 NVMe SSD and was able to get raspios_arm64_full, raspios_arm64, raspios_arm64_lite and ubuntu23.10 working including backup and restores. I was also able to expand the data partition to 3GB using the partition table adjustment feature.

I suspect it has something to do with the 2TB SSD media. The debug error messages indicated a partition outside of the allowed values.

So, with a known work-around or limitation, I consider this closed.

procount commented 2 months ago

2TB is on the limit of MBR compatibility. Any bigger and you would need to use GPT, but PINN does not support that. Perhaps your SSD is just a bit too big?
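
For reference, a back-of-the-envelope sketch of the MBR limit. It assumes 512-byte logical sectors and treats the limit as 2^32 sectors, since MBR partition entries store the start and length as 32-bit sector counts. The 2,000,000,000,000-byte figure is a nominal "2TB" value used as an assumption; the exact size of the drive in this thread was not posted. `partition_outside_disk` is a hypothetical check that mirrors the wording of the parted error, not PINN's or parted's actual logic.

```python
SECTOR_BYTES = 512            # assumed logical sector size
MBR_MAX_SECTORS = 2**32       # 32-bit LBA fields in an MBR partition entry


def mbr_limit_bytes(sector_bytes: int = SECTOR_BYTES) -> int:
    """Approximate largest capacity an MBR partition table can describe."""
    return MBR_MAX_SECTORS * sector_bytes


def partition_outside_disk(start_sector: int, sector_count: int, disk_sectors: int) -> bool:
    """True if the partition's last sector lies beyond the end of the disk."""
    return start_sector + sector_count > disk_sectors


if __name__ == "__main__":
    nominal_2tb = 2_000_000_000_000  # assumption: decimal 2 TB, not the real drive size
    limit = mbr_limit_bytes()
    print(f"MBR limit:   {limit:,} bytes (2 TiB)")
    print(f"Nominal 2TB: {nominal_2tb:,} bytes")
    print(f"Headroom:    {limit - nominal_2tb:,} bytes")
```

With these assumptions a nominal 2TB drive sits somewhat below the 2 TiB boundary, so the arithmetic alone neither confirms nor rules out the size theory. The partition layout written during install would need checking against the real sector count.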

You could try adding the option provision=500 to cmdline.txt, which will reduce the usable size of your SSD by 500MB. If that works, try reducing it until you find the smallest value that still works. Or try increasing it if it didn't work.
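
The trial-and-error search described above can be organised as a doubling-then-bisection search, sketched below. It assumes the behaviour is monotonic (if one provision value works, every larger value works too), which this thread does not confirm. `works` is a hypothetical stand-in for "reinstall with this value and see whether it boots", so in practice each call is a manual test.

```python
from typing import Callable


def smallest_working_provision(
    works: Callable[[int], bool],
    start: int = 500,
    limit: int = 1_000_000,
) -> int:
    """Return the smallest value (in MB) for which works(value) is True."""
    if works(0):
        return 0
    low, high = 0, start
    # Increase until a working value is found ("try increasing it if it didn't work").
    while not works(high):
        low, high = high, high * 2
        if high > limit:
            raise ValueError("no working value found below the limit")
    # Now `low` fails and `high` works: bisect to the smallest working value.
    while high - low > 1:
        mid = (low + high) // 2
        if works(mid):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Simulated outcome (assumption): pretend anything from 137 MB upward works.
    print(smallest_working_provision(lambda mb: mb >= 137))
```

Bisection keeps the number of reinstalls small, roughly nine or ten trials to narrow a 500MB range down to 1MB, compared with stepping down in fixed increments.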

RanchoHam commented 2 months ago

Could be. I was able to install both Bookworm 64 desktop and Ubuntu 24 directly on the 2TB drive.

If I ever switch back to the 2TB drive, I might try the provision=500 option. For now though, I have some other things vying for my attention.

Cheers, Rich
