Closed ximGBu4cyQss5P closed 9 months ago
RAID0 doesn't matter; if OpenZFS can import it, ZFSBootMenu can boot it. How many zpools does this system have? If it's more than one, and `bootfs` is set for `zroot`, setting `zbm.prefer=zroot` on the ZFSBootMenu command line via `zbm-kcl` should resolve the issue.
Thanks. There are 2 pools on that machine: the RAID0 NVMe one I created for ZBM, and a 4-disk RAID10 for data storage. I added `zbm.prefer=zroot` to the default command line, but I'm still seeing the same result: I get dropped into ZBM instead of it automatically booting `zroot/ROOT/tumbleweed`, which is set as `bootfs` for that pool.
Where exactly did you add `zbm.prefer=zroot`? If you added it to `org.zfsbootmenu:commandline` on a pool, that won't do anything. It needs to be added to the kernel command line of the EFI bundle that you're booting, via `zbm-kcl -e /path/to/your/efi`.
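For reference, the same change can also be made non-interactively with `-a`. This is just a sketch; the bundle path below is an example, so point it at the EFI file your firmware actually boots:

```shell
# Append zbm.prefer=zroot to the kernel command line embedded in an
# EFI bundle and overwrite the bundle in place (-u).
# The path is an example -- substitute the bundle your firmware boots.
zbm-kcl -a zbm.prefer=zroot -u /boot/efi/EFI/zbm/zfsbootmenu.EFI

# Print the resulting command line to confirm the change took effect.
zbm-kcl /boot/efi/EFI/zbm/zfsbootmenu.EFI
```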
I had updated the command line for the ZBM EFI bundle, so it now reads `quiet loglevel=0 nomodeset zbm.prefer=zroot`. I thought maybe the system was actually finding ZBM on nvme2n1 instead of nvme1n1, since the EFI partition is duplicated to both. But I updated those command lines too with zbm-kcl, so now all 4 possible EFI bundles (main and backup on both disks) have the same command line. I'm still getting the same results: the weird repeated `^[[26~` instead of the countdown, then after my password I get dropped into ZBM.
Can you upload a copy of the EFI bundle that you're using to boot your system? Can you also share the output of `zpool get bootfs` from your booted machine?
Sure thing. Here's the output of `zpool get bootfs`:

NAME   PROPERTY  VALUE                  SOURCE
HDD    bootfs    -                      default
zroot  bootfs    zroot/ROOT/tumbleweed  local

Is there a specific place you want the EFI bundle uploaded? It looks like the max here is 25MB.
Dropbox or Google Drive would work. Otherwise you can make a repository on GitHub and upload it there.
Can you also post the output of `zreport`? A photo of the console is fine if you don't want to jump through hoops to capture the text.
Sure thing, here you go...
MD5 for the EFI bundle: 93553b9d07d8eed4c85edead5f7abb0b
Thank you for uploading both of those! I've retrieved both of the files.
The zreport data looks fine at first glance. Do you connect to ZBM through an iDRAC/IPMI interface, or is it directly through a local display/keyboard?
If you're willing to reboot this system again, you could try setting `zbm.skip` on the EFI's KCL. That'll completely bypass the countdown screen and should result in it directly booting your `bootfs` value on the preferred pool. Obviously this isn't a permanent solution, but it can possibly help us narrow down what might be happening here.
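In case it helps, a sketch of that test with `zbm-kcl` (the bundle path is an example; adjust to your ESP layout):

```shell
# Temporarily bypass the countdown/menu entirely so ZBM boots the
# preferred pool's bootfs directly. Example path -- adjust as needed.
zbm-kcl -a zbm.skip -u /boot/efi/EFI/zbm/zfsbootmenu.EFI

# Undo the test afterwards:
# zbm-kcl -r zbm.skip -u /boot/efi/EFI/zbm/zfsbootmenu.EFI
```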
I finally figured this out, and the culprit is interesting and surprising to me. Your question piqued my interest. While this particular machine doesn't have an IPMI interface, last night I began suspecting the problem had something to do with the HDMI KVM it's hooked up to, since it seems like ZBM is getting some kind of spurious input. After many, many reboots today testing different cabling configurations, it turns out the KVM isn't to blame. It was the Logitech H820e headset. ZBM is getting some kind of USB input from this device, and it happens even without the KVM involved, with the headset hooked up directly to the computer. I thought it might be some weird chipset interaction quirk on this old dual-Xeon Z840 workstation, but the exact same thing happens on a much newer AMD Zen 3 laptop where I have a minimal Xubuntu install with ZBM. I also happen to have a second one of these Logitech H820e headsets in the house in my wife's office, so I tested that one too. Same result, so it's not just some weird defect in mine.

Short of the "throw the headset in the trash and buy another one" solution, is there any way to blacklist / block certain devices by USB device ID within the ZBM EFI bundle, similar to how I might do it with udev rules in a fully booted system? Also, `zbm.prefer=zroot` isn't needed in the end: once the problematic device is removed, ZBM does the right thing with the 2-pool setup, so that's another positive.
There is no way for you to ignore this device in a pre-built release image, but you can build a custom image with whatever udev rules or driver blacklists are appropriate.
I'm glad you were able to track down what was happening! I suspected that something was giving you spurious input, interfering with the countdown screen. I would not have guessed in a million years that it was a USB headset!
ZFSBootMenu itself doesn't have a mechanism to blacklist a USB device, but if you build a custom image, you can add the appropriate udev rules file to it. https://docs.zfsbootmenu.org/en/v2.3.x/guides/general/container-building.html is probably the best option for you, since building natively on openSUSE could be tough.
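As a rough sketch of the kind of rule you'd drop into a custom image: the vendor/product IDs below are placeholders, so check `lsusb` for the headset's real IDs before using anything like this.

```
# /etc/udev/rules.d/90-ignore-headset.rules (example path inside the custom image)
# 1111:2222 are PLACEHOLDER IDs -- substitute the headset's real vendor:product
# pair as reported by `lsusb`. Setting authorized=0 de-authorizes the device at
# the USB level, so any spurious input from it never reaches the console.
ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="1111", ATTR{idProduct}=="2222", ATTR{authorized}="0"
```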
ZFSBootMenu build source: Release EFI
ZFSBootMenu version: 2.3.0
Boot environment distribution: openSUSE Tumbleweed
Problem description
First of all, great work on an awesome project! I've previously set up ZBM on a single disk on my laptop, and a multiboot with shared keys in a VM just to get somewhat familiar with the process. I ran into an issue trying to use a RAID0 pool, and I assume it's more likely that I've just done something wrong than that this is an issue with ZBM.
I installed Tumbleweed on a RAID0 pool, and the usual install steps went fine. But on the first reboot into that pool, I didn't see the usual ZBM countdown; instead I got a few repeating sequences of `^[[26~`, then a brief splash of what looked like the ZBM countdown screen. Then I got the usual password prompt for zroot. Once I entered my password, my default (and only) BE didn't boot; I got dropped into ZBM. Here, the first time I hit ENTER to boot my BE, I got dropped into its chroot. I assumed things were borked, but the pool and datasets looked fine, /boot looked good, the ZFS properties seemed to be set okay, and I even scrubbed the pool for fun with no errors. I rebooted and again got dropped into ZBM instead of the BE booting, but this time when I hit ENTER the BE booted fine. And that's what has been happening ever since.
The only real differences to anything I've done before that seem important are...
Are there any gotchas or tips to be aware of for RAID0 pools?
Steps to reproduce
Here are the steps I took. This was done from a real Tumbleweed host on /dev/nvme0n1 that already uses ZFS for non-root storage. Kernel 6.6.6, ZFS 2.2.2, ZBM 2.3.0.