zbm-dev / zfsbootmenu

ZFS Bootloader for root-on-ZFS systems with support for snapshots and native full disk encryption
https://zfsbootmenu.org
MIT License

`//lib/dracut/hooks/cmdline/95-zfsbootmenu-parse-commandline.sh: Syntax error: "(" unexpected` when booting #335

Closed by lediur 2 years ago

lediur commented 2 years ago

I recently set up a Debian system with zfsbootmenu and rEFInd using the instructions in the wiki. After a few days, I updated the system from bullseye with zfs 2.0 to testing with zfs 2.1.5. The next time I booted, zfsbootmenu presented a warning and said I should rerun generate-zbm, but still allowed me to boot into Debian. In my Debian testing install, I reran generate-zbm and rebooted.

Now, if I try to use rEFInd to boot the most recently generated entry, the boot process fails with the error:

[ ... ] Run /init as init process 
/init: 5: //lib/dracut/hooks/cmdline/95-zfsbootmenu-parse-commandline.sh: Syntax error: "(" unexpected
dracut Warning: Signal caught!

It drops to the emergency shell, but the shell is unresponsive to USB keyboard input.

I've tried adding init=/bin/bash to the kernel arguments with no change.

I'm not very familiar with dracut or the surrounding boot process, but I did a bit of digging nevertheless. The error seems to suggest that the script is being run by sh or dash rather than bash. The failure appears to be at line 5 of the affected file.
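A minimal sketch of how that suspicion could be checked outside the initramfs, assuming bash and/or dash are installed on the machine running it. The helper name `accepts_arrays` and the one-line test file are illustrative and not part of ZFSBootMenu; the check only asks each shell to parse a bash array assignment without executing it.

```shell
#!/bin/sh
# Hypothetical helper: report whether a given shell can parse bash array syntax.
# Usage: accepts_arrays <shell>
accepts_arrays() {
    tmp=$(mktemp) || return 2
    # One line of bash-only syntax: an indexed array assignment.
    printf '%s\n' 'args=(one two three)' > "$tmp"
    # -n asks the shell to read the file without executing any commands.
    if "$1" -n "$tmp" 2>/dev/null; then
        rc=0
    else
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}

# Only test the shells that are actually installed.
for shell in bash dash; do
    command -v "$shell" >/dev/null 2>&1 || continue
    if accepts_arrays "$shell"; then
        echo "$shell: accepts array syntax"
    else
        echo "$shell: rejects array syntax"
    fi
done
```

A shell that rejects the file here would be expected to fail on a hook that uses arrays in much the same way as the boot log above.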

Any idea how I can recover my ZBM setup? I've tried fiddling a bit with a chrooted live USB, but I'm not really sure what I need to fix in order to execute this script properly short of rewriting the hook to not use Bash arrays and rebuilding.

zdykstra commented 2 years ago

When you're on the rEFInd screen and you hit F2, do you have access to previous versions of ZFSBootMenu? If you do, you can simply boot one of those for the time being. The warning to upgrade your version of ZFS in ZFSBootMenu isn't fatal; it'll just prevent you from doing any read-write operations inside ZBM. You can continue to boot normally.

If you don't have any previous versions available that can be selected by rEFInd, your next best option is to download the ZFSBootMenu 2.0.0 release EFI - it ships with ZFS 2.1.5 and is capable of booting any system/pool. It can be placed on a USB drive with a vfat partition and rEFInd should find it and let you boot from it.

As far as the error you're seeing, bash is a hard requirement of ZFSBootMenu. Debian/Dracut must be getting confused in some fashion and replacing /bin/bash with dash. This will require some investigation on our end to determine how, exactly, that can happen.

ahesford commented 2 years ago

I'm not sure why this would present any problem; the command-line parser in question works on several test platforms and several real systems with both dracut and mkinitcpio. I'll see if I can reproduce with a Debian testbed (I'm pretty sure I tested a Debian image before cutting the release, but I'll try again.)

In the meantime, I see zdykstra has already offered the advice that I was going to give.

lediur commented 2 years ago

Hey, thanks for the quick replies! I was also very confused by the error, since I was successfully booting using ZBM for several days beforehand. I think the generated initramfs / vmlinuz that worked was probably built during the setup process with bullseye, not from testing though.

Unfortunately I only get the one "Boot with default options" menu entry when I press F2 on the ZBM entry in rEFInd. This is probably my fault - when I originally installed rEFInd as part of the setup process, it ended up mounting the ESPs of two drives and dropping files across both, which caused some confusion and false starts. I recently tried to consolidate by moving the older ZBM initramfs / vmlinuz files to the correct drive, but those older loaders don't seem to register in rEFInd anymore.

I'll try the EFI loader on the USB drive next, and I'll also try to get the older loader working again.

lediur commented 2 years ago

Ah okay I found a typo in my refind.conf :sweat_smile: so I'm able to boot with the old loader. Let me know if there's any testing you'd like me to try out.

ahesford commented 2 years ago

I misread your original message about moving to zfs 2.1.5 and overlooked that you moved your whole system to the Debian testing branch. We don't have a testbed that mimics this setup.

My suspicion is that you are somehow missing bash and /bin/bash is a symlink (maybe an alternative gone wrong?) to /bin/sh, which is dash. I wouldn't think the alternatives system would allow dash to stand in for bash, but I can't imagine another scenario where the shell in your initramfs rejects valid bash array syntax and the source keyword. ZFSBootMenu (and dracut itself) explicitly demand bash; if it were missing, the generation would fail outright.
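A quick way to test the symlink theory on the running system might look like the sketch below. The function name `resolve_shells` is made up for illustration, and the check only inspects the host filesystem, not the contents of the generated initramfs.

```shell
#!/bin/sh
# Print what each shell path on the running system ultimately resolves to.
resolve_shells() {
    for path in /bin/sh /bin/bash; do
        if [ -e "$path" ]; then
            # readlink -f follows every symlink in the chain.
            printf '%s -> %s\n' "$path" "$(readlink -f "$path")"
        else
            printf '%s is missing\n' "$path"
        fi
    done
}

resolve_shells
```

If /bin/bash resolved to dash, that would support the "alternative gone wrong" theory; if it resolves to a real bash binary, the problem is more likely inside the image than on the host.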

In the meantime, I'd recommend sticking with our prebuilt EFI if that configuration is suitable for your needs. If you need custom features, you can build in a container to use a supported Linux environment with ZFS 2.1.5.

Sithuk commented 2 years ago

The issue is present on Ubuntu 22.10 too.

ahesford commented 2 years ago

Not in my testing. Performing a simple s/jammy/kinetic/g in every *ubuntu* file (or its link target) in https://github.com/zbm-dev/zfsbootmenu/tree/master/testing/helpers and running our testing setup script produces a 22.10 virtual machine image and corresponding ZBM images that can boot the VM just fine.

This is a misconfiguration on the host, maybe caused by the upgrade process itself.

Sithuk commented 2 years ago

I just tried a fresh install of Ubuntu 22.10 using the Ubuntu Zfsbootmenu script and experienced the same error.

@lediur : How are you installing Debian? Were you able to fix the issue?

ahesford commented 2 years ago

Try adding add_dracutmodules+=" bash " to a *.conf file in /etc/zfsbootmenu/dracut.conf.d/.
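As a sketch, the suggestion could be applied as a small drop-in file. The file name `bash.conf` is an assumption; any name ending in `.conf` in that directory should match the description above.

```shell
# /etc/zfsbootmenu/dracut.conf.d/bash.conf  (file name is an assumption)
# Ask dracut to include its bash module in the ZFSBootMenu image.
# The spaces inside the quotes keep this entry separate from other modules.
add_dracutmodules+=" bash "
```

The image would then need to be rebuilt with generate-zbm for the change to take effect.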

lediur commented 2 years ago

Hey, since I still had the old bootloader in rEFInd I've been using that and haven't tried to resolve the issue since then.

I think I remember looking at /bin/bash and it wasn't a symlink to /bin/sh, but I hadn't dug any deeper.

I'll try generating a new one with the conf change you just mentioned @ahesford

derrick@monolith ~/Downloads [2]> /bin/bash --version
GNU bash, version 5.2.0(1)-rc2 (x86_64-pc-linux-gnu)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

@Sithuk: I followed the steps in the wiki and then later upgraded Debian to testing by changing my apt sources. I had generated the old bootloader entry while still in bullseye.

ahesford commented 2 years ago

I think the problem is that your Debian or Ubuntu configurations install dash as /bin/sh in the initramfs, and ZFSBootMenu's command-line parser is sourced by /init, which has a /bin/sh shebang. The bash dracut module forces /bin/sh to point to bash. If adding this module fixes the problem, we may just add it to the standard ZBM dracut config to force the issue.
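The mechanism can be illustrated with a small sketch, assuming bash and/or dash are available on the machine running it. The temporary file below is a hypothetical stand-in for a hook, not the real ZFSBootMenu script, and `source_with` is an illustrative helper. It shows that a sourced file is interpreted by the shell doing the sourcing, so a bash shebang inside the file has no effect.

```shell
#!/bin/sh
# Hypothetical stand-in for a hook: a bash shebang followed by bash-only syntax.
hook=$(mktemp) || exit 1
trap 'rm -f "$hook"' EXIT
cat > "$hook" <<'EOF'
#!/bin/bash
opts=(alpha beta)
echo "parsed ${#opts[@]} options"
EOF

# Source the hook from the named shell, much as a #!/bin/sh init script would.
# The shebang inside the hook is only a comment when the file is sourced.
source_with() {
    "$1" -c '. "$1"' "$1" "$hook"
}

# Only test the shells that are actually installed.
for shell in bash dash; do
    command -v "$shell" >/dev/null 2>&1 || continue
    if source_with "$shell" 2>/dev/null; then
        echo "$shell: sourced the hook"
    else
        echo "$shell: failed to source the hook"
    fi
done
```

When bash does the sourcing, the array line parses and the hook runs. When dash does it, the same file is expected to stop at the array assignment with a syntax error, which is consistent with the failure reported at boot.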

Sithuk commented 2 years ago

> Try adding add_dracutmodules+=" bash " to a *.conf file in /etc/zfsbootmenu/dracut.conf.d/.

Perfect. That fixed it. Thank you.

I added bash to the existing add_dracutmodules line in /etc/zfsbootmenu/dracut.conf.d/zfsbootmenu.conf as follows.

add_dracutmodules+=" zfsbootmenu bash "

Then I ran generate-zbm.

ZFSBootMenu started as expected during boot.

I also went back to my 22.04 to 22.10 upgrade VM and did the same. That fixed the ZFSBootMenu boot issue in that VM too.

Thank you.