zbm-dev / zfsbootmenu

ZFS Bootloader for root-on-ZFS systems with support for snapshots and native full disk encryption
https://zfsbootmenu.org
MIT License

zfsbootmenu UEFI loads and boots kernel, but Linux is in legacy mode #681

Closed SumOys closed 1 week ago

SumOys commented 1 week ago

ZFSBootMenu build source

Release EFI

ZFSBootMenu version

zfsbootmenu-release-x86_64-v2.3.0-vmlinuz.EFI

Boot environment distribution

Gentoo Linux

Problem description

I'm using the prebuilt binary for ZFSBootMenu on Gentoo (from https://get.zfsbootmenu.org/efi). Previously I was using GRUB + LUKS + btrfs. I followed the directions (mostly going by the Void Linux docs) to get it running on Gentoo.

I can successfully boot 6.6.52-gentoo from a ZFS volume that has native encryption. ZFSBootMenu unlocks the volume, shows me a kernel list, and boots the kernel with its dracut initramfs. The trouble is that after I boot, running efibootmgr prints "EFI variables are not supported on this system."

Huh, so maybe the efi vars file system didn't get mounted.

root@ellie /h# mount none /sys/firmware/efi/efivars -t efivarfs
mount: /sys/firmware/efi/efivars: mount point does not exist.
       dmesg(1) may have more information after failed mount system call.

Well, this is weird. CSM/Legacy mode is disabled entirely in my BIOS. It's set to UEFI only. I booted ZFSBootMenu via my motherboard's built-in UEFI shell:

image

But I can't change any UEFI boot parameters from within Linux. I can change them from ZFSBootMenu:

image

I know it looks a bit weird, but I use two disks and RAID1 (with 1.0 metadata) for the ESP. So I have both mirrors of the ESP listed for primary/backup. It's like this diagram, but replace the RAID0 block with my ZFS volume (just spread across both drives, striped, no redundancy).

image

So right now everything is technically fine. I can boot. If I go into ZFSBootMenu, I can get to efibootmgr and adjust things. But I still want to know: how am I not in UEFI mode when booted into Linux? I'm missing the efivarfs mount point in /sys, and if I try to mount it anywhere else, I get "Operation not supported" with nothing in dmesg:

mount none /tmp/a -t efivarfs
mount: /tmp/a: mount(2) system call failed: Operation not supported.
       dmesg(1) may have more information after failed mount system call.

So what exactly is going on here? Is ZFSBootMenu (which is run in UEFI mode) somehow booting my kernel in legacy BIOS mode? Or is there some other weird reason I can't access UEFI variables from my system?
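One way to check which mode the running kernel believes it is in, independent of efibootmgr, is to look for the /sys/firmware/efi directory, which the kernel normally creates only when its EFI support is active. A minimal sketch follows; the function name boot_mode and its optional sysfs-root argument are made up for illustration and are not part of any tool mentioned in this thread.

```shell
# boot_mode [SYSFS_ROOT]: print "UEFI" if the kernel exposes EFI firmware
# data under sysfs, "no-EFI" otherwise.
# SYSFS_ROOT defaults to /sys; the argument exists only to make the
# function easy to try against a scratch directory (hypothetical helper).
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "no-EFI"
    fi
}

# Report the mode of the currently running kernel.
boot_mode
```

A "no-EFI" result on a machine whose firmware is set to UEFI only would match the symptoms described above: the firmware booted in UEFI mode, but the running kernel has no EFI runtime interface.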

Steps to reproduce

I'm going to make a tutorial at some point, similar to this one, so I tracked all my commands:

New Drives

zpool create -f -o ashift=12 \
 -O compression=lz4 \
 -O acltype=posixacl \
 -O xattr=sa \
 -O relatime=on \
 -O encryption=aes-256-gcm \
 -O keylocation=file:///etc/zfs/zroot.key \
 -O keyformat=passphrase \
 -o autotrim=on \
 -m none zroot /dev/nvme0n1p3 /dev/nvme2n1p3
zfs create -o mountpoint=none zroot/ROOT
zfs create -o mountpoint=/ -o canmount=noauto zroot/ROOT/gentoo
zfs create -o mountpoint=/home zroot/home
zfs create -o mountpoint=/var zroot/var
zpool set bootfs=zroot/ROOT/gentoo zroot
mkdir /newroot
zpool import -N -R /newroot zroot
zfs load-key -L prompt zroot
zfs mount zroot/ROOT/gentoo
zfs mount zroot/home
zfs mount zroot/var
# mount | grep zfs
zroot/ROOT/gentoo on /newroot type zfs (rw,relatime,xattr,posixacl,casesensitive)
zroot/home on /newroot/home type zfs (rw,relatime,xattr,posixacl,casesensitive)
zroot/var on /newroot/var type zfs (rw,relatime,xattr,posixacl,casesensitive)
# zfs list -t all
NAME                USED  AVAIL  REFER  MOUNTPOINT
zroot              1.89M  3.39T   192K  none
zroot/ROOT          432K  3.39T   192K  none
zroot/ROOT/gentoo   240K  3.39T   240K  /newroot
zroot/home          192K  3.39T   192K  /newroot/home
zroot/var           192K  3.39T   192K  /newroot/var
udevadm trigger
mdadm --create --verbose /dev/md2 --level=mirror --metadata 1.0 --raid-devices=2 /dev/nvme0n1p1 /dev/nvme2n1p1

mkfs.fat -F32 /dev/md2

mount /dev/md2 /newroot/boot/efi/
cd /newroot
mkdir -p dev media mnt proc run sys tmp root
chmod 777 tmp
chmod 700 root
rsync -avxHAWXS --numeric-ids /{bin,boot,etc,lib*,opt,sbin,usr,var,home} .
zfs set org.zfsbootmenu:keysource="zroot/ROOT/gentoo" zroot
zfs set org.zfsbootmenu:commandline="radeon.audio=1 ixgbe.allow_unsupported_sfp=1" zroot/ROOT
rc-update add zfs-mount boot

zdykstra commented 1 week ago

Is your kernel missing CONFIG_EFI=y ?

SumOys commented 1 week ago

Is your kernel missing CONFIG_EFI=y ?

No. Here is the current running kernel configuration taken directly from /proc/config.gz (and ungzipped).

config.txt
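For readers who want to run the same check without attaching a whole config file, here is a small sketch that reports a handful of EFI-related options from a kernel config. The option list is an assumption about what could matter when a kernel is started from another kernel; it is not taken from the ZFSBootMenu documentation, and the function name check_efi_config is hypothetical.

```shell
# check_efi_config: read a kernel config on stdin and, for each option in
# the list below, print its "OPTION=value" line or "OPTION is not set".
# The option list is an illustrative assumption, not an official checklist.
check_efi_config() {
    config="$(cat)"
    for opt in CONFIG_EFI CONFIG_EFI_STUB CONFIG_EFIVAR_FS CONFIG_EFI_RUNTIME_MAP CONFIG_KEXEC; do
        # Anchor on "OPTION=" so CONFIG_EFI does not also match CONFIG_EFI_STUB.
        line="$(printf '%s\n' "$config" | grep -E "^${opt}=" || true)"
        if [ -n "$line" ]; then
            echo "$line"
        else
            echo "$opt is not set"
        fi
    done
}

# Usage on a running system that provides /proc/config.gz:
#   zcat /proc/config.gz | check_efi_config
```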

ahesford commented 1 week ago

What's the output of

dmesg | grep -i EFI

from your boot environment?

SumOys commented 1 week ago

When booted:

❯ dmesg | grep -i efi
[    0.003088] ACPI: UEFI 0x00000000B383B000 000048 (v01 ALASKA A M I    01072009 AMI  01000013)
[    0.003111] ACPI: Reserving UEFI table memory at [mem 0xb383b000-0xb383b047]
[    0.044085] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.533740] pci 0000:16:00.0: BAR 0: assigned to efifb
[    1.624471] tsc: Refined TSC clocksource calibration: 3699.957 MHz

So yes, I'm booted into legacy mode. From within ZFSBootMenu, it's booting correctly in EFI mode:

image

very weird.
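Note that a case-insensitive grep for "efi" also matches unrelated words such as "refined", which is why the clocksource and TSC lines show up in the output above. A narrower filter, sketched below, keeps only messages that begin with the "efi:" prefix, which a UEFI-booted kernel normally prints early in boot. The function name efi_log_lines is made up for illustration, and the pattern assumes the default dmesg timestamp format shown in this thread.

```shell
# efi_log_lines: read dmesg-style text on stdin and print only the lines
# whose message starts with the "efi:" prefix.
# Assumes the default "[    0.000000] " timestamp format; other dmesg
# output formats (for example human-readable timestamps) will not match.
efi_log_lines() {
    grep -E '^\[[ 0-9.]+\] efi: ' || true
}

# Usage: dmesg | efi_log_lines
```

Empty output from this filter is consistent with the kernel not having initialized its EFI support, which is what the unfiltered output above suggests.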

SumOys commented 1 week ago

Well, very odd. I switched out to the Gentoo pre-configured distribution kernel: 6.6.51-gentoo-dist, and now:

sudo efibootmgr
BootCurrent: 0000
Timeout: 1 seconds
BootOrder: 0000,0001,0002,0003,0004
Boot0000* ZFSBootMenu_0 HD(1,GPT,cbbe0cbe-5797-4253-b4d6-4a8e91df89e1,0x800,0x7a000)/File(\EFI\zbm\VMLINUZ.EFI)
Boot0001* ZFSBootMenu_1 HD(1,GPT,b22081e9-f981-4ff7-89c7-11000eb1e3d4,0x800,0x7a000)/File(\EFI\zbm\VMLINUZ.EFI)
Boot0002* UEFI:CD/DVD Drive     BBS(129,,0x0)
Boot0003* UEFI:Removable Device BBS(130,,0x0)
Boot0004* UEFI:Network Device   BBS(131,,0x0)

I've thought about switching to the dist kernel for a while; my configuration has picked up lots of oddities over the years. Not sure what was going on, but hopefully this helps someone in the future. Feel free to close this.
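For anyone who lands here with the same symptom and wants to narrow down the difference between a custom kernel and the distribution kernel, one option is to compare only the EFI-related lines of the two configs. This is a rough sketch, not a diagnosis of what was wrong in this case; the function name diff_efi_options and both file arguments are hypothetical, and the configs are assumed to be plain (already decompressed) text files.

```shell
# diff_efi_options OLD_CONFIG NEW_CONFIG: print the EFI-related lines that
# differ between two plain-text kernel config files.
# Matches both "CONFIG_..._EFI...=value" and "# CONFIG_...EFI... is not set".
diff_efi_options() {
    old_tmp="$(mktemp)"
    new_tmp="$(mktemp)"
    grep -E 'CONFIG_([A-Z0-9]+_)*EFI([_= ])' "$1" | sort > "$old_tmp" || true
    grep -E 'CONFIG_([A-Z0-9]+_)*EFI([_= ])' "$2" | sort > "$new_tmp" || true
    # diff exits non-zero when the files differ; that is expected here.
    diff "$old_tmp" "$new_tmp" || true
    rm -f "$old_tmp" "$new_tmp"
}

# Usage (file names are placeholders):
#   diff_efi_options custom-kernel.config dist-kernel.config
```

Lines prefixed with "<" come from the first file and lines prefixed with ">" from the second.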