zfsonlinux / grub

GRUB enhancements for ZFS on Linux

grub.cfg incorrectly includes a / after zpool name and halts system on boot #25

Open almereyda opened 6 years ago

almereyda commented 6 years ago

System information

Attribute              Value
Distribution Name      Ubuntu
Distribution Version   16.04.4
Linux Kernel           4.4.0-97-generic
ZFS Version            0.6.5.6-0ubuntu18

Describe the problem you're observing

When installing from an Ubuntu 16.04.3 installimage at Hetzner, installing a ZFS-capable GRUB from a chroot environment and rebooting into the new system works as expected, following the recipes in Ubuntu 16.04 root on ZFS, informed by Ubuntu on ZFS root on Hetzner Server.

If the system is upgraded immediately after setup with apt update && apt upgrade -y, before zfs-dkms and zfs-initramfs are installed, the resulting GRUB configuration file does not contain the right pool identifier to continue booting.

Describe how to reproduce the problem

  1. Install Ubuntu 16.04.3 via a Hetzner installimage.
  2. Update it immediately via apt update && apt upgrade -y
  3. Move the system from the installation on e.g. ext4 to a ZFS partition and install GRUB to the respective disk's bios_grub-flagged 1M partition via a chroot environment. Verify that you are in the correct system with grub-probe /.
  4. Reboot and observe the system getting stuck while loading the zpool in the initramfs.

Include any warning/errors/backtraces from the system logs

Message: filesystem 'rpool/' cannot be mounted, unable to open the dataset

The system tries to mount rpool/, including the trailing slash, which cannot be found. Manually removing the trailing / from the linux lines in grub.cfg after running update-grub in the chroot, and only then installing GRUB, worked around the issue.

Trying root=ZFS=rpool/ROOT/ubuntu did not resolve it.
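
To check whether a generated grub.cfg is affected, one can look for root=ZFS= arguments that end in a slash. The following is only a minimal sketch: the function name find_trailing_slash_roots is hypothetical, and it assumes a grep that supports -E and POSIX character classes.

```shell
# Hypothetical helper (not part of GRUB or ZFS): print every line of a GRUB
# config whose root=ZFS= argument ends in a slash, i.e. the slash is
# followed by whitespace or the end of the line.
find_trailing_slash_roots() {
    cfg="$1"
    # -n prefixes each match with its line number; grep exits non-zero
    # when no affected line is found.
    grep -nE 'root=ZFS=[^[:space:]]*/([[:space:]]|$)' "$cfg"
}
```

It could be called as find_trailing_slash_roots /boot/grub/grub.cfg from within the chroot. An entry such as root=ZFS=rpool/ROOT/ubuntu is not reported, because its final slash is followed by more of the dataset path.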

Possibly (in)directly related to

fejesjoco commented 6 years ago

The bug is probably in GRUB. I reported it, but there has been no response in 3 months: http://savannah.gnu.org/bugs/?52746

almereyda commented 6 years ago

This was due to a (misconfigured?) system that resides directly in the rpool's (implicit?) root dataset. It surfaced again when attempting to reboot the machine.

# zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
rpool              1.94G  1.90G  1.87G  /
rpool/ROOT          192K  1.90G    96K  none
rpool/ROOT/ubuntu    96K  1.90G    96K  /

It resulted in an initramfs showing

Begin: Setting mountpoint=/ on ZFS filesystem rpool/ ... done.
Begin: Mounting ZFS filesystem rpool/ ... done.
Command: mount -t zfs -o zfsutil rpool/ /root
Message: filesystem 'rpool/' cannot be mounted, unable to open the dataset
mount: mounting rpool/ on /root failed: No such file or directory
Error: 1

Manually mount the root filesystem on /root and then exit

There seems to be one / too many in the mount rule at system startup. The following allowed the system to boot:

mount -t zfs -o zfsutil rpool /root
exit

Everything is stored in, and mounted directly from, the pool's root dataset. This is visible in /boot/grub/grub.cfg as

        linux   /@/boot/vmlinuz-4.4.0-130-generic root=ZFS=rpool/ ro  nomodeset net.ifnames=0

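/etc/grub.d/10_linux contains the rule. It builds the value as LINUX_ROOT_DEVICE="ZFS=${rpool}${bootfs}", expecting ${bootfs} to name a dataset below the pool that holds the final mount point. When the system lives in the pool's root dataset, ${bootfs} presumably reduces to a bare /, which yields rpool/.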
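
The concatenation and one possible guard can be sketched as follows. This is an illustration only, not the actual 10_linux code: the function name build_root_device is hypothetical, and the idea that bootfs is exactly / in the failing case is inferred from the resulting rpool/ in grub.cfg.

```shell
# Hypothetical helper mirroring LINUX_ROOT_DEVICE="ZFS=${rpool}${bootfs}",
# but dropping a single trailing slash from bootfs before concatenating.
build_root_device() {
    rpool="$1"
    bootfs="$2"
    # ${bootfs%/} removes one trailing "/" if present, so "/" becomes ""
    # while "/ROOT/ubuntu" is left unchanged.
    printf 'ZFS=%s%s\n' "$rpool" "${bootfs%/}"
}

build_root_device rpool /             # prints ZFS=rpool
build_root_device rpool /ROOT/ubuntu  # prints ZFS=rpool/ROOT/ubuntu
```

The %/ expansion is plain POSIX shell, so a guard of this kind would not need any additional tools inside the template.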

A monkey patch can be applied as follows:

sed -i 's|rpool/|rpool|g' /boot/grub/grub.cfg
update-initramfs -u -k all
grub-install /dev/disk/by-id/ata-...
grub-install /dev/disk/by-id/ata-...
reboot
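
Note that the pattern rpool/ also matches entries such as rpool/ROOT/ubuntu. A narrower substitution that only touches a slash directly after a bare pool name could look like the sketch below. The function name strip_pool_trailing_slash is hypothetical, and a sed that supports -E is assumed.

```shell
# Hypothetical filter: reads a GRUB config on stdin and writes the patched
# text to stdout. The slash is removed only when it directly follows a bare
# pool name (root=ZFS=<pool>/) and is followed by whitespace or end of line;
# dataset paths such as rpool/ROOT/ubuntu are left untouched.
strip_pool_trailing_slash() {
    sed -E 's#(root=ZFS=[^[:space:]/]+)/([[:space:]]|$)#\1\2#g'
}
```

For example, strip_pool_trailing_slash < /boot/grub/grub.cfg > /tmp/grub.cfg.new writes a patched copy that can be reviewed with diff before it replaces the original.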

Indeed, rewriting the 10_linux template to handle the case where the pool's root dataset is itself the root filesystem could help prevent this error on further invocations of update-grub && update-initramfs && grub-install.

And yes, it is very much the case you are reporting in the upstream issue.