gregory-lee-bartholomew / fedora-on-zfs

A script for automating the installation of Fedora Linux on a ZFS filesystem
GNU General Public License v3.0

cannot open 'root/0': dataset does not exist #3

Open adwarin opened 1 month ago

adwarin commented 1 month ago

Hello,

First of all, I want to thank you for the work you've done here. Your script made installing Fedora in a ZFS-on-root configuration super easy!

I'm curious about what I should look for to know if ZFS isn't compatible with a particular kernel version.

For example:

After checking here to confirm that kernel version 6.10.9-200 would be supported by ZFS version 2.2.6 (6.10 kernels are listed as supported), I used the kernel-update script to install kernel version 6.10.9-200. After rebooting to test the new kernel, I'm getting error messages to the effect of "cannot open 'root/0': dataset does not exist".

My machine boots successfully if I choose an earlier kernel version, so I was thinking this likely means the latest version of ZFS isn't compatible with the chosen kernel. Do you think that error is likely indicative of a ZFS/kernel compatibility issue?

Alternatively, do you know if there is an error I should be looking for which would indicate an incompatibility?

Thanks!

gregory-lee-bartholomew commented 1 month ago

I don't think it is a compatibility issue with the kernel. You did exactly what you were supposed to do by checking that website before updating the kernel. The problem is more likely something to do with a script or a configuration file somewhere.

To troubleshoot this sort of issue, I would try adding rd.break to the list of kernel parameters (I usually remove quiet and rhgb while I'm at it, but that is not required). When the root filesystem fails to mount, it should drop you to the dracut emergency shell and from there you should be able to run commands like zpool list or zfs list to inspect the storage. systemctl --failed might tell you what startup script has failed and then systemctl status <some-service-name> might give you further details about why it failed.
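
In other words, once you're in the emergency shell, I'd poke around with something like the following (zfs-import-scan.service is just a placeholder; substitute whatever unit systemctl --failed actually reports):

# zpool list
# zfs list
# systemctl --failed
# systemctl status zfs-import-scan.service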

The "dataset does not exist" error sounds like maybe you renamed the root filesystem? If so, that is OK, you probably just neglected to update /etc/kernel/cmdline to match the new filesystem name before you updated to a new kernel.

I'm happy to help you troubleshoot this issue, just let me know what you find. 🙂

Edit:

My machine boots successfully if I choose an earlier kernel version, so I was thinking this likely means the latest version of ZFS isn't compatible with the chosen kernel. Do you think that error is likely indicative of a ZFS/kernel compatibility issue?

I just updated my PC to kernel 6.10.9-200.fc40.x86_64 and it has booted successfully. However, I have a few non-standard startup scripts configured on my PC, so just because my system boots doesn't necessarily mean a more standard configuration would. I'll share the additional configuration that I use if it turns out that it is needed.

Edit2:

I've updated another system that I use for testing to 6.10.9-200.fc40.x86_64 and that update also went off without a hitch. My test system is running a completely standard Fedora minimal installation.

adwarin commented 1 month ago

I don't think I renamed my boot pool, but I might be reading this wrong:

zfs list

I checked systemctl, but unfortunately it doesn't seem to have much additional data:

Screenshot 1

Screenshot 2

Interestingly, an unrelated zpool shows up when I check zpool list, and I can manually import the root pool from the emergency shell.

Here is a more detailed log from the rdsosreport. I'm not super knowledgeable about ZFS, but I don't see anything major failing before the point where the root pool fails to be located: Detailed Boot Log

gregory-lee-bartholomew commented 1 month ago

ZFS has a couple of services that attempt to find and import pools on system startup. They are zfs-import-cache.service and zfs-import-scan.service. The former attempts to keep track of what pools were imported at some earlier point in time and re-import those same pools on system startup. The latter imports all pools as a fallback option if /etc/zfs/zpool.cache doesn't exist or is empty. Personally, I don't like either of the default ZFS pool import strategies.
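
(If you're curious how these are set up on your machine, something like the following should show which of the two units is enabled and what is currently recorded in the cache file; zdb -C -U simply prints the pool configuration stored in the given cache file.)

# systemctl list-unit-files 'zfs-import-*'
# zdb -C -U /etc/zfs/zpool.cache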

A problem with the zfs-import-cache service is that the zpool.cache file it uses can be out-of-date in a root on ZFS configuration. When running root on ZFS, the scripts have to use the zpool.cache file that is in the initramfs archive, but when that was last updated depends on when the user last updated their kernel (or ran dracut -f).
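
For example, you can check whether (and which) zpool.cache got baked into a particular initramfs image with lsinitrd (the path below assumes the same boot layout as the dracut command further down):

# lsinitrd /boot/$(</etc/machine-id)/6.10.9-200.fc40.x86_64/initrd | grep zpool.cache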

A problem with the zfs-import-scan service is that it can import the wrong pool (e.g. if the user has other partitions on their system containing valid ZFS pools that they use, for example, to run other OS instances in virtual machines).
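
(Running zpool import with no pool name doesn't import anything; it just scans the available devices and lists every pool it finds, which is a quick way to see what the scan service could latch onto.)

# zpool import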

For these reasons, I prefer to disable (mask) zfs-import-cache.service and override zfs-import-scan.service with a custom import command that explicitly imports only the pool that contains my system's root filesystem (identified by its partition UUIDs).

It looks like you might be hitting one of these problems with the default ZFS import scripts. If you want to try the import strategy I prefer, here are some instructions.

  1. Mask zfs-import-cache.service by running systemctl mask zfs-import-cache.service.
  2. Create a /etc/systemd/system/zfs-import-scan.service.d/override.conf file with contents similar to the following.
# cat /etc/systemd/system/zfs-import-scan.service.d/override.conf
[Unit]
ConditionFileNotEmpty=

[Service]
ExecStart=
ExecStart=/sbin/zpool import -f -N -o cachefile=none -d /dev/disk/by-partuuid/a3c52c80-93b0-41cc-85c9-3ea0cb013503 -d /dev/disk/by-partuuid/b1426d54-b728-440b-9651-5c83f13c48e6 root $ZPOOL_IMPORT_OPTS

You will have to change the partition UUIDs in the override.conf file to match the ones on your PC. One way you should be able to find the correct UUIDs is by running zpool list -v root.

# zpool list -v root
NAME                                       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
root                                       110G  66.5G  43.5G        -         -    22%    60%  1.00x    ONLINE  -
  mirror-0                                 110G  66.5G  43.5G        -         -    22%  60.5%      -    ONLINE
    b1426d54-b728-440b-9651-5c83f13c48e6   111G      -      -        -         -      -      -      -    ONLINE
    a3c52c80-93b0-41cc-85c9-3ea0cb013503   111G      -      -        -         -      -      -      -    ONLINE

The partuuids can also be found with commands like lsblk --filter 'TYPE == "part" && FSTYPE =~ "zfs*"' -o label,name,partuuid or ls -al /dev/disk/by-partuuid.
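
After you've filled in the UUIDs, you can double-check that systemd sees the drop-in on the running system with systemctl cat (run systemctl daemon-reload first so the override is picked up); the last ExecStart shown should be your custom import command.

# systemctl daemon-reload
# systemctl cat zfs-import-scan.service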

Once you've masked zfs-import-cache.service and overridden zfs-import-scan.service, you will need to regenerate one of your initramfs images with dracut to test it. Do not use dracut's --regenerate-all option! If you do, and there is an error in the configuration of the zfs-import-scan service, none of your boot menu options will work anymore and you will be locked out of your system. Instead, use a command like the following to regenerate only one specific initramfs image and leave the others untouched so you can fall back to them if you need to.

# dracut -f /boot/$(</etc/machine-id)/6.10.9-200.fc40.x86_64/initrd 6.10.9-200.fc40.x86_64

You can use a command like lsinitrd /boot/$(</etc/machine-id)/6.10.9-200.fc40.x86_64/initrd | grep zfs-import-scan.service.d to verify that the initramfs image was successfully regenerated with your override script.

Let me know if this workaround resolves your problem or if you would prefer to try something else.

adwarin commented 1 month ago

Okay, I ran the above and feel like we're getting closer.

I masked my zfs-import-cache.service.

Then, I got my root pool partition UUIDs (screenshot).

After that, I created an override file with the UUIDs from above:

# cat /etc/systemd/system/zfs-import-scan.service.d/override.conf
[Unit]
ConditionFileNotEmpty=

[Service]
ExecStart=
ExecStart=/sbin/zpool import -f -N -o cachefile=none -d /dev/disk/by-partuuid/4e48f362-2047-4d64-86ac-ff91bc295940 -d /dev/disk/by-partuuid/aff95d6c-f3d4-4ed2-a23c-71e5873a1f70 root $ZPOOL_IMPORT_OPTS

I ran the command to regen the relevant initramfs:

# dracut -f /boot/$(</etc/machine-id)/6.10.9-200.fc40.x86_64/initrd 6.10.9-200.fc40.x86_64

Unfortunately, the boot doesn't succeed, but I do see zfs-import-scan running (screenshot).

I appreciate your patience and the detailed write-up!

Do you think the dracut command you supplied would work with an unmasked zfs-import-cache.service if I ran it manually after each kernel update?

gregory-lee-bartholomew commented 1 month ago

Do you think the dracut command you supplied would work with an unmasked zfs-import-cache.service if I ran it manually after each kernel update?

Running the dracut command manually after kernel updates is unlikely to help. You don't want to have both that customized zfs-import-scan.service and an unmasked zfs-import-cache.service in your initramfs because they would both attempt to run and that is not how the system is designed to work (it should only attempt to import the root pool once).
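
If you ever want to confirm which import units (and drop-ins) actually ended up inside a given initramfs image, lsinitrd can show you (the path below assumes the same layout as before):

# lsinitrd /boot/$(</etc/machine-id)/6.10.9-200.fc40.x86_64/initrd | grep -E 'zfs-import-(cache|scan)'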

Unfortunately, the boot doesn't succeed but I do see the zfs-import-scan running:

Actually, the screenshot you provided appears to show that zfs-import-scan finished, as did sysroot.mount. The last message I see is "Warning: Break before switch_root". Did you leave rd.break set on the kernel command line? If so, remove it and I think everything should be working now.
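
(If you're not sure where rd.break got set, comparing the live command line with the persistent one should narrow it down; /proc/cmdline shows what the current boot actually used, while /etc/kernel/cmdline is what gets picked up for new kernel entries.)

# cat /proc/cmdline
# cat /etc/kernel/cmdline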

adwarin commented 1 month ago

That did it, thank you!

gregory-lee-bartholomew commented 1 month ago

If you don't mind, I'd like to leave this issue open for a while in case it turns out to be a "common issue". I might need to add a little code to the installation scripts to configure the system this way by default if it turns out that people are hitting this problem.
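
If I do add that code, it would probably be a small shell snippet along these lines (an untested sketch, assuming the pool is named root and that zpool list -v reports the member partition UUIDs the way it does in the example above):

#!/bin/sh
# Untested sketch: have the installed system import only the root pool,
# identified by the partition UUIDs of its member devices.
set -e

# Stop the cache-based importer from running at all.
systemctl mask zfs-import-cache.service

# Build a -d option for every partition UUID backing the "root" pool.
opts=''
for id in $(zpool list -v -H -o name root | grep -oE '[0-9a-f-]{36}'); do
    opts="$opts -d /dev/disk/by-partuuid/$id"
done

# Override zfs-import-scan.service with an explicit import command.
mkdir -p /etc/systemd/system/zfs-import-scan.service.d
cat > /etc/systemd/system/zfs-import-scan.service.d/override.conf <<EOF
[Unit]
ConditionFileNotEmpty=

[Service]
ExecStart=
ExecStart=/sbin/zpool import -f -N -o cachefile=none$opts root \$ZPOOL_IMPORT_OPTS
EOF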

Fedora on ZFS users: Let me know with comments to this issue report if you are hitting this problem and I need to revise the installation script to have the installed system force-import the root pool by its constituent partition UUIDs.