
zfs-mount.service is called too late on Debian/Jessie with ZFS root #4474

Closed: arcenik closed this issue 4 years ago

arcenik commented 8 years ago

problem

While working on a script to install Debian on a ZFS root, I've encountered a bug. The script is here: https://github.com/arcenik/debian-zfs-root .

On a Debian Jessie installation with a ZFS root, zfs-mount.service is started too late; as a result, files are created in /var before rpool/var is mounted, which prevents it from mounting properly (ZFS refuses to mount over a non-empty directory unless overlay is enabled).

error

As the screenshot below shows, /var is not mounted, and some content such as /var/lib is missing.

[screenshot: mounted-zfs]
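
For reference, roughly the same state can be checked from a shell; this is only a sketch, using the rpool layout described above:

# list datasets with their configured mountpoints and whether they are actually mounted
zfs list -r -o name,mountpoint,mounted rpool
# show what, if anything, is currently mounted on /var
findmnt /var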

proposed solution

Here is a working solution: mount the rest of the ZFS volumes while the root filesystem is still read-only.

--- zfs-mount.service.orig
+++ zfs-mount.service
@@ -8,6 +8,7 @@
 After=zfs-import-cache.service
 After=zfs-import-scan.service
 Before=local-fs.target
+Before=systemd-remount-fs.service
 
 [Service]
 Type=oneshot
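
As an aside, the same ordering can also be added locally without patching the packaged unit, via a systemd drop-in; this is only a sketch, and the drop-in file name is arbitrary:

# /etc/systemd/system/zfs-mount.service.d/remount-order.conf
# Drop-in that orders zfs-mount.service before the root remount,
# leaving the unit shipped by the package untouched.
[Unit]
Before=systemd-remount-fs.service

After creating the file, run systemctl daemon-reload so systemd picks it up.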

FransUrbo commented 8 years ago

LGTM. I'll include this in the upcoming 0.6.5.6 package. Thanks.

behlendorf commented 8 years ago

@arcenik what might the side effects be of always running this before systemd-remount-fs.service? I can see why it's desirable in the root fs case, but what about when you're not using ZFS as your root fs?

Here's a link to the documentation for this service for reference:

https://www.freedesktop.org/software/systemd/man/systemd-remount-fs.service.html

arcenik commented 8 years ago

A first side effect of this solution is that zfs-mount.service cannot write anything to the filesystem. I don't think this is a problem, though, because errors, warnings and other messages should be sent to the logging system rather than written to the / filesystem.

A second side effect could be to reverse the problem: preventing the proper mount of /data if a ZFS volume is mounted at /data/zfs (for example).

A nicer solution could be to have the zpool imported before systemd-remount-fs and the ZFS volumes (or the zpool?) declared in /etc/fstab.

tuxoko commented 8 years ago

We really should make ZFS work in /etc/fstab. That is the only way it can work when the system mixes ZFS with other filesystems.

behlendorf commented 8 years ago

@tuxoko agreed. There was some discussion in #4178 about how to do this by leveraging some of the infrastructure added to systemd for btrfs, which has a similar issue.

tuxoko commented 8 years ago

@behlendorf There's one important difference between ZFS and BTRFS in this regard. BTRFS doesn't need to do a "pool import"; you mount each subvolume directly as soon as the disks are ready.

Perhaps we would want a zpool.target that looks into fstab and knows which pools we need to import, rather than the current "import all pools we've scanned" behaviour.

behlendorf commented 8 years ago

Right, there are definitely significant differences which will need to be carefully worked through. I didn't mean to imply it's going to be a 1-1 mapping, but it may provide some useful functionality for us to leverage.

arcenik commented 8 years ago

@tuxoko I think that adding a dependency from /etc/fstab to the zpool is not a good idea. Another way would be an autoimport property on each zpool (enabled by default, and on pools created with older versions).

arcenik commented 8 years ago

Here are some examples for /etc/fstab.

For a ZFS root:

LABEL=BOOT       /boot      ext4         defaults        0 0
LABEL=SWAP       none       swap         defaults        0 0
rpool            /          zfs          defaults        0 0

Mounting zpools and ZFS volumes:

LABEL=ROOT       /                  ext4   defaults        0 0
LABEL=BOOT       /boot              ext4   defaults        0 0
LABEL=SWAP       none               swap   defaults        0 0
LABEL=DATA       /data              ext4   defaults        0 0
pool1            /data/pool1        zfs    defaults        0 0
pool2            /data/pool2        zfs    defaults        0 0
pool3/vol1       /data/pool3/vol1   zfs    defaults        0 0
pool3/vol2       /data/pool3/vol2   zfs    defaults        0 0
pool3/vol3       /data/pool3/vol3   zfs    defaults        0 0

Of course, when a zpool is specified, all of its volumes are mounted. To mount only the root volume of a zpool, you could use myzpool/ instead.

azeemism commented 8 years ago

@arcenik, isn't /usr also missing in your picture?

@behlendorf, doesn't the use of the overlay feature solve this?

...Or does the use of the overlay function create other problems?

On my VirtualBox ZFS "/" test using the overlay=on feature and ZoL release 0.6.5.2-2 for Debian Jessie: [screenshot]

/usr, /var and /var/lib are mounted/available: [screenshot]

Not sure if rpool=rpool will create a problem? [screenshot]

/ is commented out in fstab, and /boot and /boot/efi are not on ZFS: [screenshot]

@behlendorf would it be safer/better to always use /etc/fstab for zfs mount points in addition to the overlay feature?
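
For reference, the overlay behaviour discussed here is a per-dataset property; a minimal illustration, using the rpool/var dataset from the original report:

# allow the dataset to be mounted even if the target directory already contains files
zfs set overlay=on rpool/var
# verify the setting
zfs get overlay rpool/var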

arcenik commented 8 years ago

@azeemism /usr is not missing, it is simply not mounted because of canmount=off. Furthermore, you can see that it is empty.

mklein9 commented 8 years ago

I am a newbie at this, but this particular change to zfs-mount.service in 0.6.5.6 appears to cause a systemd ordering cycle during boot, at least when combined with encrypted swap volumes whose keys are randomly generated and which are set up through /etc/fstab. I just updated ZFS on Debian Jessie to 0.6.5.6, and the boot log shows:

Apr 11 21:52:09 host systemd[1]: Found ordering cycle on systemd-remount-fs.service/start
Apr 11 21:52:09 host systemd[1]: Found dependency on zfs-mount.service/start
Apr 11 21:52:09 host systemd[1]: Found dependency on zfs-import-scan.service/start
Apr 11 21:52:09 host systemd[1]: Found dependency on cryptsetup.target/start
Apr 11 21:52:09 host systemd[1]: Found dependency on systemd-cryptsetup@cryptswap1.service/start
Apr 11 21:52:09 host systemd[1]: Found dependency on systemd-random-seed.service/start
Apr 11 21:52:09 host systemd[1]: Found dependency on systemd-remount-fs.service/start
Apr 11 21:52:09 host systemd[1]: Breaking ordering cycle by deleting job zfs-mount.service/start
Apr 11 21:52:09 host systemd[1]: Job zfs-mount.service/start deleted to break ordering cycle starting with systemd-remount-fs.service/start

/etc/crypttab line for the swap volume:

cryptswap1      /dev/disk/by-id/<swap_vol_id>        /dev/urandom    swap,cipher=aes-cbc-essiv:sha256,hash=ripemd160,size=256

The result is that zfs-mount.service is not started during boot, and must be manually started later.

I have confirmed that removing Before=systemd-remount-fs.service from zfs-mount.service eliminates the ordering cycle and allows a clean boot.

aktau commented 8 years ago

Yep, it indeed creates an ordering cycle.

Depending on which unit systemd deletes to break the cycle, I get all sorts of wacky results on the system, from it not auto-logging in (Kodi not starting) to the ZFS volumes not being mounted.

I built a tool to analyze the cycles better: https://github.com/aktau/findcycles.

With this, I reduced the dependency graph to just the cycles using the following commands:

$ systemd-analyze verify default.target |&
  perl -lne 'print $1 if m{Found.*?on\s+([^/]+)}' |
  xargs --no-run-if-empty systemd-analyze dot > remount-cycle.dot
$ dot -Tsvg < remount-cycle.dot > remount-cycle.svg
$ <remount-cycle.dot grep -P 'digraph|\}|green' | findcycles | dot -Tsvg > cycles.svg
$ <remount-cycle.dot findcycles | dot -Tsvg > cycles2.svg

The results: the most concise one is probably cycles.svg. Curious, I printed out the units that looked suspicious:

$ systemctl cat systemd-remount-fs zfs-mount
# /lib/systemd/system/systemd-remount-fs.service
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Remount Root and Kernel File Systems
Documentation=man:systemd-remount-fs.service(8)
Documentation=http://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
DefaultDependencies=no
Conflicts=shutdown.target
After=systemd-fsck-root.service
Before=local-fs-pre.target local-fs.target shutdown.target
Wants=local-fs-pre.target
ConditionPathExists=/etc/fstab

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/lib/systemd/systemd-remount-fs

# /lib/systemd/system/zfs-mount.service
[Unit]
Description=Mount ZFS filesystems
DefaultDependencies=no
Wants=zfs-import-cache.service
Wants=zfs-import-scan.service
Requires=systemd-udev-settle.service
After=systemd-udev-settle.service
After=zfs-import-cache.service
After=zfs-import-scan.service
Before=local-fs.target
Before=systemd-remount-fs.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zfs mount -a

So that made me look for systemd-remount-fs.service, which led me to this thread. It would seem that adding this line provokes a bug on some systems. My system is an up-to-date Debian Stretch (9/testing).

I don't have any encrypted devices, just a regular SSD on which I have an ext4 and a zfs partition, and an external drive which is entirely zfs.

If any more info is needed, I'll be glad to supply it.
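
For reference, the effective ordering constraints of the two units involved can also be dumped directly, which may help when comparing affected systems; a small sketch:

$ systemctl show -p Before -p After zfs-mount.service
$ systemctl show -p Before -p After systemd-remount-fs.service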

ggzengel commented 8 years ago

I think it's not good to force ZFS as the root fs and leave every other setup behind.

The normal Debian dependencies are (https://github.com/zfsonlinux/pkg-zfs/issues/205):

zfs-import-scan.service -> cryptsetup.target -> systemd-cryptsetup@VG1\x2dswap_crypt.service -> systemd-random-seed.service -> systemd-remount-fs.service

Now you force:

systemd-remount-fs.service -> zfs-mount.service -> zfs-import-scan.service

After each update I have to edit zfs-mount.service to get a usable system.

Perhaps you would have to change zfs-import-scan.service instead, which could break other systems.

It's like @behlendorf said: it's not easy to cover all possibilities. You would have to create multiple targets (pre and post) with the same function to support every combination of dependencies.
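
As a side note, a local change like removing that line does not have to be redone after every update: a full copy of the unit in /etc/systemd/system overrides the packaged one in /lib/systemd/system (a drop-in is not enough here, since ordering entries such as Before= can only be added, not removed, by drop-ins). A rough sketch:

# the copy in /etc/systemd/system takes precedence over the packaged unit
cp /lib/systemd/system/zfs-mount.service /etc/systemd/system/zfs-mount.service
# drop the problematic ordering from the local copy
sed -i '/^Before=systemd-remount-fs.service$/d' /etc/systemd/system/zfs-mount.service
systemctl daemon-reload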

FransUrbo commented 8 years ago

See https://github.com/zfsonlinux/pkg-zfs/issues/205

megajocke commented 7 years ago

I am having a similar issue where /var/spool/ is clobbered before zfs-mount.service runs and mounts it. In my case I think it is a bug in the Debian mdmonitor.service (I'm using md-raid for /boot), which disables default dependencies without declaring any ordering of its own, even though it calls exim4, which recreates its files under /var/spool/.
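
If that diagnosis is right, one possible workaround would be a drop-in that orders the offending unit after the ZFS mounts; this is only a sketch, and the drop-in file name is arbitrary:

# /etc/systemd/system/mdmonitor.service.d/after-zfs-mount.conf
# Ensure /var/spool is mounted before mdmonitor (and anything it spawns) runs.
[Unit]
After=zfs-mount.service

A systemctl daemon-reload afterwards makes systemd pick it up.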

rlaager commented 4 years ago

This should be resolved. The Root-on-ZFS HOWTO includes a workaround, and with 0.8.x's mount generator this is solved correctly. I'm going to close this. If this is still an issue for someone, try the workaround of setting mountpoint=legacy on the affected datasets and putting them in /etc/fstab. If that's not sufficient, please comment here. You should be able to comment on closed bugs; if not, email me directly (rlaager@wiktel.com) and I'll re-open.
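
A minimal sketch of that workaround, assuming the affected dataset is the rpool/var from the original report:

# hand the dataset over to the ordinary mount machinery instead of the ZFS automounter
zfs set mountpoint=legacy rpool/var

# and add a matching line to /etc/fstab:
rpool/var    /var    zfs    defaults    0 0

# after which the dataset mounts like any other filesystem
mount /var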