openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

zfs-mount tries to run exportfs before nfs is set up #14149

Open beren12 opened 2 years ago

beren12 commented 2 years ago

System information

Type linux
Distribution Name Debian
Distribution Version 11
Kernel Version 5.10.140
Architecture amd64
OpenZFS Version 2.1.5

Describe the problem you're observing

There is a 300 s pause during boot when zfs-mount runs.

Describe how to reproduce the problem

Add a few ZFS shares to /etc/exports, then reboot.

Include any warning/errors/backtraces from the system logs

I verified the pause was in zfs-mount by adding set -e to the script. I then checked the debug log in /sys; nothing was going on:

2022-10-29 13:29:18   metaslab.c:2436:metaslab_load_impl(): metaslab_load: txg 47224769, spa oceantank, vdev_id 0, ms_id 399, smp_length 14824, unflushed_allocs 434692096, unflushed_frees 204800, freed 0, defer 0 + 0, unloaded time 439258 ms, loading_time 2 ms, ms_max_size 15930908672, max size error 15930703872, old_weight 840000000000001, new_weight 840000000000001

I added strace and found it. See the attached files:

     0.000115 sendmmsg(5, [{msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\242\355\1\0\0\1\0\0\0\0\0\0\4dale\0\0\1\0\1", iov_len=22}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=22}, {msg_hdr={msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\277\364\1\0\0\1\0\0\0\0\0\0\4dale\0\0\34\0\1", iov_len=22}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, msg_len=22}], 2, MSG_NOSIGNAL) = 2
     0.000176 poll([{fd=5, events=POLLIN}], 1, 5000) = 0 (Timeout)
     5.005290 close(5)                  = 0
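
The excerpt above looks like `strace -r` output, where each line starts with the time elapsed since the previous call began, so the 5.005290 on `close(5)` is the time spent in the preceding `poll`. A minimal sketch for finding such gaps in a longer log follows. It assumes that format with no PID prefix; `slow_calls` is a hypothetical helper name, not an strace or OpenZFS tool.

```shell
#!/bin/sh
# slow_calls LOG [THRESHOLD_SECONDS]
# Hypothetical helper: scan an `strace -r` style log and print every call
# that was followed by a gap of at least THRESHOLD seconds (default 1).
# Assumes the relative timestamp is the first field (no PID column from -f).
slow_calls() {
    awk -v limit="${2:-1}" '
        # The first field is the time since the previous call began,
        # so a large value is blamed on the previous line.
        $1 + 0 >= limit && prev != "" { printf "%s s after: %s\n", $1, prev }
        # Remember this call without its leading timestamp.
        { prev = $0; sub(/^[ \t]*[0-9.]+[ \t]+/, "", prev) }
    ' "$1"
}
```

Running `slow_calls zfs-mount.strace.txt 1` (the file name is only a placeholder) would list each call that stalled for a second or more, which here points at the DNS `poll` timeouts.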

Part of this is due to DNS lookups failing so early in boot. Can zfs please not call exportfs before the NFS service itself is running? More output from zfs mount -v would also help, such as printing "Updating nfs exports" when it reaches that step.

zfs-mount.1667657789.14245.txt
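
The behavior requested above could be sketched as a guard in shell. This is only an illustration of the idea, not how OpenZFS calls exportfs: the function names are hypothetical, and the unit name `nfs-server.service` is an assumption that may differ between distributions.

```shell
#!/bin/sh
# Hypothetical guard: refresh NFS exports only when the NFS server is running.
# NFS_UNIT and both function names are illustrative, not part of OpenZFS.
NFS_UNIT="${NFS_UNIT:-nfs-server.service}"

nfs_server_active() {
    # `systemctl is-active --quiet` exits 0 only when the unit is active.
    command -v systemctl >/dev/null 2>&1 &&
        systemctl is-active --quiet "$NFS_UNIT"
}

update_exports_if_ready() {
    if nfs_server_active; then
        # Announce the step so a slow exportfs run is visible in the boot log.
        echo "Updating nfs exports"
        exportfs -ra
    else
        echo "Skipping nfs exports: $NFS_UNIT is not active"
    fi
}
```

Calling `update_exports_if_ready` early in boot would then print the skip message instead of blocking on hostname lookups; the exports would still have to be refreshed once the NFS server starts.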

rincebrain commented 2 years ago

Worth noting that, IIUC, 5e7a2f4665b5be32dab9c183e6fdb94e1f434b70 is what results in this, so either this should get fixed or that should be reverted before a release if we care about breaking people using that script.

beren12 commented 2 years ago

While I don't disagree, I lean more towards it getting fixed. Maybe zfs-mount should declare a networking dependency in the init/systemd scripts, or be changed so it does not call a networking binary until after the network is set up.

I only recently started using NFS, so this was never an issue, even after https://github.com/openzfs/zfs/commit/5e7a2f4665b5be32dab9c183e6fdb94e1f434b70 was added, until I populated /etc/exports with a few shares and hostnames. It's a DNS resolution timeout issue, because networking is set up after zfs-mount runs.
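
To see which /etc/exports entries depend on name resolution, a rough check like the one below can help. It is a sketch under simplifying assumptions: one export per line, no backslash continuations, no inline comments, no quoted paths. Both function names are hypothetical helpers, not nfs-utils commands.

```shell
#!/bin/sh
# list_export_clients [FILE]
# Hypothetical helper: print the unique client specs (hostnames, addresses,
# wildcards) named in an exports file. Defaults to /etc/exports.
list_export_clients() {
    awk '
        /^[ \t]*#/ { next }                # skip comment lines
        NF < 2     { next }                # skip blank lines and bare paths
        {
            for (i = 2; i <= NF; i++) {
                client = $i
                sub(/\(.*$/, "", client)   # drop the "(options)" suffix
                if (client != "") print client
            }
        }
    ' "${1:-/etc/exports}" | sort -u
}

# check_export_clients [FILE]
# Report plain hostnames that do not resolve at the moment of the call.
check_export_clients() {
    list_export_clients "$1" | while read -r client; do
        case "$client" in
            *[*?/]*|@*) continue ;;        # wildcards, networks, netgroups
        esac
        getent hosts "$client" >/dev/null 2>&1 || echo "unresolved: $client"
    done
}
```

Run from an early-boot hook, `check_export_clients` would flag names such as the `dale` host queried in the strace excerpt if they cannot be resolved yet. Listing such hosts in /etc/hosts, or exporting to addresses instead of names, are common ways to avoid depending on DNS at that point.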

rincebrain commented 2 years ago

I absolutely agree it should be fixed. I'm just remarking that one of those two should happen before a release, and I don't know how long it will be until the next release or until this gets fixed.

Call it an explicit dependency declaration. ;)

beren12 commented 2 years ago

It's also not documented in the man page that exportfs is called by zfs mount, which should likely be added as well.

beren12 commented 2 years ago

Worth noting that, IIUC, 5e7a2f4 is what results in this, so either this should get fixed or that should be reverted before a release if we care about breaking people using that script.

That is only in sysvinit; the problem exists in systemd as well, since it has no networking/NFS dependency there either. Whether or not anyone has noticed it, I have no idea.