openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

zfs 2.0.1 mount and unload key say the dataset is busy #11446

Open xgiovio opened 3 years ago

xgiovio commented 3 years ago

System information

Type | Version/Name
Centos | 7.9
Distribution Name |
Distribution Version |
Linux Kernel |
Architecture |
ZFS Version | 2.0.1
SPL Version |

Describe the problem you're observing

Trying to unmount an encrypted dataset with zfs umount pool/dataset reports that the dataset is busy.

I tried closing everything; the only thing that worked was umount -l /mountedpoint.

Unloading the key doesn't work either: it says the dataset is busy, even after the dataset has been unmounted with umount -l. The encrypted folders are in fact gone because the root has been unmounted, and zfs get mounted dataset returns no.

So how can I unload the key? I tried zfs unload-key -a -r, but nothing seems to work.
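For anyone trying to confirm they are in the same state, here is a minimal sketch that reads the mounted and keystatus properties and flags the combination described above (unmounted, key still loaded). The function name key_state is made up for illustration; it only reports the state and does not explain why the unload fails.

```shell
# Hypothetical helper (not part of ZFS): report whether a dataset is
# unmounted while its encryption key is still loaded.
# Relies only on the standard `zfs get` properties mounted and keystatus.
key_state() {
    ds=$1
    mounted=$(zfs get -H -o value mounted "$ds") || return 2
    keystatus=$(zfs get -H -o value keystatus "$ds") || return 2
    if [ "$mounted" = "no" ] && [ "$keystatus" = "available" ]; then
        # This is the state in which unload-key is expected to succeed.
        echo "$ds: unmounted but key still loaded"
        return 0
    fi
    echo "$ds: mounted=$mounted keystatus=$keystatus"
    return 1
}
```

Usage would be something like key_state pool/dataset right after the unmount.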

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

putnam commented 3 years ago

I also experienced similar behavior. I was able to unmount my datasets with zfs unmount but unable to unload the key from the encryption root with the "dataset is busy" error. This was on 2.0.6 on Debian Bullseye. I am positive everything was unmounted, so I don't know what made the dataset busy. It would be good to have a more descriptive error message, at the least.

xgiovio commented 3 years ago

You could try to update to 2.1

Skaronator commented 2 years ago

I'm on 2.1.1 and got the same issue. Unmounting works, but unloading doesn't:

$ sudo zfs unmount hdd/private
$ sudo zfs unload-key -r hdd/private
Key unload error: 'hdd/private' is busy.
0 / 1 key(s) successfully unloaded
$ zfs --version
zfs-2.1.1-0york0~20.04
zfs-kmod-2.1.1-0york0~20.04

ejpcmac commented 2 years ago

I am facing the same issue today on zfs 2.1.2.

wdoekes commented 2 years ago

On ubuntu/focal here. This is the first time I've seen this, but I'm now unable to:

# dpkg -l | grep zfs
ii  libzfs2linux                          0.8.3-1ubuntu12.14                amd64        OpenZFS filesystem library for Linux
ii  zfs-zed                               0.8.3-1ubuntu12.14                amd64        OpenZFS Event Daemon
ii  zfsutils-linux                        0.8.3-1ubuntu12.14                amd64        command-line tools to manage OpenZFS filesystems
# zfs list -r -t all tank/parent/child
NAME                USED  AVAIL     REFER  MOUNTPOINT
tank/parent/child  19.9G  80.2T     19.9G  -
# zfs mount | grep parent
(void)
# grep parent /proc/mounts
(void)
# grep parent /proc/*/mounts
(void)
# zfs unload-key tank/parent/child
Key unload error: Keys must be unloaded for encryption root of 'tank/parent/child' (tank/parent).
# zfs unload-key tank/parent
Key unload error: 'tank/parent' is busy.
# zfs destroy tank/parent/child
cannot destroy 'tank/parent/child': dataset is busy

A reboot fixed the problem. The key was not loaded afterwards and the child could be destroyed.
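The first error in the transcript above is expected behavior: a key can only be unloaded on the encryption root, not on a child that inherits it. A small sketch of resolving the root first, assuming the standard encryptionroot property (the function name unload_root_key is hypothetical). It does not work around the "is busy" error itself.

```shell
# Hypothetical helper: look up the encryption root of a dataset and
# run unload-key against that root instead of the child.
unload_root_key() {
    ds=$1
    root=$(zfs get -H -o value encryptionroot "$ds") || return 2
    # Unencrypted datasets report "-" for encryptionroot.
    if [ -z "$root" ] || [ "$root" = "-" ]; then
        echo "$ds is not encrypted" >&2
        return 1
    fi
    zfs unload-key "$root"
}
```

For example, unload_root_key tank/parent/child would end up calling zfs unload-key tank/parent.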

NOTES:

:shrug:

Not more to report unfortunately.

aaronjwood commented 2 years ago

Can confirm this exact problem is present in 2.1.6:

filename:       /lib/modules/5.19.7-2-pve/zfs/zfs.ko
version:        2.1.6-pve1
license:        CDDL
author:         OpenZFS
description:    ZFS
alias:          devname:zfs
alias:          char-major-10-249
srcversion:     1DEFB8EF3D6F74821DBEA8E
depends:        spl,icp,zavl,znvpair,zcommon,zlua,zzstd,zunicode
retpoline:      Y
name:           zfs
vermagic:       5.19.7-2-pve SMP preempt mod_unload modversions

FWIW I don't know if this is related to a dataset that contains snapshots. I saw a comment above that mentioned the dataset having snapshots. My dataset also contains snapshots, though at the time of this writing it only has 1.

antler5 commented 1 year ago

I've run into this issue too. I'm pretty sure I'm causing it somehow, because I didn't have the issue before switching to legacy mount points. Just for the record, I tried deleting all of the dataset's snapshots and that didn't make a difference. My creation, mounting, and unmounting are all scripted already, so I can try to create a reproducible test case with loopback images when I've got time again. I presume no one else was using legacy mount points, though.

Update: I didn't think I had any background jobs or parent shells, and lsof didn't show anything keeping my volumes busy, but closing my shell does seem to address this issue when it pops up.

jonofmac commented 1 year ago

Also having this issue. Cannot unload key without a restart. lsof reports nothing. Mount points are unmounted, not listed in df, and zfs confirms they're unmounted.

I tried killing my shells and logging back in, no luck.

Octofoxy commented 1 year ago

I'm having the same problem. Reboot seems the only option. Please take a look at this bug.

antler5 commented 1 year ago

> I'm having the same problem. Reboot seems the only option. Please take a look at this bug.

I was using a consistent set of scripts when I ran into the issue and have a lockfile from that time, so I could try to reproduce my issue (or make it reproducible), e.g. in a container with file-backed loopback mounts, but the reality is that filesystem configurations are complicated. My issue might have been an interaction with cryptsetup, or overlapping mounts from re-running non-idempotent scripts, while your underlying issue could be totally different. Note that my hypotheses don't even implicate ZFS itself! We could still find an issue with ZFS, and maybe some devs have ideas about how we could get more debug info across cases like these?

Barring that, I feel the onus is still on us to create a reproducible (i.e. specific) circumstance.

soggier commented 1 year ago

Had a similar but quite obvious issue caused by an LXC container mount. Stopping the container resolved it, but the behavior is unexpected to me.

ZFS version: 2.1.12, release 1.fc38
LXC version: 4.0.12, release 2.fc37

ZFS layout: <parent_fs>/<child_fs>
<parent_fs> is an encryption root
<parent_fs> had three snapshots, removing them did not make any difference
<child_fs> was created with (inherited) defaults

LXC container:

Tests:

After stopping the container, the key could be unloaded in all cases. I'd expect key unloading to succeed after unmounting everything without error or that unmounting would fail with "busy" in the first place. Seems as if the filesystem is lingering like with a lazy unmount or something else is holding on to encryption-related things.
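One way to look for this kind of holder is to scan every process's mount table rather than only the host's, since a container or other process in a separate mount namespace can keep its own copy of the mount. A rough sketch, with a made-up function name (mount_holders) and an optional second argument for the proc root that exists only to make the function testable. It assumes the usual /proc/PID/mounts layout, where the first field is the dataset name and the third is the filesystem type. It will not catch holders that are not visible as mounts.

```shell
# Hypothetical helper: print the PIDs whose mount table still lists
# the given dataset or one of its children as a zfs mount.
# $1: dataset name, $2: proc root (defaults to /proc)
mount_holders() {
    ds=$1
    proc=${2:-/proc}
    for f in "$proc"/[0-9]*/mounts; do
        [ -r "$f" ] || continue
        # Match the dataset itself or anything below it (ds + "/").
        if awk -v ds="$ds" '
            $3 == "zfs" && ($1 == ds || index($1, ds "/") == 1) { found = 1 }
            END { exit !found }' "$f" 2>/dev/null; then
            pid=${f%/mounts}
            echo "${pid##*/}"
        fi
    done
}
```

Running mount_holders tank/parent as root and then checking the printed PIDs (for example with ps -p) may show whether a container init or some other process still has the dataset mounted.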

michaelmrose commented 8 months ago

I'm seeing the same thing on 2.2.3 on void linux. The dataset isn't even mounted but the key can't be unloaded. This is true even if I log out my user and log back in.

huntekye commented 4 months ago

This is happening for me as well, Arch Linux 6.9.7-arch1-1, with zfs-2.2.4-1 and zfs-kmod-2.2.4-1 from the archzfs repository. This happened when I tried unloading the encryption key after the first time I had successfully written to an encrypted dataset. Please let me know if there is any other information that would be helpful for debugging this issue!