openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Cannot unmount, Unmount Failed - Debian Upgrade #14966

Open tunloop opened 1 year ago

tunloop commented 1 year ago

System information

Type | Version/Name
-- | --
Distribution Name | Debian
Distribution Version | Bookworm (12)
Kernel Version | 6.1.0-9-amd64
Architecture | amd64
OpenZFS Version | 2.1.11-1

Describe the problem you're observing

After upgrading from version 2.0.3-9 to version 2.1.11-1 (Debian bullseye to bookworm), a ZFS mountpoint becomes a ghost: the directory that is reported as mounted contains no data, the mountpoint cannot be changed, the dataset cannot be forcibly unmounted, and the pool cannot be exported. Even after removing the mountpoint folder, both mount and zfs still report the mount as present.

I would say this is a pretty large bug, since it effectively destroys access to a ZFS pool: the pool cannot be unmounted from this ghost mount, so it cannot be exported and its mountpoint cannot be changed.

Describe how to reproduce the problem

Install ZFS on Debian 11 and create a pool whose mountpoint is a folder inside a non-root user's home directory. Upgrade Debian 11 to 12 and reboot. The ZFS mount will now be reported permanently, without the dataset actually being mounted.

Include any warning/errors/backtraces from the system logs

No system logs are generated by this event; the only symptoms are the mount errors shown below. All commands below were run as root.

zpool status -v

  pool: SecureArchive
 state: ONLINE
  scan: scrub repaired 0B in 00:07:08 with 0 errors on Fri Jun  9 09:59:30 2023
config:

        NAME                                               STATE     READ WRITE CKSUM
        SecureArchive                                      ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_250GB_X  ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_250GB_X  ONLINE       0     0     0

errors: No known data errors

lsblk

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0 232.9G  0 disk 
├─sda1        8:1    0 232.9G  0 part 
└─sda9        8:9    0     8M  0 part 
sdb           8:16   0 232.9G  0 disk 
├─sdb1        8:17   0 232.9G  0 part 
└─sdb9        8:25   0     8M  0 part

zfs get mountpoint

NAME                                                  PROPERTY    VALUE                             SOURCE
SecureArchive                                         mountpoint  /home/erasedhammer/SecureArchive  local

mount | grep Archive

SecureArchive on /home/erasedhammer/SecureArchive type zfs (rw,xattr,noacl)

cat /proc/mounts | grep Archive

SecureArchive /home/erasedhammer/SecureArchive zfs rw,xattr,noacl 0 0 

zfs set mountpoint=/home/erasedhammer/SecureArchive SecureArchive

cannot unmount '/home/erasedhammer/SecureArchive': unmount failed

zfs unmount -f SecureArchive

cannot unmount '/home/erasedhammer/SecureArchive': unmount failed

umount /home/erasedhammer/SecureArchive

umount: /home/erasedhammer/SecureArchive: not mounted.
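
The mismatch above (the mount table lists the dataset, yet umount says nothing is mounted) can be checked with a small helper. This is only a sketch: listed_as_zfs is a hypothetical name, it reads a /proc/mounts-style listing from stdin, and the commented-out live check assumes GNU stat, which is expected to print "zfs" as the filesystem type for a path that really resolves to a ZFS dataset.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path is listed as a zfs mount
# in a /proc/mounts-style listing read from stdin.
listed_as_zfs() {
    awk -v m="$1" '$2 == m && $3 == "zfs" { found = 1 } END { exit !found }'
}

# Live check (commented out; the path is taken from the report above).
# Assumption: GNU stat reports the type of the filesystem that actually
# answers at the path, which differs from "zfs" if the mount is hidden.
#   dir=/home/erasedhammer/SecureArchive
#   if listed_as_zfs "$dir" < /proc/mounts &&
#      [ "$(stat -f -c %T "$dir")" != "zfs" ]; then
#       echo "$dir is listed as zfs, but another filesystem answers at that path"
#   fi
```

If the live check prints its message, the dataset is probably mounted but hidden underneath another filesystem, which would match the symptoms shown above.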

tunloop commented 1 year ago

Not sure if this was the determining factor, but it appears I have fixed the problem.

Originally I had the mountpoint for my pool on a different drive from my boot drive (/home is mounted from that drive). I moved the mountpoint back to the boot drive and it mounts consistently now.

I can only guess that the /home partition on the secondary drive takes a little too long to appear for the process that mounts the ZFS pool.

rlebreto commented 1 year ago

Hi @tunloop, I have the same issue as you. After being blocked for 6 hours, this case and your workaround saved my day.

In my case, I am on CentOS 7. I am re-installing a server and the only thing that changed is the CentOS version: CentOS 7.7 to CentOS 7.9 (and consequently the kernel from 3.10.0-1160.42.2.el7.x86_64 to 3.10.0-1160.88.1.el7.x86_64). Our ZFS mountpoint is /var/lib/pgsql and /var is an XFS Linux filesystem.

I suppose that the mount order has changed during boot (and /var/lib/pgsql is now mounted before /var), but it is difficult to prove because /var is not easy to unmount ;-)
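
One way to test the mount-order guess without unmounting anything is to look at the order of entries in /proc/mounts. The sketch below assumes that the listing order reflects the order in which the filesystems were mounted, which is an assumption rather than a documented guarantee; mount_order is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read a /proc/mounts-style listing from stdin and
# report which of two mountpoints appears first.
# Assumption: the listing order reflects the order of mounting.
mount_order() {
    # $1 = parent mountpoint, $2 = child mountpoint
    awk -v p="$1" -v c="$2" '
        $2 == p && !pl { pl = NR }   # first line that lists the parent
        $2 == c && !cl { cl = NR }   # first line that lists the child
        END {
            if (!pl || !cl)   print "unknown"
            else if (cl < pl) print "child-first"
            else              print "parent-first"
        }'
}

# Usage on a live system (paths taken from the comment above):
#   mount_order /var /var/lib/pgsql < /proc/mounts
```

A result of child-first would be consistent with the dataset having been mounted before /var and then hidden by it, although it does not prove that on its own.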

Thanks for reporting...

On my side:

Type | Version/Name
-- | --
Distribution Name | CentOS
Distribution Version | CentOS 7.9
Kernel Version | 3.10.0-1160.88.1.el7.x86_64
Architecture | x86_64 (intel)
OpenZFS Version | zfs-2.0.5 and zfs-2.0.7