openzfsonosx / zfs

OpenZFS on OS X
https://openzfsonosx.org/

Unable to remove UNAVAIL pool with REMOVED disk #726

Open rocwhite123 opened 5 years ago

rocwhite123 commented 5 years ago

I have three USB external drives managed by ZFS on a computer I don't have physical access to. For some reason, after a while, one of the pools becomes UNAVAIL and its underlying disk REMOVED. The disk also disappears from Disk Utility, but still shows up in About This Mac -> System Information. Attempting to Restart from the Apple menu (via VNC) closes all open applications but stops just short of signing out the current user. sudo reboot from SSH puts the computer into a non-responding state (answering ping but refusing SSH/VNC connections). At that point, I have no way of interacting with the computer other than asking someone to physically power it off and on again.

Power cycling always solves the problem, so the physical USB connection itself is not the cause. I suspect the difficulty rebooting is due to hung ZFS processes related to the UNAVAIL pool. Neither zpool clear -nFX BAD_POOL (with -nFX nothing happens; without it, "I/O error") nor zpool export -f BAD_POOL ("umount failed") works. I know USB isn't the best option for ZFS, but I wonder if there is any way to purge ZFS's knowledge of this pool in such a scenario so that I can at least reboot remotely by myself.

lundman commented 5 years ago

You can sudo reboot -qn though, but only after syncing first, as it is a dirty reboot. It sounds like your disk gets disconnected, and if it disappears from Disk Utility there is nothing ZFS can do: the OS no longer sees it.
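
A minimal sketch of that sequence, assuming a POSIX shell on macOS. The sync_with_timeout helper and its 30-second limit are hypothetical additions, included because sync itself might block while a pool is hung; -q and -n are the flags described in the macOS reboot man page, where -n skips the file system cache flush (hence the explicit sync beforehand).

```shell
#!/bin/sh
# Sketch: flush caches, then force a quick reboot without a second flush.
# sync_with_timeout is a hypothetical helper, not part of ZFS or macOS.

sync_with_timeout() {
    limit="${1:-30}"          # seconds to wait for the sync command to finish
    sync_cmd="${2:-sync}"     # overridable so the helper can be exercised safely
    $sync_cmd &
    pid=$!
    waited=0
    # Poll once per second until the background command exits or the limit hits.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            echo "sync still running after ${limit}s; giving up on it" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Only act when explicitly asked, so loading this file never reboots anything.
if [ "${1:-}" = "--reboot" ]; then
    sync_with_timeout 30
    sudo reboot -qn
fi
```

Run with --reboot to perform the dirty reboot. If the sync times out, the reboot still goes ahead in this sketch, on the assumption that getting the machine back matters more than the unflushed data.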

rocwhite123 commented 5 years ago

Thanks. I'll try that next time this happens. By the way, is there a reason to sync followed by sudo reboot -qn instead of sudo reboot -q?

The disk is still physically connected by cable and shows up in "About This Mac -> System Information", but yes, for all practical purposes, the OS does not see it. However, my question is whether there is some command to force ZFS to "forget" such a pool, because otherwise it hangs whenever I try to access it (e.g., ls /Volumes/BAD_POOL). I tried the forced versions of zpool clear, offline, export, etc., to no effect.

ylluminate commented 4 years ago

I'm having perhaps a similar issue. I have a USB disk that's beside me, but has become UNAVAIL:

NAME              SIZE   ALLOC  FREE   CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH   ALTROOT
host_backup_tank  9.06T  1.53T  7.53T  -        -         2%    16%  1.00x  UNAVAIL  -

Unmounting fails:

$ sudo zpool export -f host_backup_tank
Running process: '/usr/sbin/diskutil' 'unmount' 'force' '/Volumes/host_backup_tank/volumename'
Unmount failed for /Volumes/host_backup_tank/volumename
Fallback umount called
Running process: '/sbin/umount' '-f' '/Volumes/host_backup_tank/volumename'
umount: /Volumes/host_backup_tank/volumename: not currently mounted
cannot unmount '/Volumes/host_backup_tank/volumename': umount failed

This is causing Finder and everything to completely hang. Any thoughts on how to handle such a situation? The drive seems alright after reboot...

The situation seems peculiar to me since I was able to see the drive within Disk Utility.app, so it seemingly wasn't entirely gone... I just don't know why the UNAVAIL status would pop up as it did...

ylluminate commented 4 years ago

Hmm, this keeps happening to me and I can't figure out why. It doesn't happen on internal drives, only on USB - BUT that USB drive in particular really shouldn't be going offline / powering off and thus I'm rather confused by what may be going on here.
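
One way to narrow down when the pool drops is to log its health periodically. The following is only a sketch: the function names and the log path are hypothetical, and it assumes input in the tab-separated "name, health" form that zpool list -H -o name,health produces.

```shell
#!/bin/sh
# Sketch: print a timestamped line for every pool that is not ONLINE.
# Assumed input: tab-separated "name<TAB>health" lines on stdin.

unhealthy_pools() {
    # Keep only pools whose health field is present and not ONLINE.
    awk -F '\t' '$2 != "" && $2 != "ONLINE" { print $1 " " $2 }'
}

log_unhealthy() {
    stamp="$(date '+%Y-%m-%d %H:%M:%S')"
    unhealthy_pools | while read -r name health; do
        echo "$stamp pool=$name health=$health"
    done
}

# Example wiring (commented out so the sketch has no side effects):
# zpool list -H -o name,health | log_unhealthy >> "$HOME/zpool-health.log"
```

Run from cron or a launchd job every minute or so, the resulting timestamps could then be compared against the system log to see whether a USB disconnect was reported around the same time.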

beren12 commented 4 years ago

Do you have Drive Genius? I think it was causing issues for me: it thought my ZFS partition was HFS and was scanning it.

ylluminate commented 4 years ago

Hmmm, good thought @beren12, but no... I don't have anything doing structure scans.