Your call trace matches #861.
Right, this looks like a duplicate of #861.
Well, it is different in that I don't get any "rcu_sched detected stall", the umount returns fine, and the export doesn't hang but returns EBUSY; but indeed they do look similar (and to #790).
Any recommendation on what I should try and do the next time it happens?
Once it happens there's nothing really that can be done. What needs to happen is for us to identify the exact flaw, see if/how it can be worked around, and then properly fix it.
I got a second occurrence of the issue described at http://thread.gmane.org/gmane.linux.file-systems.zfs.user/4661
I've been doing an "offsite backup" every week, whereby I zfs-send|zfs-recv a number of datasets from one zpool onto another zpool on a pair of hard drives (well, LUKS devices on top of hard drives). I do a zpool export and a luksClose before taking the drives offsite.
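For context, the weekly run looks roughly like this (a sketch; the pool, dataset, snapshot and device names here are made up, not the real ones):

```sh
# Sketch of the weekly offsite run -- names are hypothetical.
cryptsetup luksOpen /dev/sdx1 offsite05_crypt
zpool import offsite-backup-05

# Replicate the datasets onto the offsite pool.
zfs send -R tank/main@weekly | zfs recv -Fu offsite-backup-05/main

# Tear down before the drives go offsite; the export is the step
# that now fails with EBUSY.
zpool export offsite-backup-05
cryptsetup luksClose offsite05_crypt
```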
Today, for some reason, the zpool export fails, again with EBUSY.
There is no zfs command running, nothing from that pool is mounted (zpool export managed to do that part; I checked /proc/mounts as well), nothing is using the zvols in there, and there is no loop device or anything on top of them. I've tried killall -STOP udevd in case it was somehow accessing things while the export was trying to tidy them away.
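These are roughly the checks I mean (a sketch; the device paths are just the usual locations):

```sh
# Nothing from that pool still mounted?
grep offsite-backup-05 /proc/mounts

# Anything holding the zvol device nodes open?
fuser -v /dev/zd* /dev/zvol/offsite-backup-05/* 2>/dev/null

# Any loop or device-mapper device stacked on top of them?
losetup -a
dmsetup table
```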
I've got a sysrq-t output, but I'm not sure what to look for to see what may be holding it.
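(The task dump was captured the usual way, something along these lines:)

```sh
# Make sure the magic sysrq interface is enabled, then dump all task
# states to the kernel ring buffer and save it.
echo 1 > /proc/sys/kernel/sysrq
echo t > /proc/sysrq-trigger
dmesg > sysrq-t.txt
```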
Trying "zfs mount -a" to see if I can get them mounted back, it says for every mount point:
filesystem 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c' is already mounted
cannot mount 'offsite-backup-05/main/servers/skywalker/shadow_nbd/c': Resource temporarily unavailable
While "grep offsite-backup-05 /proc/mounts" returns nothing.
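To see the two views side by side (a sketch; same pool and dataset names as above):

```sh
# What ZFS reports for the mounted property...
zfs list -r -o name,mounted,mountpoint offsite-backup-05

# ...versus what the kernel actually has mounted (empty here).
grep offsite-backup-05 /proc/mounts
```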
So there's something definitely going wrong there.
I can still read the zvols on there, though.
I have the zevents going to the console (zfs_zevent_console=1) and there has been nothing (no I/O error, nothing at all). I used to get a lot of oopses, but since upgrading the memory to 48GB it has been quite stable until now.
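(zfs_zevent_console is just the module parameter; it can be flipped at runtime, and the same events can also be read back with zpool events, something like:)

```sh
# Turn on logging of zevents to the console (module parameter).
echo 1 > /sys/module/zfs/parameters/zfs_zevent_console

# Read the event log back directly.
zpool events -v
```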
Before rebooting, I also tried to export the other zpool (the one I was "zfs send"ing from) and got the same EBUSY error (successful umount but EBUSY upon the ioctl(ZPOOL_EXPORT), as for the other one).
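For what it's worth, the failure is easy to watch at the ioctl level, with something like:

```sh
# Watch the export attempt fail at the ioctl on /dev/zfs
# (hypothetical run; only the ioctl calls are traced).
strace -f -e trace=ioctl zpool export offsite-backup-05
```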
I noticed (in top) an arc_adapt taking 100% of one CPU. Running sysrq-l a few times showed it each time being in:
In case that means anything to anybody.