Open stuartthebruce opened 6 years ago
I think this is a dup of https://github.com/zfsonlinux/zfs/issues/6649
That ticket is for failmode=wait whereas this ticket is for failmode=continue.
Yes, but failmode isn't the issue here. The issue is how to remove a suspended pool from the system.
I was hoping that failmode=continue would obviate the need to wait for the enhancement to allow the removal of a suspended pool. My immediate need is to simply return an error and not block. I can live with an unusable suspended pool in the system until I need to reboot for another reason.
The issue seems to stem from failmode=continue not aborting existing write requests (as one might expect), but only new ones. From man zpool:

    continue    Returns EIO to any new write I/O requests but allows reads to any of the remaining healthy devices. Any write requests that have yet to be committed to disk would be blocked.

Also, man zpool doesn't specify exactly what happens to reads from unhealthy devices (which, for a suspended pool, can possibly be all of them).
Thus, while the behaviour seen by the OP is kind-of as documented, failmode=continue is IMHO quite useless when it effectively behaves identically to failmode=wait (hanging, unkillable I/O for the requests that were in flight when suspension occurred), and it should be made to cleanly abort all outstanding I/O (writes and reads) that can't complete because the pool went into suspension.

Possibly a timeout for zio in general (to abort them with a clean error condition after a long enough period of inactivity) could solve the issue of I/O being stuck in an unkillable state?
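A quick way to check which behaviour one actually gets is to issue a fresh synchronous write against the suspended pool and watch the process state. A minimal sketch, assuming the pool is mounted at /data1 (path and sleep duration are illustrative):

```sh
# A new synchronous write against the (suspended) pool, run in the background
# so the shell is not dragged along if it hangs.
dd if=/dev/zero of=/data1/eio-probe bs=4k count=1 oflag=dsync &
probe_pid=$!

# With failmode=continue behaving as documented this should fail almost
# immediately with "Input/output error" (EIO).
sleep 10

# If dd is still around, check its state: "D" in STAT means uninterruptible
# sleep inside the kernel, which even SIGKILL cannot break.
ps -o pid,stat,wchan:32,cmd -p "$probe_pid" || echo "dd exited (see its output above)"
```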
This is related to the work I'm doing to support the "abandonment" of a pool from which, for example, IO has "hung" because the completions are no longer arriving (due to flaky hardware, bad driver, etc.) and for which I worked up a proof-of-concept at the OpenZFS hackathon this year. This issue is sort-of a different instance of the problem (in which a pool can't be exported).
The work to support abandoning a pool for which IO has hung is going to leverage the similarly-named "continue" mode of the zio deadman. I've got a patch almost ready to post as a PR which fixes some of the problems with zio deadman.
This particular issue will require somewhat different handling but it is something I've planned on addressing as part of the larger "zpool abandon" feature.
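For anyone who wants to experiment with the existing zio deadman in the meantime, it is driven by module parameters on Linux. A rough sketch, assuming the zfs_deadman_* parameter names from recent OpenZFS (values are illustrative, and note this targets hung I/O rather than the suspended-pool case in this issue):

```sh
# Enable the zio deadman and switch it from the default "wait" behaviour to
# "continue", which tries to re-dispatch a zio that has been outstanding for
# longer than zfs_deadman_ziotime_ms (the value below is the 5-minute default).
echo 1        > /sys/module/zfs/parameters/zfs_deadman_enabled
echo continue > /sys/module/zfs/parameters/zfs_deadman_failmode
echo 300000   > /sys/module/zfs/parameters/zfs_deadman_ziotime_ms
```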
@dweeezil most excellent! I have a large number of unreliable HDDs in a Hadoop cluster that I would be willing to use to test a ZFS patch when it is available. I am most interested in the ability to optionally not block on "pool I/O is currently suspended"; however, I am also interested in testing the ability to abandon and destroy a zpool without having to reboot. Many thanks for working on this.
I would also like to test this feature. The deadman continue mode helps a lot, but sometimes I lose the connection and cannot recover it, and I would like to not have to reboot.
WIP - Fix issues with zio deadman "continue" mode #8021
@dweeezil do you have a rough estimate on when external testing would be helpful?
Does 0.8.0 change this behavior? Or make it any easier to implement a fix?
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
I'm opposing staleness. This might be old, but it's an issue.
@GregorKopka I've tagged this issue as a defect. I've also added the "Status: Understood" tag which will prevent the bot from marking it again.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
@behlendorf
Looks to me as if this bot has started a rebellion against mankind.
Bumping this issue because this failure mode appears to be reasonably common to trigger with EBS volumes that go on walkabout.
Hello, I also came across this issue: I have a problem with hung KVM VMs when some unreliable single-disk ZFS storage (used only for unimportant/backup tasks) goes bad.
I would expect that with failmode=continue set, read and write requests return EIO. At least for writes, the manpage explicitly says that EIO is returned.
But that simply does not happen: whatever read or write is issued, it gets blocked, leaving the process in an uninterruptible state that cannot be killed, so this is definitively a bug.
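One way to confirm the processes really are stuck in uninterruptible sleep inside ZFS (rather than merely slow) is to look at their state and kernel wait channel. A sketch with standard procps tools (<pid> is a placeholder):

```sh
# Processes in uninterruptible sleep ("D" in STAT) together with their kernel
# wait channel; I/O hung in ZFS typically shows up waiting in zio/txg code.
ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /D/'

# For one stuck PID, the kernel stack shows exactly where it is blocked
# (needs root).
cat /proc/<pid>/stack
```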
System information
Describe the problem you're observing
On a zpool with failmode=continue I/O continues to block resulting in un-killable application processes.
Describe how to reproduce the problem
1. zpool create data1 single_HDD
2. zpool set failmode=continue data1
3. Start applications performing I/O on the zpool and wait for the HDD to fail.
4. Attempt to kill the application processes and note that they end up in the Z state.
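If waiting for a real HDD failure is impractical, something along these lines should provoke the same suspension on a throwaway pool. This is a hedged sketch: the file path is illustrative and the zinject option behaviour is assumed from current OpenZFS.

```sh
# Throwaway single-vdev pool backed by a sparse file.
truncate -s 1G /var/tmp/vdev0
zpool create -o failmode=continue data1 /var/tmp/vdev0

# Inject I/O errors on the only vdev; with no redundancy the next txg sync
# cannot complete and the pool should end up suspended.
zinject -d /var/tmp/vdev0 -e io -T all -f 100 data1
dd if=/dev/zero of=/data1/file bs=1M count=16 oflag=dsync

zpool status data1   # expected to report "pool I/O is currently suspended"
zinject -c all       # clear the injection handlers afterwards
```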
Include any warning/errors/backtraces from the system logs
After attempting to kill application pid 33345 it is blocked in the Zombie state and holding kernel resources I need to re-use (in particular a socket).
What I need is for failmode=continue to not block I/O, so this process can exit and I can start another one to manage a replacement disk in a new pool without having to reboot. That is, I don't need to be able to destroy the original zpool, though that would be nice, as indicated in other open issues.