rincebrain opened this issue 6 months ago
As a dirty hack: what about making `zpool clear` generally re-queue all in-flight ZIO for the pool?
It turns out to be very messy and buggy to do that, when I tried it. (Also, I think some IOs aren't necessarily safe to requeue - you'd basically need to reissue the entire txg's worth of IOs to the device, I think? e.g. if you had more than one queue's worth of IOs to issue, and the device went away while they were in the write cache, whether they actually landed depends on how "away" the device went...and you might no longer have them in queue to issue since nominally it claimed they went out.)
It's also somewhat difficult to test, since Linux's behavior on hot-removing a device in use is not...always consistent, and you're not always at the same point in the IO pipeline when you do it.
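(For anyone who wants to poke at it, one way to provoke a removal from the command line - device and SCSI host names are placeholders, and a sysfs delete only simulates one flavour of "away":)

```sh
# Simulate yanking the disk out from under the pool (sdX is a placeholder).
echo 1 | sudo tee /sys/block/sdX/device/delete

# ...later, ask the SCSI host to rediscover it; it usually comes back under
# a different name (e.g. sdY), much like a physical replug would.
echo "- - -" | sudo tee /sys/class/scsi_host/host0/scan
```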
ZIOs no longer in the queue don't matter: they can no longer block anything, and we don't have the information to recreate them anyway...

But all ZIOs still tracked by ZFS could simply be restarted (since they're not on disk yet and/or something is waiting on them) to break the deadlock and actually persist whatever data we still have access to... possibly this restart operation could auto-trigger a partial scrub (from the last known-good TXG, or a few TXGs further back to be sure), to check for writes that made it out of the ZIO pipeline but got lost on the way to stable storage.
🤔
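For what it's worth, the closest manual approximation with today's tools would be something like this (pool name is a placeholder) - though it only helps if the clear actually completes, which is exactly what's broken here:

```sh
# Clear the suspension, then scrub so that anything which was acknowledged
# but never reached stable storage gets found and reported/repaired.
zpool clear tank
zpool scrub tank
zpool status -v tank
```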
System information
Describe the problem you're observing
If I have a USB hard drive that I created a pool on, using the name `/dev/disk/by-id/mydrive-part1`, pointing to `/dev/sda1`, unplug it, and replug it, ZFS will say it's SUSPENDED because the disk is gone. Great. The device got the name `sdb` since `sda` was still being held open when the disk reappeared. I `zpool clear` that device (by guid, since the device's name won't work, which is a separate complaint), and it marks the pool and device ONLINE after one or two iterations of `zpool clear pool` or `zpool clear pool guid`, hurray... except the clear command never returns, IO from before the hotplug is hung forever, the old device name is still present (e.g. something is holding references to it still), and the stacktrace is in `zio_resume`. New writes to the pool appear to succeed, but old IO is stuck forever.
`zfs_ioc_clear` is blocking on `zio_resume` is blocking on `zio_wait`, and `txg_wait` and `cp` say they're blocking on:

So I assume it's some dance like nothing told the old `zio`s that their reference to the disk they're trying to write to is stale. (`zpool reopen` doesn't save you.)

Describe how to reproduce the problem
Above.
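Roughly, with placeholder pool and device names:

```sh
# Pool on the by-id path of the USB disk (names are placeholders).
zpool create tank /dev/disk/by-id/mydrive-part1

# Keep some IO in flight so there is work queued when the disk disappears.
cp -r /some/data /tank/ &

# Physically unplug the drive, wait, and replug it; the kernel hands it a
# new name (e.g. sdb) because sda is still held open by the pool.
zpool status tank            # pool reports SUSPENDED

# Clear by guid, since the old device name no longer resolves; it may take
# one or two iterations before the pool and device show ONLINE.
zpool clear tank
zpool clear tank <guid>

zpool status tank            # ONLINE, but the clear never returns and the
                             # pre-hotplug IO (the cp above) stays stuck
```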
Include any warning/errors/backtraces from the system logs
Above.
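If it helps, the hung stacks can be re-captured with something like this (run as root; the process names are whatever is stuck - here the `cp` and the `zpool clear` from above):

```sh
# Kernel stacks of the hung copy and the hung zpool clear.
cat /proc/"$(pidof -s cp)"/stack
cat /proc/"$(pgrep -of 'zpool clear')"/stack

# Or dump every task stuck in uninterruptible sleep to the kernel log.
echo w > /proc/sysrq-trigger
dmesg | tail -n 100
```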