Open rincebrain opened 7 months ago
To be clear, this isn't "just" a metadata scan:
phantasm              1.11T  7.99T  215  77  204M  596K
11756750817815708724  1.11T  7.99T  215  77  204M  596K
1275799305023947155       -      -  109  37  103M  290K
15641244010099601826      -      -  107  39  101M  306K
It's really doing the L0 reads then going "nothing there, nevermind"
So I have this, but mine has broken to the point where it does nothing.
I detached a device from a pool that was being resilvered to, and I have no way to stop or restart the resilver. I issued the resilver command via zpool; my counters went back to 0 and, 8 hours later, are still at 0. There is no pool activity.
  state: DEGRADED
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
   scan: resilver in progress since Sat Mar 23 22:50:52 2024
         0B / 331T scanned, 0B / 331T issued
         0B resilvered, 0.00% done, no estimated completion time
Pool I/O is minimal, just regular reads from applications. I have imported and exported the pool multiple times.
This is OpenZFS 2.2.3 on Ubuntu 22.04 with kernel 6.5.0-26-generic (LTS HWE kernel). I can't issue a scrub because a resilver is in progress.
Flipping /sys/module/zfs/parameters/zfs_scan_suspend_progress to 1 and then back to 0 seems to have gotten it to start again. Hopefully it makes some progress.
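For anyone wanting to try the same workaround, here is a minimal sketch of that toggle. The helper name toggle_scan_suspend and the one-second pause are my own assumptions, not anything OpenZFS provides; only the sysfs path comes from the comment above. It needs root, and I can't promise it unsticks every stalled resilver.

```shell
#!/bin/sh
# Sketch: set zfs_scan_suspend_progress to 1, then back to 0.
# toggle_scan_suspend is a hypothetical helper name, not part of OpenZFS.
PARAM_DEFAULT=/sys/module/zfs/parameters/zfs_scan_suspend_progress

toggle_scan_suspend() {
    # $1: parameter file (defaults to the sysfs path above)
    # $2: seconds to wait between writes (assumed default: 1)
    param="${1:-$PARAM_DEFAULT}"

    # Suspend scan progress...
    echo 1 > "$param" || return 1

    # Assumed pause; the report does not say whether one is needed.
    sleep "${2:-1}"

    # ...then allow it to continue again.
    echo 0 > "$param" || return 1

    # Print the final value so the caller can confirm it is back to 0.
    cat "$param"
}
```

Run toggle_scan_suspend as root with no arguments, then check zpool status to see whether the scanned/issued counters have started moving.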
Describe the feature would like to see added to OpenZFS
(The below shows doing this with zpool attach, but the same behavior occurs with zpool replace then detach; it was just faster to demonstrate this way.)
It should probably not keep "resilvering" when there's nothing to do...
How will this feature improve OpenZFS?
I was reminded of this longstanding bug when someone was confused after they tried a replace, realized their mistake, detached the replacing leg, then did an attach, and then wondered why the "resilvered" count wasn't going up. That was, of course, because resilver_defer means it was waiting for the first resilver to finish before actually doing what the user asked for, per #14505.
Additional context
This has been true for a long time - the above is on 2.1, but I believe it's been true since before 0.7.