Opened by darkpixel 1 year ago (status: Open)
@darkpixel --monitor-health does check for corruption, but only through the READ, WRITE, and CKSUM columns. In most cases these counters increment when there are permanent errors, but I have personally seen permanent errors reported without any actual data loss, caused by a code issue (https://github.com/openzfs/zfs/issues/12014).
That's a fun bug @phreaker0. I ran into something similar, but a reboot and a scrub during normal operation usually fix it. If not, I just nuke the snapshot and re-scrub during normal operation to fix it.
@darkpixel I needed to reboot and then scrub to fix the issues, but they would reappear after a few days. I recreated my pool and now the problem is gone.
Just a heads up that the `--monitor-health` flag doesn't catch all bad situations:
I think it's just paying attention to the `state: ONLINE` and the fact that the drives are all working properly. It's not paying attention to corruption.
The whole reason I noticed this is because `--monitor-snapshots` briefly complained about snapshots being old. For some reason Sanoid took the snapshots, but ZFS flagged them as corrupt. I deleted them, and that's why the "errors" section above doesn't mention them by name.
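For anyone wanting to catch this case outside of Sanoid, here is a rough sketch of the two checks discussed in this thread, run against the text of `zpool status`. It is not how Sanoid implements `--monitor-health`; the function names and the sample output are made up for illustration, and it assumes a clean pool prints `errors: No known data errors`.

```python
def data_errors_reported(status_text):
    """Return True if any 'errors:' line says something other than 'No known data errors'."""
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("errors:"):
            message = stripped[len("errors:"):].strip()
            if message != "No known data errors":
                return True
    return False


def devices_with_nonzero_counters(status_text):
    """Return names of config rows whose READ, WRITE, or CKSUM column is not '0'."""
    flagged = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True  # header row of the device table
            continue
        if not fields:
            in_config = False  # a blank line ends the device table
            continue
        if in_config and len(fields) >= 5:
            name, _state, read, write, cksum = fields[:5]
            if (read, write, cksum) != ("0", "0", "0"):
                flagged.append(name)
    return flagged


# Hand-written sample (not the output from this issue): the pool is ONLINE and
# every counter is zero, yet the errors line still reports a problem.
SAMPLE = """\
  pool: tank
 state: ONLINE
config:

	NAME        STATE     READ WRITE CKSUM
	tank        ONLINE       0     0     0
	  sda       ONLINE       0     0     0

errors: 2 data errors, use '-v' for a list
"""

print("nonzero counters:", devices_with_nonzero_counters(SAMPLE))
print("data errors reported:", data_errors_reported(SAMPLE))
```

In a real check you would feed it the output of `zpool status` (for example via `subprocess.run`) and alert if either function flags something. Looking at the `errors:` line separately is the point: it covers the situation above, where the state and the counters both look healthy.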