Closed Jeroen0494 closed 2 years ago
I was able to clear the errors by issuing the following commands from this article: https://serverfault.com/a/846758
jeroen@mediaserver:~$ sudo zpool scrub rpool
jeroen@mediaserver:~$ sudo zpool scrub -s rpool
jeroen@mediaserver:~$ sudo zpool status rpool -v
  pool: rpool
 state: ONLINE
  scan: scrub canceled on Tue May 3 17:56:58 2022
config:

	NAME                                    STATE     READ WRITE CKSUM
	rpool                                   ONLINE       0     0     0
	  5ca738f2-6682-b54e-a259-6dac8cafcbbb  ONLINE       0     0     0

errors: No known data errors
jeroen@mediaserver:~$ sudo zpool scrub rpool
jeroen@mediaserver:~$ sudo zpool status rpool -v
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:24 with 0 errors on Tue May 3 17:57:28 2022
config:

	NAME                                    STATE     READ WRITE CKSUM
	rpool                                   ONLINE       0     0     0
	  5ca738f2-6682-b54e-a259-6dac8cafcbbb  ONLINE       0     0     0

errors: No known data errors
Still, I'd like to know how my datasets got corrupted.
@Jeroen0494 this is probably a ZFS issue - https://github.com/openzfs/zfs/issues/12014 - it could also be a hardware problem with certain SSDs, described here: https://vadosware.io/post/starting-2022-with-a-bang-ceph-on-zfs/#debug-data-corruption-rears-its-head-again
@mtippmann thanks for the reply.
You know what, that actually makes sense. My 2x WD 10TB mirror has also been giving me errors lately, ever since I switched from CentOS 7 to Ubuntu 22.04. Both my root and data datasets are encrypted, and I use zsys and sanoid for snapshots, which is exactly the scenario described in that bug report.
I've read that somebody reported the issues to be resolved since OpenZFS v2.1.4; I'll see if I can update.
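A quick way to check the installed OpenZFS version on Ubuntu and pull in the updated packages, assuming the stock Ubuntu packaging (package names here are an assumption; adjust for your setup):

```shell
# Show the loaded kernel module and userland versions (OpenZFS 0.8+ supports this)
zfs version

# Check whether newer ZFS packages are available
apt list --upgradable 2>/dev/null | grep -i zfs

# Upgrade only the ZFS userland tools (assumed package name for Ubuntu)
sudo apt install --only-upgrade zfsutils-linux
```

On Ubuntu the kernel module usually ships with the kernel itself, so a full fix may also require a newer kernel (or HWE kernel) rather than just the userland package.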
Also relevant: https://github.com/openzfs/zfs/issues/11688
Since OpenZFS v2.1.4, the corruption occurs less frequently.
Hi,
I'm experiencing data corruption because of this plugin on my ZFS file system.
My server is running Ubuntu 20.04 with root on ZFS on an NVMe drive, with k3s and this plugin for containerd. I've set up a separate dataset for container images according to the documentation, and the datasets within it sometimes experience corruption when my server gets rebooted.
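For reference, the dataset setup looks roughly like this; the dataset name `rpool/containerd` and the k3s data directory are assumptions on my part, not necessarily the exact paths from the docs:

```shell
# Create a dedicated dataset mounted where the zfs snapshotter looks for it.
# Plain containerd uses /var/lib/containerd/io.containerd.snapshotter.v1.zfs;
# under k3s the containerd state lives below /var/lib/rancher/k3s (assumed path).
sudo zfs create \
  -o mountpoint=/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.zfs \
  rpool/containerd

# Verify the snapshotter's child datasets appear once images are pulled
sudo zfs list -r rpool/containerd
```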
In an attempt to fix it - since the datasets only hold images that can be re-downloaded anyway - I've stopped k3s and all pods, deleted all images and datasets, and rebooted my server.
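The cleanup procedure can be sketched as follows; the dataset name `rpool/containerd` is an assumption, and whether your crictl build supports `--all` may vary:

```shell
# Stop k3s; the bundled killall script also tears down any remaining pods
sudo systemctl stop k3s
sudo /usr/local/bin/k3s-killall.sh

# Remove all container images via the embedded crictl
sudo k3s crictl rmi --all

# Destroy the image dataset and everything under it
# (rpool/containerd is an assumed name; substitute your own)
sudo zfs destroy -r rpool/containerd

sudo reboot
```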
Before the cleanup:
The cleanup:
After the cleanup:
The ZFS corruption errors persist, and the affected entries have become unidentifiable.
Dataset overview for rpool:
How to proceed from here?