openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

pool corruption with zfs_no_scrub_io=1 #13708

Open speed47 opened 2 years ago

speed47 commented 2 years ago

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 22.04 LTS
Kernel Version 5.15.53 (vanilla)
Architecture x86_64
OpenZFS Version 2.0.0 - 2.1.99

Describe the problem you're observing

With zfs_no_scrub_io=1, the attached reproducer script corrupts a pool after just a few zpool add/remove/attach commands:

Jul 30 2022 19:16:58.552762001 ereport.fs.zfs.checksum
        class = "ereport.fs.zfs.checksum"
        vdev_cksum_errors = 0x13
        cksum_expected = 0x164aaffba7 0x8c63ad04198 0x1c449b2d20229 0x3e24e68c61feb7
        cksum_actual = 0x0 0x0 0x0 0x0
        cksum_algorithm = "fletcher4"

  pool: test948894687234
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 992K in 00:00:00 with 0 errors on Sat Jul 30 19:16:58 2022
remove: Removal of vdev 3 copied 1.36M in 0h0m, completed on Sat Jul 30 19:16:56 2022
        240 memory used for removed device mappings
config:

        NAME              STATE     READ WRITE CKSUM
        test948894687234  ONLINE       0     0     0
          /dev/shm/12ga   ONLINE       0     0     0
        special
          mirror-1        ONLINE       0     0     0
            /dev/shm/1ga  ONLINE       0     0     0
            /dev/shm/1gf  ONLINE       0     0 1.94K

Since zfs_no_scrub_io is only supposed to affect zpool scrub, this behavior looks unintended and may be a side effect of the tunable. The problem is not reproducible with zfs_no_scrub_io=0.
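
For completeness, here is how the tunable can be inspected and flipped at run time; a minimal sketch, assuming the usual ZFS module parameter location under /sys/module/zfs/parameters (it can also be set at module load time, e.g. modprobe zfs zfs_no_scrub_io=1):

```
# current value: 0 = scrub issues I/O (default), 1 = scrub skips all I/O
cat /sys/module/zfs/parameters/zfs_no_scrub_io

# enable the failing configuration described above
echo 1 > /sys/module/zfs/parameters/zfs_no_scrub_io
```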

Tested with OpenZFS 2.0.0 and 2.1.99 (latest commit)

Describe how to reproduce the problem

reproducer.sh

```
#! /bin/bash
set -e

dir=/dev/shm
poolname=test948894687234
v12ga=$dir/12ga
v1ga=$dir/1ga
v1gb=$dir/1gb
v1gc=$dir/1gc
v1gf=$dir/1gf

truncate -s 1024M $v12ga
truncate -s 64M $v1ga $v1gb $v1gc $v1gf

data() {
    echo -n "writing data... "
    set +e
    (
        while :; do
            echo "/$poolname/$RANDOM$RANDOM$RANDOM"
        done
    ) | timeout 1 xargs touch
    set -e
    echo "done"
}

action() {
    local cmd=$1
    shift
    zpool "$cmd" "$poolname" "$@"
    echo "zpool $cmd $@"
    zpool wait $poolname
}

try() {
    zpool create -f -o failmode=continue $poolname $v12ga
    action add special $v1ga $v1gb
    data
    action add special $v1gc
    data
    action remove $v1gb
    data
    action remove $v1gc
    data
    action attach $v1ga $v1gf
    data
    action scrub
    if zpool status $poolname | grep -q 'unrecoverable error'; then
        echo "problem reproduced!"
        zpool events -v $poolname | grep cksum
        zpool status $poolname
        exit 0
    else
        echo "didn't reproduce the problem, trying again..."
        umount /$poolname
        zpool destroy $poolname
    fi
}

umount /$poolname || true
zpool destroy $poolname || true
while :; do try; done
```
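
A minimal way to drive the reproducer, assuming it is saved as reproducer.sh and that the module parameter path shown earlier applies (the script itself does not set the tunable):

```
# set the tunable first, then let the script loop until it hits the corruption
echo 1 > /sys/module/zfs/parameters/zfs_no_scrub_io
bash reproducer.sh   # retries until "problem reproduced!" is printed, then exits 0
```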

Include any warning/errors/backtraces from the system logs

stale[bot] commented 1 year ago

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

speed47 commented 1 year ago

Issue still present in zfs-2.1.12