openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Checksum errors with FIO + Direct IO + brd while scrubbing #16695

Open tonyhutter opened 4 weeks ago

tonyhutter commented 4 weeks ago

System information

Type                  Version/Name
Distribution Name     RHEL
Distribution Version  8.10
Kernel Version        4.18
Architecture          x86-64
OpenZFS Version       master (e5d1f6816758f3ab1939608a9a45c18ec36bef41)

Describe the problem you're observing

Checksum errors while running fio + Direct IO on a brd ramdisk while scrubbing.

Describe how to reproduce the problem

  1. Create a large brd (ramdisk) device.
  2. Create a pool on the brd device.
  3. Configure the pool for high IOPS.
  4. Run fio with --direct=1 while running a scrub in parallel.

Reproducer:

#!/bin/bash
# Run this script from inside a built ZFS source dir
# Adjust 'size' to fit comfortably within system RAM

devs=1
size=$((100000000 / $devs))
queue=on
recordsize=32k
jobs=32

echo "creating $devs ramdisks that are $size bytes"
modprobe brd rd_nr=$devs rd_size=$size
for i in `seq 0 $(($devs - 1))` ; do
        alldevs="$alldevs /dev/ram$i"
done
sudo ./zpool create -o ashift=12 tank $alldevs
sudo ./zfs set compression=off tank
sudo ./zfs set recordsize=$recordsize tank
sudo ./zfs set atime=off tank
sudo ./zpool get ashift tank

bash -c 'sleep 5 && echo SCRUBBING && sudo ./zpool scrub tank && sleep 10 && sudo ./zpool status' &
fio --name=fiotest --filename=/tank/test1 --size=50Gb --rw=write --bs=$recordsize --direct=1 --numjobs=$jobs --ioengine=libaio --iodepth=128 --group_reporting --runtime=20 --startdelay=1
sudo ./zpool status

sudo ./zpool destroy tank
sudo rmmod brd

I do not see any errors when I run with --direct=0.
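
For comparison, this is the same fio invocation from the script with Direct IO disabled (only --direct changes; 32k and 32 are substituted from the script's recordsize and jobs variables):

fio --name=fiotest --filename=/tank/test1 --size=50Gb --rw=write --bs=32k --direct=0 --numjobs=32 --ioengine=libaio --iodepth=128 --group_reporting --runtime=20 --startdelay=1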

Include any warning/errors/backtraces from the system logs

  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:00:01 with 9952 errors on Mon Oct 28 14:40:31 2024
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      ram0      ONLINE       0     0 19.5K

errors: 9935 data errors, use '-v' for a list

tonyhutter commented 4 weeks ago

I don't see these same errors when I back the pool with /tmp/file rather than a brd device, so the problem could be brd-specific.
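
A minimal sketch of that file-backed comparison, assuming a sparse file under /tmp sized roughly like the ramdisk (path and size are illustrative):

# create a sparse backing file and build the same single-vdev pool on it
truncate -s 95G /tmp/file
sudo ./zpool create -o ashift=12 tank /tmp/file
sudo ./zfs set compression=off tank
sudo ./zfs set recordsize=32k tank
sudo ./zfs set atime=off tank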

robn commented 4 weeks ago

ram0 will be using vdev_disk, while a file-backed pool uses vdev_file. Try putting a loop device over the file; that'll make vdev_disk drive it. If it breaks there, that'll help guide the finger of suspicion... :grimacing:
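
A rough sketch of that loop-device variant, assuming the same sparse backing file (losetup --find --show prints the loop device it allocates):

# attach a loop device over the file so the pool's IO goes through vdev_disk
truncate -s 95G /tmp/file
loopdev=$(sudo losetup --find --show /tmp/file)
sudo ./zpool create -o ashift=12 tank "$loopdev"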

tonyhutter commented 4 weeks ago

I just tried a file-backed loopback device and saw no errors.

robn commented 4 weeks ago

Well, I hate this :sweat_smile:

If you haven't worked it out in the next few hours, I'll try to reproduce it and bang on it a bit this afternoon. Hopefully it is only brd; afaik only like three people in the world know about it. I'd rather be sure of that, given the last couple of weeks.

Also stop finding cool weird bugs please!

mcmilk commented 3 weeks ago

When playing with different backing devices and filesystems for the "ZTS with QEMU PR", I ran into this as well. With ram0 and with zram0 I had strange behaviour sometimes. I went back to re-using the loop files... and everything was fine again.
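
For anyone wanting to retry against zram rather than brd, a hedged sketch of the setup (one device and a 4G disksize are arbitrary choices; disksize is set via the kernel's sysfs interface):

# create one zram device and build the pool on it
sudo modprobe zram num_devices=1
echo 4G | sudo tee /sys/block/zram0/disksize
sudo ./zpool create -o ashift=12 tank /dev/zram0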