axboe / fio

Flexible I/O Tester
GNU General Public License v2.0

zbd: fix assertion issue during reset zone when using percentage offset #1728

Closed LinXpy closed 1 week ago

LinXpy commented 4 months ago

Please confirm that your commit message(s) follow these guidelines:

zbd: fix assertion issue during reset zone when using percentage offset

Add a new member to struct fio_file that records the start offset whenever it is updated. In zbd tests in particular, the file offset may be adjusted to the zone boundary and block-size (bs) alignment when a percentage offset value is used, so a later reset needs the adjusted value.

Signed-off-by: LinXpy paynefd1222@163.com

vincentkfu commented 4 months ago

@kawasaki Can you take a look?

kawasaki commented 4 months ago

Thanks for the report. I tried the following command line, and observed an assertion failure.

 sudo ./fio --name=w --filename=/dev/nvme0n1 --zonemode=zbd --direct=1 --rw=write --size=2z --offset=1% --loop=2                                                                         
w: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.36-89-g6ed0
Starting 1 process
/dev/nvme0n1: rounded up offset from 10739712 to 12582912
/dev/nvme0n1: rounded down io_size from 6545408 to 4194304
fio: zbd.c:109: zone_lock: Assertion `f->min_zone <= nz && nz < f->max_zone' failed.
fio: pid=102122, got signal=6

w: (groupid=0, jobs=1): err= 0: pid=102122: Tue Feb 27 19:15:58 2024
  lat (usec)   : 250=71.97%, 500=20.70%, 750=7.23%
  lat (msec)   : 2=0.10%
  cpu          : usr=0.00%, sys=0.00%, ctx=0, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1024,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=0B/s (0B/s), 0B/s-0B/s (0B/s-0B/s), io=0B (0B), run=248-248msec

Disk stats (read/write):
  nvme0n1: ios=125/1024, sectors=4408/8192, merge=0/0, ticks=13/187, in_queue=200, util=46.13%
fio: file hash not empty on exit

I guess this is the issue that @LinXpy faced. If not, please let me know. In general, this assertion failure is observed when the write workload fulfills two conditions:

  1. --offset is not aligned to zone boundaries
  2. the workload is repeated with --loops or --time_based

I'm not sure such workloads are meaningful: they request repeated writes at positions other than the write pointers, which zoned devices do not expect. @LinXpy, if you have any background for running such a workload, please share it.

If users really want to run such workloads, I think the fix by @LinXpy is essentially correct. It still needs brushing up to use a better variable name and to reduce the changes outside zbd.c. Of note, the change adds a branch in clear_io_state() and 64 bits to struct fio_file. I hope this additional overhead is acceptable, but before discussing that, I would like to consider whether such a workload (repeated writes to an offset unaligned to zone boundaries) is really useful.

LinXpy commented 4 months ago

Hi @kawasaki, the failure scenario you saw seems a little different from mine. For the background and the reproducing fio command, you can refer to the ticket: https://github.com/axboe/fio/issues/1725.

In my test I used the command below, and I can reproduce the failure 100% of the time:

fio -filename=/dev/nvme0n1 -ioengine=libaio -direct=1 -iodepth=32 -thread -name=fio_name -numjobs=1 -group_reporting -verify_backlog=16777216 -cpus_allowed=1 -rw=write -zonemode=zbd -significant_figures=10 -bs=128K -do_verify=1 -verify_pattern=0x5A -offset=53% -max_open_zones=16 -size=20%

In my test, the percentage value for offset is also a necessary factor to reproduce the failure, but my test ran only a single loop and was not time based; it just fills some data area.

kawasaki commented 4 months ago

@LinXpy Thanks for sharing the workload details. I was able to recreate the assertion failure. I noticed that the workload has verify options. Actually, the verify feature has a similar effect to the loop and time_based options: see reset_io_counters(). When I disabled the verify option, the assertion failure disappeared. So this looks like a corner case to me: the assertion failure happens when zonemode=zbd, a percentage offset, and a verify option are all specified. As I noted, a percentage offset is not meaningful for zoned block devices, so I suggest not using the three options together, rather than applying your fix, which adds overhead.

One lesson I learned is that the assertion message is not informative. I'm thinking of replacing the assertion failure with a more meaningful error message that guides the user to check the offset alignment.

LinXpy commented 3 months ago

@kawasaki, that makes sense; I agree. A percentage offset/size is not very meaningful for a ZBD device, but there is a possibility that someone will use one anyway, so a good, meaningful error message guiding the user to use an explicit value instead of a percentage to avoid the assertion failure would be very helpful!