axboe / fio

Flexible I/O Tester
GNU General Public License v2.0
5.31k stars 1.26k forks source link

When using fio to perform read and verify data with the time_based option enabled, fio exits after the first verification is completed. #1664

Open Michael-Duan1 opened 1 year ago

Michael-Duan1 commented 1 year ago

Please acknowledge the following before creating a ticket

Description of the bug:

When using fio to perform read and verify data with the time_based option enabled, fio exits after the first verification is completed. Because “ddir_rw_sum(td->bytes_done)“ all is Zero。 image

Environment: Linux fio version: fio-3.36 Reproduction steps

sudo ./fio --ioengine=libaio --bs=128k --rw=write --verify=crc32c --filename=/dev/sda --name=test --size=1g sudo ./fio --ioengine=libaio --bs=128k --time_based --rw=read --verify=crc32c --filename=/dev/sda --runtime=10 --test --size=1g

vincentkfu commented 1 year ago

This seems to only occur with async ioengines. I get the full 10s with --ioengine=psync.

gautammenghani commented 3 weeks ago

This issue is not reproducible with latest master, but is reproducible with 3.36

  1. Latest master with read command from original report
    
    $ ./fio --ioengine=libaio --bs=128k --time_based --rw=read --verify=crc32c --filename=./test --runtime=10 --name=testing --size=1g
    testing: (g=0): rw=read, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=libaio, iodepth=1
    fio-3.38-13-gf241
    Starting 1 process
    Jobs: 1 (f=1): [V(1)][100.0%][r=175MiB/s][r=1398 IOPS][eta 00m:00s]
    testing: (groupid=0, jobs=1): err= 0: pid=45842: Mon Nov 11 19:26:28 2024
    read: IOPS=1281, BW=160MiB/s (168MB/s)(1602MiB/10001msec)
    slat (usec): min=9, max=54117, avg=746.02, stdev=1429.85
    clat (nsec): min=280, max=16782, avg=846.54, stdev=477.16
     lat (usec): min=9, max=54119, avg=746.87, stdev=1430.02
    clat percentiles (nsec):
     |  1.00th=[  310],  5.00th=[  462], 10.00th=[  540], 20.00th=[  572],
     | 30.00th=[  588], 40.00th=[  612], 50.00th=[  700], 60.00th=[  932],
     | 70.00th=[  996], 80.00th=[ 1048], 90.00th=[ 1128], 95.00th=[ 1256],
     | 99.00th=[ 3120], 99.50th=[ 3152], 99.90th=[ 4896], 99.95th=[ 6496],
     | 99.99th=[10688]
    bw (  KiB/s): min=88576, max=179712, per=100.00%, avg=164057.60, stdev=23417.95, samples=20
    iops        : min=  692, max= 1404, avg=1281.70, stdev=182.95, samples=20
    lat (nsec)   : 500=8.22%, 750=42.97%, 1000=19.22%
    lat (usec)   : 2=27.08%, 4=2.39%, 10=0.11%, 20=0.02%
    cpu          : usr=3.31%, sys=7.73%, ctx=6563, majf=0, minf=40
    IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=12817,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs): READ: bw=160MiB/s (168MB/s), 160MiB/s-160MiB/s (168MB/s-168MB/s), io=1602MiB (1680MB), run=10001-10001msec

Disk stats (read/write): sdb: ios=6328/63, sectors=3243520/1120, merge=7/53, ticks=18426/966, in_queue=20021, util=88.13%


2. 3.36 with same command - fio exits prematurely as per original report

$ ./fio --ioengine=libaio --bs=128k --time_based --rw=read --verify=crc32c --filename=./test --runtime=10 --name=testing --size=1g testing: (g=0): rw=read, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=libaio, iodepth=1 fio-3.36 Starting 1 process Jobs: 1 (f=1): [V(1)][60.0%][r=134MiB/s][r=1068 IOPS][eta 00m:04s] testing: (groupid=0, jobs=1): err= 0: pid=49239: Mon Nov 11 19:30:25 2024 read: IOPS=1311, BW=164MiB/s (172MB/s)(1024MiB/6246msec) slat (usec): min=10, max=168011, avg=730.01, stdev=2107.49 clat (nsec): min=411, max=34876, avg=852.18, stdev=593.08 lat (usec): min=11, max=168013, avg=730.86, stdev=2107.60 clat percentiles (nsec): | 1.00th=[ 442], 5.00th=[ 462], 10.00th=[ 564], 20.00th=[ 580], | 30.00th=[ 588], 40.00th=[ 612], 50.00th=[ 788], 60.00th=[ 996], | 70.00th=[ 1048], 80.00th=[ 1096], 90.00th=[ 1160], 95.00th=[ 1240], | 99.00th=[ 1976], 99.50th=[ 2992], 99.90th=[ 3184], 99.95th=[ 5024], | 99.99th=[35072] bw ( KiB/s): min=81408, max=185856, per=99.75%, avg=167466.67, stdev=28744.60, samples=12 iops : min= 636, max= 1452, avg=1308.33, stdev=224.57, samples=12 lat (nsec) : 500=7.26%, 750=42.24%, 1000=10.96% lat (usec) : 2=38.57%, 4=0.90%, 10=0.02%, 50=0.04% cpu : usr=3.49%, sys=7.56%, ctx=4176, majf=0, minf=42 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwts: total=8192,0,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs): READ: bw=164MiB/s (172MB/s), 164MiB/s-164MiB/s (172MB/s-172MB/s), io=1024MiB (1074MB), run=6246-6246msec

Disk stats (read/write): sdb: ios=4025/71, sectors=2061312/1120, merge=1/39, ticks=11542/502, in_queue=12349, util=88.42%