Seanistolen opened this issue 4 years ago (status: Open)
@Seanistolen:
Can you cut the job file down to the bare minimum that still reproduces the problem (i.e. remove every line that doesn't have an impact, make every number as small as possible, use a single file etc.) and post the reduced job file here?
Finally I found the cause: it's do_verify=0. Without this parameter, the job finishes as expected. The minimal workload is:

[global]
filename=/dev/nvme0n1
ioengine=libaio

[precondition]
size=100%
loops=2
rw=write
iodepth=8
bs=128k
stonewall
So enabling do_verify=0 means that it doesn't do two loops of I/O? Just to check: /dev/nvme0n1 is definitely a device and not a "regular" file, right? I'd also be surprised if stonewall were necessary in the above. Just to double-check, is there anything related to fio mentioned in dmesg?
Sorry, stonewall is unnecessary. I was not sure whether loops should be combined with stonewall, so I removed stonewall and tried again; the result is:
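For reference, the further-reduced job with stonewall dropped would look like this (same values as the job file above; do_verify=0 is shown explicitly here since the thread identifies it as the trigger, though the originally posted snippet did not list it):

```ini
[global]
filename=/dev/nvme0n1
ioengine=libaio

[precondition]
size=100%
loops=2
rw=write
iodepth=8
bs=128k
do_verify=0
```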
I can reproduce this with the following:
% dd if=/dev/zero of=loop-doverify bs=16k count=1; fio --filename=loop-doverify --size=100% --loops=2 --bs=4k --stonewall --name=normal --gtod_reduce=1 --name=doverify0 --do_verify=0
1+0 records in
1+0 records out
16384 bytes transferred in 0.000079 secs (207611712 bytes/sec)
normal: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
doverify0: (g=1): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.23-31-gf0d5e-dirty
Starting 2 processes
normal: (groupid=0, jobs=1): err= 0: pid=11179: Sat Oct 24 09:27:01 2020
read: IOPS=8000, BW=31.2MiB/s (32.8MB/s)(32.0KiB/1msec)
cpu : usr=0.00%, sys=0.00%, ctx=0, majf=0, minf=28
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=8,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
doverify0: (groupid=1, jobs=1): err= 0: pid=11180: Sat Oct 24 09:27:01 2020
read: IOPS=4000, BW=15.6MiB/s (16.4MB/s)(16.0KiB/1msec)
clat (nsec): min=3122, max=46145, avg=14233.50, stdev=21277.67
lat (nsec): min=3254, max=47278, avg=14616.75, stdev=21777.45
clat percentiles (nsec):
| 1.00th=[ 3120], 5.00th=[ 3120], 10.00th=[ 3120], 20.00th=[ 3120],
| 30.00th=[ 3632], 40.00th=[ 3632], 50.00th=[ 3632], 60.00th=[ 4048],
| 70.00th=[ 4048], 80.00th=[46336], 90.00th=[46336], 95.00th=[46336],
| 99.00th=[46336], 99.50th=[46336], 99.90th=[46336], 99.95th=[46336],
| 99.99th=[46336]
lat (usec) : 4=50.00%, 10=25.00%, 50=25.00%
cpu : usr=0.00%, sys=0.00%, ctx=0, majf=0, minf=34
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=4,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=31.2MiB/s (32.8MB/s), 31.2MiB/s-31.2MiB/s (32.8MB/s-32.8MB/s), io=32.0KiB (32.8kB), run=1-1msec
Run status group 1 (all jobs):
READ: bw=15.6MiB/s (16.4MB/s), 15.6MiB/s-15.6MiB/s (16.4MB/s-16.4MB/s), io=16.0KiB (16.4kB), run=1-1msec
First job does twice as much I/O (e.g. see issued rwts) as the second job.
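One way to check this programmatically is to run fio with --output-format=json and compare each job's read io_bytes against size × loops. A minimal sketch (the JSON snippet below is hand-made to mirror the run above, not captured fio output):

```python
import json

# Hand-made JSON mirroring the run above; fio --output-format=json
# reports per-job byte counts in jobs[].read.io_bytes.
fio_json = """
{
  "jobs": [
    {"jobname": "normal",    "read": {"io_bytes": 32768}},
    {"jobname": "doverify0", "read": {"io_bytes": 16384}}
  ]
}
"""

FILE_SIZE = 16384   # the 16k test file created with dd
LOOPS = 2
expected = FILE_SIZE * LOOPS

for job in json.loads(fio_json)["jobs"]:
    done = job["read"]["io_bytes"]
    status = "OK" if done == expected else "SHORT"
    print(f'{job["jobname"]}: {done} of {expected} bytes ({status})')
```

With the numbers above this flags the do_verify=0 job as having done only half the expected I/O (16384 of 32768 bytes), matching issued rwts total=4 vs. total=8.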
@Seanistolen (original report):
I'm trying to do a sequential write on my drive 2 times with the parameter "loops=", but the job exits with no error at the halfway point.
here is my script:
But when I set size=10G or size=99%, the loops worked fine.
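A job of the shape described would look roughly like the following (illustrative only; the filename and values are placeholders, not the reporter's actual script):

```ini
[global]
filename=/dev/nvme0n1   ; placeholder device
ioengine=libaio

[seq-write]
rw=write
bs=128k
iodepth=8
size=100%       ; with size=10G or size=99% both loops complete
loops=2
do_verify=0     ; the trigger identified in this thread
```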