Open Rogach opened 3 months ago
Do you know a fio version where this was working? I tried 3.36 and see the same issue.
root@localhost:~/fio-dev/1746/fio-canonical# ./fio-3.36 --name=test --ioengine=null --filesize=1T --write_bw_log=test --log_avg_msec=1000 --time_based --runtime=3s --log_compression=128M
test: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=null, iodepth=1
fio-3.36
Starting 1 process
Jobs: 1 (f=1): [R(1)][-.-%][r=14.8GiB/s][r=3868k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=16213: Mon Apr 8 14:13:48 2024
read: IOPS=3857k, BW=14.7GiB/s (15.8GB/s)(44.1GiB/3000msec)
clat (nsec): min=30, max=16490, avg=33.72, stdev=39.94
lat (nsec): min=60, max=16640, avg=64.57, stdev=55.76
clat percentiles (nsec):
| 1.00th=[ 32], 5.00th=[ 32], 10.00th=[ 32], 20.00th=[ 33],
| 30.00th=[ 33], 40.00th=[ 33], 50.00th=[ 33], 60.00th=[ 34],
| 70.00th=[ 34], 80.00th=[ 34], 90.00th=[ 35], 95.00th=[ 35],
| 99.00th=[ 36], 99.50th=[ 37], 99.90th=[ 125], 99.95th=[ 131],
| 99.99th=[ 137]
bw ( MiB/s): min= 242, max=133333, per=100.00%, avg=119630.13, stdev=4764.08, samples=11570360
iops : min=3798766, max=3876458, avg=3859501.60, stdev=34036.42, samples=5
lat (nsec) : 50=99.89%, 100=0.01%, 250=0.10%, 500=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%
cpu : usr=99.83%, sys=0.13%, ctx=3, majf=0, minf=112
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=11570360,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=14.7GiB/s (15.8GB/s), 14.7GiB/s-14.7GiB/s (15.8GB/s-15.8GB/s), io=44.1GiB (47.4GB), run=3000-3000msec
root@localhost:~/fio-dev/1746/fio-canonical# head test_bw.1.log
999, 122248957, 0, 0, 0
999, 35008547, 0, 0, 0
999, 117028571, 0, 0, 0
999, 117028571, 0, 0, 0
999, 124121212, 0, 0, 0
999, 128000000, 0, 0, 0
999, 124121212, 0, 0, 0
999, 117028571, 0, 0, 0
999, 117028571, 0, 0, 0
999, 124121212, 0, 0, 0
No, I only tested it on the latest version. Do you want me to bisect it?
I think the issue is in iolog.h
:
static inline bool per_unit_log(struct io_log *log)
{
return log && (!log->avg_msec || log->log_gz || log->log_gz_store);
}
Try reverting 836365452f6525a7d670205b823f7982de48f844 and let me know if it works. The only problem is that I'm not sure what issues that patch fixed, so I don't know what harm would be caused by reverting it.
I reverted that commit, and got a segfault :) Seems that patch did have some purpose.
gdb says that the segfault occurs at:
#0 all_sw_idle (wq=0x7f12fbcb7b98) at workqueue.c:76
76 if (!(sw->flags & SW_F_IDLE))
[Current thread is 1 (Thread 0x7f12fbc896c0 (LWP 4449))]
(gdb) list
71 int i;
72
73 for (i = 0; i < wq->max_workers; i++) {
74 struct submit_worker *sw = &wq->workers[i];
75
76 if (!(sw->flags & SW_F_IDLE))
77 return false;
78 }
79
80 return true;
Please acknowledge the following before creating a ticket
Description of the bug: If
log_compression
is specified together withlog_avg_msec
, then bw and iops logs still contain a separate entry for each I/O done (instead of averages as expected).Environment: Arch Linux, 6.7.5-arch1-1
fio version: fio-3.37
Reproduction steps
Run with
fio --filename /dev/sdX test.fio
.Observe lots of non-averaged entries in the bw and iops logs: