@ndokos Other things that would help: is write_hist_log/log_hist_msec necessary to make the issue happen? How about when numjobs=1? How about when rw=write?

I reduced the job file to the following:
[global]
bs=4k
runtime=60
ioengine=libaio
iodepth=32
direct=0
sync=0
time_based=1
clocksource=clock_gettime
ramp_time=5
write_bw_log=fio
write_iops_log=fio
write_lat_log=fio
log_avg_msec=1000
[job-/home/pbench/tmp/foo]
filename=/home/pbench/tmp/foo
rw=rw
size=4096M
and I tried with fio 3.19. I still see the same problem; in fact, in one case I saw nine instances of the same timestamp:
...
19141, 159917, 0, 0, 0
19141, 1989, 1, 0, 0
19141, 69412, 0, 0, 0
19141, 14836, 1, 0, 0
19141, 3073, 1, 0, 0
19141, 2857, 1, 0, 0
19141, 3404, 0, 0, 0
19141, 1845, 0, 0, 0
19141, 2950, 1, 0, 0
...
I'll try git master but it'll take me a couple of days to get there. If you think I should simplify the job file further, let me know.
In the meantime, I'd be interested to see if you can reproduce this (on any system).
Thanks!
I do NOT see it when I set rw=write, except in one instance at the last point:
60000, 198361, 1, 0, 0
60000, 91099, 1, 0, 0
This was in fio_lat.1.log.
Can direct=0 and sync=0 be removed, as I believe those are at their defaults? What about ramp_time=5, write_bw_log=fio, and write_iops_log=fio? Even ioengine=libaio (although you'll get lower speeds)?
Following your suggestions, I cut it down to this:
bs=4k
runtime=60
iodepth=32
time_based=1
clocksource=clock_gettime
write_lat_log=fio
log_avg_msec=1000
[job-/home/pbench/tmp/foo]
filename=/home/pbench/tmp/foo
rw=rw
size=4096M
numjobs=4
I still see things like this:
...
53031, 6349, 0, 0, 0
53031, 8307, 1, 0, 0
53031, 2293, 0, 0, 0
53031, 1452, 1, 0, 0
54032, 8757, 0, 0, 0
54032, 5675, 1, 0, 0
54032, 3844, 1, 0, 0
55000, 6546, 0, 0, 0
55000, 7011, 1, 0, 0
55000, 50660, 0, 0, 0
55000, 2289, 1, 0, 0
56008, 19375, 0, 0, 0
56008, 11927, 1, 0, 0
56075, 67185554, 0, 0, 0
57019, 12128, 0, 0, 0
57019, 12350, 1, 0, 0
57019, 2818, 0, 0, 0
58088, 19140, 0, 0, 0
58088, 24556, 1, 0, 0
58088, 3202, 0, 0, 0
59011, 3997, 0, 0, 0
59011, 11049, 1, 0, 0
59011, 2937, 1, 0, 0
60010, 5572, 0, 0, 0
60010, 6541, 1, 0, 0
Reducing the fio.job further to this:
[global]
bs=4k
runtime=60
iodepth=32
time_based=1
clocksource=clock_gettime
write_lat_log=fio
log_avg_msec=1000
[job-/home/pbench/tmp/foo]
filename=/home/pbench/tmp/foo
rw=rw
I still see the problem:
...
18102, 13756, 0, 0, 0
18102, 1589, 1, 0, 0
18102, 10106, 1, 0, 0
21586, 20105, 0, 0, 0
21586, 1624, 1, 0, 0
21586, 32076, 0, 0, 0
21586, 6926, 1, 0, 0
21586, 2088, 1, 0, 0
21586, 3116, 1, 0, 0
21586, 1632, 0, 0, 0
22000, 3871, 0, 0, 0
22000, 1618, 1, 0, 0
22000, 1338, 0, 0, 0
22000, 1491, 1, 0, 0
23002, 3095, 0, 0, 0
23002, 1573, 1, 0, 0
23002, 5464, 1, 0, 0
24000, 14601, 0, 0, 0
24000, 1603, 1, 0, 0
24000, 1284, 0, 0, 0
24000, 1584, 1, 0, 0
25000, 12906, 0, 0, 0
25000, 1613, 1, 0, 0
25000, 1391, 0, 0, 0
...
although this time I don't see the wildly different latencies reported in most (all?) of the other cases above (well, there are still factors of 20, but not factors of 200).
The above is still with fio-3.19; I will try the latest bits from master, probably tomorrow.
I did a git bisect and believe this bug was introduced in fio-3.18 by commit b2a432bfbb6d10e93f2c8f8092d6db672d45af0d
Author: Phillip Chen <phillip.a.chen@seagate.com>
Date:   Wed Jan 22 16:28:38 2020 -0700

    Per-command priority: Priority logging and libaio/io_uring cmdprio_percentage

    Add cmdprio_percentage option to libaio and io_uring engines to set ioprio
    on a per-command basis. Add tracking of high priority commands to be
    displayed separately in human readable and JSON outputs.
With 7a4e480d8875a31bee6f2a39c87c25e276cec616 (the parent of b2a432bfbb6d10e93f2c8f8092d6db672d45af0d), I can run the below command and obtain the below log
$ ./fio-canonical/fio --name=test --time_based --runtime=10s --write_lat_log=fio-06-7a4e48 --log_avg_msec=1000 --size=1G --rw=rw
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.17-92-g7a4e4
Starting 1 process
...
$ cat fio-06-7a4e48_clat.1.log
1001, 5625, 0, 0
1001, 2499, 1, 0
2000, 2530, 1, 0
2000, 4498, 0, 0
3000, 2097, 1, 0
3000, 4247, 0, 0
4000, 5989, 0, 0
4000, 1897, 1, 0
5000, 5677, 0, 0
5000, 1891, 1, 0
6000, 6037, 0, 0
6000, 1896, 1, 0
7186, 6343, 0, 0
7186, 1907, 1, 0
8000, 6309, 0, 0
8000, 1921, 1, 0
9000, 5951, 0, 0
9000, 2114, 1, 0
10000, 6604, 0, 0
10000, 1984, 1, 0
But with b2a432bfbb6d10e93f2c8f8092d6db672d45af0d I run the same command and obtain
$ ./fio-canonical/fio --name=test --time_based --runtime=10s --write_lat_log=fio-07-b2a432 --log_avg_msec=1000 --size=1G --rw=rw
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.17-93-gb2a4
Starting 1 process
...
$ cat fio-07-b2a432_clat.1.log
1000, 5851, 0, 0, 0
1000, 2551, 1, 0, 0
1000, 5028, 1, 0, 0
2000, 4175, 0, 0, 0
2000, 3214, 1, 0, 0
2000, 60619, 0, 0, 0
3003, 4219, 0, 0, 0
3003, 2644, 1, 0, 0
3003, 63036, 1, 0, 0
4000, 15857, 0, 0, 0
4000, 1923, 1, 0, 0
4000, 5135, 1, 0, 0
5000, 6188, 0, 0, 0
5000, 2123, 1, 0, 0
5000, 780, 0, 0, 0
5000, 1446, 1, 0, 0
6000, 5779, 0, 0, 0
6000, 2014, 1, 0, 0
6000, 6620, 1, 0, 0
7000, 5838, 0, 0, 0
7000, 2017, 1, 0, 0
7000, 780, 0, 0, 0
7000, 1455, 1, 0, 0
8000, 5964, 0, 0, 0
8000, 1904, 1, 0, 0
8000, 905, 0, 0, 0
8000, 1469, 1, 0, 0
9000, 6211, 0, 0, 0
9000, 1879, 1, 0, 0
9000, 1023, 0, 0, 0
10000, 7006, 0, 0, 0
10000, 2171, 1, 0, 0
The only significant change to the logging code in b2a432bfbb6d10e93f2c8f8092d6db672d45af0d is now at https://github.com/axboe/fio/blob/983319d02626347d178b34320ac5835ae7ecad02/stat.c#L2752
@@ -2700,7 +2847,7 @@ static unsigned long add_log_sample(struct thread_data *td,
return diff;
}
- __add_stat_to_log(iolog, ddir, elapsed, td->o.log_max != 0);
+ _add_stat_to_log(iolog, elapsed, td->o.log_max != 0, priority_bit);
iolog->avg_last[ddir] = elapsed - (this_window - iolog->avg_msec);
return iolog->avg_msec;
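My reading of that change (a paraphrase, not the literal fio source): __add_stat_to_log() flushes the averaging window of a single data direction, while the single-underscore _add_stat_to_log() flushes every direction. So when one direction's window elapses, the other direction's partially filled window is also written out at the same timestamp, and it gets written out again moments later when its own window elapses, because only the triggering direction's window start is advanced. A self-contained toy model of that behaviour (my own code with made-up latencies, not fio's):

/* dup_ts.c - toy model of flushing EVERY direction when ONE direction's
 * averaging window elapses. Build: cc -o dup_ts dup_ts.c */
#include <stdio.h>

enum { DDIR_READ = 0, DDIR_WRITE, DDIR_CNT };

struct window {
        unsigned long start, sum, samples;
};

static const unsigned long avg_msec = 1000;
static struct window win[DDIR_CNT];

/* Stand-in for __add_stat_to_log(): one averaged line for one direction. */
static void flush_one(int ddir, unsigned long now)
{
        struct window *w = &win[ddir];

        if (!w->samples)
                return;
        printf("%lu, %lu, %d, 0, 0\n", now, w->sum / w->samples, ddir);
        w->sum = w->samples = 0;
}

/* Stand-in for _add_stat_to_log(): flush EVERY direction. */
static void flush_all(unsigned long now)
{
        for (int ddir = 0; ddir < DDIR_CNT; ddir++)
                flush_one(ddir, now);
}

static void add_sample(int ddir, unsigned long now, unsigned long lat)
{
        struct window *w = &win[ddir];

        w->sum += lat;
        w->samples++;
        if (now - w->start < avg_msec)
                return;
        flush_all(now);         /* flushes the other direction too ...          */
        w->start = now;         /* ... but only this window's start is advanced */
}

int main(void)
{
        add_sample(DDIR_READ, 900, 2000);       /* accumulate a read and a write */
        add_sample(DDIR_WRITE, 950, 60000);
        add_sample(DDIR_READ, 1000, 2100);      /* read window elapses: read AND write logged at 1000 */
        add_sample(DDIR_WRITE, 1005, 5000);     /* write's own window (start still 0) elapses: second write line */
        return 0;
}

The output has two write lines only a few milliseconds apart, one a genuine one-second average and one covering a single IO, which is the same shape as the duplicate entries and wildly different averages in the logs above.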
Reverting _add_stat_to_log back to __add_stat_to_log (one underscore back to two underscores) fixes the problem for me, but I would welcome more testing:
$ git diff
diff --git a/stat.c b/stat.c
index d8c01d14..efa811d2 100644
--- a/stat.c
+++ b/stat.c
@@ -2749,7 +2749,7 @@ static unsigned long add_log_sample(struct thread_data *td,
return diff;
}
- _add_stat_to_log(iolog, elapsed, td->o.log_max != 0, priority_bit);
+ __add_stat_to_log(iolog, ddir, elapsed, td->o.log_max != 0, priority_bit);
iolog->avg_last[ddir] = elapsed - (this_window - iolog->avg_msec);
return iolog->avg_msec;
$ ./fio-canonical/fio-tipplus --name=test --time_based --runtime=60s --write_lat_log=fio-tipplus --log_avg_msec=1000 --size=1G --rw=rw
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.19-14-g9833-dirty
Starting 1 process
...
$ wc fio-tipplus_clat.1.log
120 600 2506 fio-tipplus_clat.1.log
The resulting log file has 120 lines, which is what we expect for a 60s run: one line per second for reads and one for writes.
Thanks! I will test with this change and report back here.
@vincentkfu: this is a great improvement - it eliminates all duplicate timestamps except for the very last. E.g. in a short 10s run, I see this:
1000, 388968, 1, 0, 0
1000, 369207, 0, 0, 0
2000, 435953, 0, 0, 0
2000, 452103, 1, 0, 0
3362, 526478, 1, 0, 0
3362, 579011, 0, 0, 0
4000, 8460336, 1, 0, 0
4000, 7920513, 0, 0, 0
5000, 717717, 1, 0, 0
5000, 716121, 0, 0, 0
6000, 187822, 1, 0, 0
6000, 187627, 0, 0, 0
7002, 380489, 1, 0, 0
7002, 395400, 0, 0, 0
8489, 335470, 0, 0, 0
8489, 335221, 1, 0, 0
9002, 27359973, 0, 0, 0
9002, 24107737, 1, 0, 0
10004, 2954061, 1, 0, 0
10004, 3010561, 0, 0, 0
10004, 22078571, 0, 0, 0
10004, 21765668, 1, 0, 0
and this is happening in all clat and lat files, but not the slat files.
@sitsofe: should I open an issue for that, reopen this one, or not bother?
I don't see extra entries for the final timestamp in my tests. All the logs have 20 lines in my 10-second runs.
$ ./fio-canonical/fio --name=test --ioengine=null --size=1G --rw=randrw --runtime=10s --time_based --write_lat_log=test --log_avg_msec=1000
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=null, iodepth=1
fio-3.19-17-g6463
Starting 1 process
...
$ wc -l test_clat.1.log
20 test_clat.1.log
$ wc -l test_lat.1.log
20 test_lat.1.log
$ cat test_clat.1.log
1000, 17, 0, 0, 0
1000, 16, 1, 0, 0
2000, 17, 0, 0, 0
2000, 16, 1, 0, 0
3000, 18, 0, 0, 0
3000, 18, 1, 0, 0
4000, 17, 1, 0, 0
4000, 17, 0, 0, 0
5000, 17, 0, 0, 0
5000, 17, 1, 0, 0
6000, 17, 1, 0, 0
6000, 16, 0, 0, 0
7000, 17, 0, 0, 0
7000, 17, 1, 0, 0
8000, 16, 0, 0, 0
8000, 17, 1, 0, 0
9000, 16, 0, 0, 0
9000, 16, 1, 0, 0
10000, 17, 1, 0, 0
10000, 16, 0, 0, 0
This is similar to issue #739, which was closed because of lack of response, so I started there, but it is different enough that @sitsofe suggested I open a new one. As a result of that conversation, I redid the run with a different clocksource (clock_gettime); the results from that run are presented in this issue. See issue #739 for similar results with clocksource=gettimeofday.
I do a simple run with:

The version (/usr/local/bin/fio --version) is fio-3.18 and the job file looks like this:

IIUC, the log_avg_msec setting will produce as output the average latency over a 1-sec window, with a timestamp just after the end of the window. And since I'm averaging over one-second windows, I expect the timestamps to increase by roughly 1 sec between entries.

But that's not what I see: not only do I get entries that are much closer together than 1 sec (see timestamps 38650 and 38659 below), but I also see multiple entries of the same type with identical timestamps. Here is, e.g., a portion of fio_lat.2.log:

In some cases (e.g. for the timestamp 44038) I get two entries, one for read and one for write: that's what I expect. But in other cases (e.g. for timestamp 42010) I get three entries: there are two entries for type 1 (with the avg latency off by 2 orders of magnitude between them). In others (e.g. for timestamp 41015), I get four entries, two for type 0 and two for type 1, also with very different average latency values.
I should mention that this happens with all latency logs (lat, clat, slat). This particular run was done on a RHEL8 VM, but I have seen similar results on RHEL7 VMs and a Fedora31 laptop (with an SSD).
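For reference, the per-direction bookkeeping that the expectation above describes, one averaged read line and one averaged write line roughly every log_avg_msec, can be sketched like this (again my own toy code, not fio's; each direction is flushed only when its own window elapses):

/* expected.c - toy model of per-direction averaging windows.
 * Build: cc -o expected expected.c */
#include <stdio.h>

enum { DDIR_READ = 0, DDIR_WRITE, DDIR_CNT };

struct window {
        unsigned long start, sum, samples;
};

static const unsigned long avg_msec = 1000;
static struct window win[DDIR_CNT];

/* Record one completed IO; emit an averaged line only when THIS
 * direction's window has elapsed. */
static void add_sample(int ddir, unsigned long now, unsigned long lat)
{
        struct window *w = &win[ddir];

        w->sum += lat;
        w->samples++;
        if (now - w->start < avg_msec)
                return;
        printf("%lu, %lu, %d, 0, 0\n", now, w->sum / w->samples, ddir);
        w->start = now;
        w->sum = w->samples = 0;
}

int main(void)
{
        /* Fake completions: reads every 40 ms, writes every 70 ms. */
        for (unsigned long t = 10; t <= 3000; t += 10) {
                if (t % 40 == 0)
                        add_sample(DDIR_READ, t, 2000 + t % 7);
                if (t % 70 == 0)
                        add_sample(DDIR_WRITE, t, 60000 + t % 11);
        }
        return 0;
}

Run that and the output is one read line and one write line per (roughly) 1000 ms, with strictly increasing timestamps per direction, which matches the 120-line and 20-line logs seen above once the __add_stat_to_log() call is restored.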