axboe / fio

Flexible I/O Tester
GNU General Public License v2.0
5.26k stars 1.26k forks source link

ramp_time restarts job? #1320

Open avikivity opened 2 years ago

avikivity commented 2 years ago

Please acknowledge the following before creating a ticket

Description of the bug:

I am using fio to measure read latency for various combinations of read and write workloads (see diskplorer. To to that, I generate job files with rate-limited read and write jobs. Since I don't know the latency, I don't know the read concurrency in advance, and so I set it to 1000 per thread (and I run on thread per vcpu).

The problem is that this generates a large load when the job starts, as it pushes tons of requests to disk. That's fine, fio is learning the disk latency and will adjust concurrency when it does. I set ramp_time=5 to remove this learning period from measurements.

The problem is that I see the same spike after ramp_time elapsed, see vmstat output below. It starts with a spike in bi, settles for 5 seconds (ramp_time), then spikes and settles again. I'd really like to avoid the spike when ramp_time expires!

Note also the spike in procs/r, probably workqueues from the too-high concurrency.

Sample job file: 0037.fio.

I didn't see any evidence in the code that the job is restarted on the transition from TD_RAMP to TD_RUNNING, but something is clearly happening.

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 460590368  33728 847720    0    0 100491 216832 81644 1434664 41 37  8 14  0
 2  0      0 460592704  33728 847720    0    0 38098 242176 152471 275131  1  0 99  0  0
 0  0      0 460592448  33728 847720    0    0 38250 224384 153417 276436  1  0 99  0  0
 1  0      0 460592448  33728 847720    0    0 38119 224384 153126 275687  1  0 99  0  0
 0  0      0 460592512  33728 847720    0    0 38217 224384 153707 276820  1  0 99  0  0
812  0      0 460678144  33728 847720    0    0 61199 232576 161072 303737  1  4 95  0  0
 2  0      0 460888576  33728 847592    0    0 38863 224256 152235 278462  1  1 99  0  0
 1  0      0 460888576  33728 847592    0    0 38266 224384 153190 276809  1  0 99  0  0
 0  0      0 460888576  33728 847592    0    0 38310 224384 153746 276808  1  0 99  0  0
 1  0      0 460888064  33728 847592    0    0 38428 224256 154097 278023  1  0 99  0  0
 0  0      0 460891360  33728 847592    0    0 38230 224384 153005 277122  1  0 99  0  0
 1  0      0 460902560  33728 847592    0    0 38226 224256 153480 277212  1  0 99  0  0
 1  0      0 460902560  33728 847592    0    0 38190 224384 153741 276527  1  0 99  0  0
 2  0      0 460902560  33728 847592    0    0 38223 224256 154076 277180  1  0 99  0  0
 0  0      0 460902560  33728 847592    0    0 38350 224384 154024 277097  1  0 99  0  0
 0  0      0 460902560  33728 847592    0    0 38191 224256 153429 276048  1  0 99  0  0
 1  0      0 460902560  33728 847592    0    0 38235 224384 154030 276920  1  0 99  0  0
 0  0      0 460902560  33728 847592    0    0 38219 224256 153740 276978  1  0 99  0  0
 2  0      0 460902560  33728 847592    0    0 37812 224384 152749 274209  1  0 99  0  0
 2  0      0 460902560  33728 847592    0    0 38299 224384 154229 277121  1  0 99  0  0
 1  0      0 460902560  33728 847592    0    0 38251 224256 154176 276797  1  0 99  0  0
 0  0      0 460902560  33728 847592    0    0 38316 224384 154026 277576  1  0 99  0  0
 1  0      0 460902560  33736 847584    0    0 38201 224328 153716 276666  1  0 99  0  0
 0  0      0 460902336  33736 847592    0    0 38191 224384 152988 276083  1  0 99  0  0
 0  0      0 460902336  33736 847592    0    0 38362 224256 154207 278006  1  0 99  0  0
 4  0      0 460902304  33736 847592    0    0 38253 224384 154268 277444  1  0 99  0  0
 1  0      0 460902304  33736 847592    0    0 38109 224256 154200 276726  1  0 99  0  0
 2  0      0 460902304  33736 847592    0    0 38283 224384 154299 277057  1  0 99  0  0
 3  0      0 460902304  33736 847592    0    0 38071 224256 153125 275599  1  0 99  0  0
 2  0      0 460902304  33736 847592    0    0 38537 224384 154522 278874  1  0 99  0  0
 0  0      0 460902304  33736 847592    0    0 38089 224256 153406 276324  1  0 99  0  0
 2  0      0 460902304  33736 847592    0    0 38015 224384 153100 275747  1  0 99  0  0
 2  0      0 460902304  33736 847592    0    0 38326 224396 154099 277242  1  0 99  0  0
 1  0      0 460908864  34804 847532    0    0 39562 123648 153565 276990  1  0 99  0  0
 0  0      0 460908736  34804 847592    0    0 38165     0 151048 271587  1  0 99  0  0
 1  0      0 490447424  33736 847652    0    0 11484     0 46754 82201  0  1 98  0  0
 0  0      0 518314752  33736 847072    0    0  1079    28 5266 3799  2  1 96  0  0
63  0      0 490278176  33736 847444    0    0     0 175360 18674 26948  3 85 12  0  0

Environment: Ubuntu 21.04

fio version: fio-3.25

Reproduction steps

Run the example fio above, adjust for your cpu count and disk characteristics.

avikivity commented 2 years ago

Here's a more extreme example, where read_iops=10:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 517080608  37404 1092184    0    0  1813 279424 10488 104071  2 14 83  0  0
 0  0      0 517080608  37404 1092184    0    0     5 448640 7428 14379  0  0 100  0  0
 0  0      0 517080608  37404 1092184    0    0     4 448640 7411 14354  0  0 100  0  0
 1  0      0 517080608  37404 1092184    0    0     6 448768 7472 14463  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     7 448640 7447 14421  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0   784 448896 7588 14546  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     6 448768 7414 14344  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     4 448640 7456 14430  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     6 448640 7430 14399  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     6 448768 7393 14349  0  0 100  0  0
 0  0      0 517081600  37404 1092184    0    0     5 448768 7467 14382  0  0 100  0  0
 0  0      0 517081600  37412 1092176    0    0     5 449024 7492 14441  0  0 100  0  0
 0  0      0 517081600  37412 1092176    0    0     5 448640 7425 14364  0  0 100  0  0
 0  0      0 517081600  37412 1092176    0    0     5 448768 7394 14325  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     3 448768 7401 14320  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     3 448768 7475 14431  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     6 448640 7427 14364  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     8 448768 7433 14373  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     5 448768 7416 14336  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     6 448640 7479 14445  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     4 448768 7407 14360  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     4 448768 7399 14328  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     4 448640 7395 14328  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     4 448896 7468 14424  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     9 448640 7430 14388  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     5 448640 7396 14327  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     6 448768 7401 14340  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     3 448768 7460 14419  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     4 448640 7419 14358  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     7 448640 7404 14343  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     7 448768 7427 14352  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     7 448768 7489 14455  0  0 100  0  0
 1  0      0 517081600  37412 1092184    0    0     6 448772 7438 14375  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     3 448640 7397 14317  0  0 100  0  0
 0  0      0 517081600  37412 1092184    0    0     2 448768 7459 14408  0  0 100  0  0
 0  0      0 517980544  37412 1108832    0    0  2147 164372 4894 7121  3  0 97  0  0
63  0      0 487100480  37412 1108120    0    0     0 375552 22368 34689  3 90  6  0  0
63  0      0 460675520  37412 1108120    0    0   568 448896 23612 14902 22 76  2  0  0

Reads start crazy and settle down, then go crazy again after 5 seconds.

avikivity commented 2 years ago
[global]

runtime=30s
time_based=1
ramp_time=5
filename=/dev/nvme1n1
direct=1
group_reporting
ioengine=io_uring
size=0
random_generator=tausworthe64
randrepeat=0
thread

[job(r_idx=0,w_idx=4,write_bw=459345240,r_iops=0)]

stonewall
new_group

readwrite=write
blocksize=131072
iodepth=64
rate=459345240

[job(r_idx=0,w_idx=4,write_bw=459345240,r_iops=0).1]

readwrite=randread
blocksize=512
iodepth=1000
rate_iops=10
rate_process=poisson
vincentkfu commented 2 years ago

I would start looking at time.c:ramp_time_over()

bool ramp_time_over(struct thread_data *td)
{
        if (!td->o.ramp_time || td->ramp_time_over)
                return true;

        if (utime_since_now(&td->epoch) >= td->o.ramp_time) {
                td->ramp_time_over = true;
                reset_all_stats(td);
                reset_io_stats(td);
                td_set_runstate(td, TD_RAMP);

                /*
                 * If we have a parent, the parent isn't doing IO. Hence
                 * the parent never enters do_io(), which will switch us
                 * from RAMP -> RUNNING. Do this manually here.
                 */
                if (parent_update_ramp(td))
                        td_set_runstate(td, TD_RUNNING);

                return true;
        }

        return false;
}

In there are calls to reset_all_stats() and reset_io_stats(). Perhaps counters related to rate limiting are being reset and preserving them would resolve this problem.