scylladb / seastar

High performance server-side application framework
http://seastar.io
Apache License 2.0

Cross-shard preemption is needed #1430

Open xemul opened 1 year ago

xemul commented 1 year ago

As seen in, e.g., scylladb/scylla#12562, a high-prio class with a low-concurrency workload is in trouble. Because such a class submits requests only rarely, by the time its next request arrives the other shards have already queued plenty of IO of their own.

As a result, the high-prio request lands in the middle of the global token queue and only has a chance to get dispatched after all the preceding IO from other shards completes.
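
A toy model of the effect (illustrative only, not the actual Seastar scheduler; the class names follow the repro config below):

```cpp
// Toy model: the cross-shard token queue is served in arrival order, so a
// late request from a high-share class still waits behind everything the
// other shards queued earlier.
#include <cstdio>
#include <deque>
#include <string>

struct entry {
    unsigned shard;
    std::string cls;
    unsigned shares; // per-class shares; higher should mean "served sooner"
};

int main() {
    std::deque<entry> global_queue;

    // Shards 0..2 keep the low-share bulk writer busy and have already
    // grabbed their positions in the queue.
    for (unsigned shard = 0; shard < 3; shard++) {
        for (int i = 0; i < 2; i++) {
            global_queue.push_back({shard, "big_writes", 80});
        }
    }

    // Shard 3's latency-sensitive class wakes up late and can only append
    // to the tail, despite its much higher share count.
    global_queue.push_back({3, "latency_reads", 1000});

    // Arrival-order dispatch: the high-share request goes out last.
    while (!global_queue.empty()) {
        entry e = global_queue.front();
        global_queue.pop_front();
        std::printf("dispatch: shard=%u class=%s shares=%u\n",
                    e.shard, e.cls.c_str(), e.shares);
    }
    return 0;
}
```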

This can be reproduced with io_tester: `io_tester --conf jobs.yaml --storage /dev/null --duration 5`, with jobs.yaml being:

```yaml
- name: big_writes
  shards: all
  type: seqwrite
  data_size: 1GB
  shard_info:
    rps: 8000
    parallelism: 2
    reqsize: 128kB
    shares: 80

- name: latency_reads
  shards: all
  type: randread
  data_size: 1GB
  shard_info:
    rps: 250
    parallelism: 1
    reqsize: 512
    shares: 1000
  options:
    pause_distribution: poisson
```

avikivity commented 1 year ago

I think reducing the group sizes will help, and also tightening the scheduler's idea of what is an idle class.

xemul commented 1 year ago

> I think reducing the group sizes will help,

Agree

> and also tightening the scheduler's idea of what is an idle class.

Maybe tightening the preemption criteria? Because the "idle class" notion is as simple as "zero requests in it".

...

While writing the above question I got an idea. We could inject "empty" requests into the global token queue for classes that have been idle only briefly (i.e. had a request dispatched recently but don't yet have a new one queued). Then, when a real request shows up, it could "preempt" that empty one and thus get closer to being dispatched.
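
Roughly, as a toy sketch (illustrative only, not a patch against the real fair queue; all the names here are made up):

```cpp
#include <cstdio>
#include <deque>
#include <optional>
#include <string>

struct slot {
    std::string cls;
    bool placeholder; // true: a reserved spot with nothing to dispatch yet
};

struct toy_queue {
    std::deque<slot> q;

    // A class that went idle only recently keeps reserving positions so a
    // future request does not have to start from the tail.
    void reserve(const std::string& cls) {
        q.push_back({cls, true});
    }

    // A real request first tries to take over the earliest placeholder of
    // its class ("preempting" the empty request); otherwise it queues at
    // the tail as usual.
    void submit(const std::string& cls) {
        for (auto& s : q) {
            if (s.placeholder && s.cls == cls) {
                s.placeholder = false;
                return;
            }
        }
        q.push_back({cls, false});
    }

    // Dispatch in arrival order, skipping placeholders that never
    // materialized into real requests.
    std::optional<std::string> dispatch() {
        while (!q.empty()) {
            slot s = q.front();
            q.pop_front();
            if (!s.placeholder) {
                return s.cls;
            }
        }
        return std::nullopt;
    }
};

int main() {
    toy_queue tq;
    tq.reserve("latency_reads"); // the class idled briefly, keep a spot for it
    tq.submit("big_writes");
    tq.submit("big_writes");
    tq.submit("latency_reads");  // lands in the reserved spot, ahead of the writes
    while (auto cls = tq.dispatch()) {
        std::printf("dispatch: %s\n", cls->c_str());
    }
    return 0;
}
```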

avikivity commented 1 year ago

> > I think reducing the group sizes will help,

> Agree

> > and also tightening the scheduler's idea of what is an idle class.

> Maybe tightening the preemption criteria? Because the "idle class" notion is as simple as "zero requests in it".

Zero requests for some time?
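
Something like this, perhaps (a toy illustration; the field names and the grace period are assumptions, not Seastar code):

```cpp
// One way to express "zero requests for some time" instead of just
// "zero requests in it".
#include <chrono>
#include <cstdio>

struct toy_class {
    using clock = std::chrono::steady_clock;
    static constexpr auto idle_grace = std::chrono::microseconds(500); // assumed threshold

    unsigned queued = 0;               // requests currently waiting in the class
    clock::time_point last_dispatch{}; // when the class last had a request dispatched

    // Idle only if nothing is queued *and* nothing was dispatched recently.
    bool is_idle(clock::time_point now) const {
        return queued == 0 && now - last_dispatch > idle_grace;
    }
};

int main() {
    toy_class c;
    c.last_dispatch = toy_class::clock::now(); // just dispatched something
    std::printf("idle right after dispatch: %d\n", c.is_idle(toy_class::clock::now()));
    return 0;
}
```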

> ...

> While writing the above question I got an idea. We could inject "empty" requests into the global token queue for classes that have been idle only briefly (i.e. had a request dispatched recently but don't yet have a new one queued). Then, when a real request shows up, it could "preempt" that empty one and thus get closer to being dispatched.

Aha. A sort of reservation.