scylladb / seastar

High performance server-side application framework
http://seastar.io
Apache License 2.0

Wasted processing time due to nvme interrupts #507

Open avikivity opened 6 years ago

avikivity commented 6 years ago

In these post-Meltdown days, interrupts are expensive. The default NVMe configuration does not coalesce interrupts, and because NVMe completions typically arrive far more often than once per task-quota-ms, we'll see an interrupt per completion with no batching.

The following command coalesces up to 10 NVMe interrupts over a period of up to 200 usec:

$ sudo nvme set-feature /dev/nvme0n1 --feature-id 8 --value 522
set-feature:08 (Interrupt Coalescing), value:0x00020a

The lower byte (0x0a) specifies the number of interrupts to coalesce; the upper byte (0x02) specifies the amount of time to coalesce, in units of 100 usec. I verified that it works on my machine:
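As a rough sketch of the encoding described above, the value can be assembled with shell arithmetic. `coalesce_value` is a hypothetical helper name, not part of nvme-cli; it only packs the two bytes and does not touch the device.

```shell
# Hypothetical helper: pack an Interrupt Coalescing value for feature 0x08.
#   $1 = number of interrupts to coalesce (lower byte)
#   $2 = coalescing time in units of 100 usec (upper byte)
coalesce_value() {
    echo $(( (($2 & 0xff) << 8) | ($1 & 0xff) ))
}

# 10 interrupts, 2 x 100 usec: the value passed to --value above.
coalesce_value 10 2
# The same value in the hex form that nvme set-feature echoes back.
printf '0x%06x\n' "$(coalesce_value 10 2)"
```

The result could then be passed as `--value "$(coalesce_value 10 2)"` to the `nvme set-feature` command shown earlier.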

 0  1      0 17624724 707592 6554924    0    0 29212     0 7865 14509  2  4 75 19  0
 0  1      0 17580412 748492 6562684    0    0 40900     0 8206 15738  3  5 74 19  0
 0  1      0 17538508 782944 6570208    0    0 34452     0 8203 15118  3  5 74 19  0
 0  1      0 17513640 801532 6576968    0    0 18588     0 6379 11206  2  3 75 19  0
 0  1      0 17418400 836392 6635976    0    0 34776 26880 11401 19015  6  7 69 19  0
 1  1      0 17372184 872704 6647284    0    0 36312     0 11340 21723  5  7 71 17  0
 0  1      0 17349952 908576 6632780    0    0 35872     0 11164 19146  6  7 70 18  0
 0  1      0 17295972 951260 6643108    0    0 42684     0 12694 20619  3  5 74 18  0

At the beginning of the run, coalescing was enabled; towards the end I disabled it, and the interrupt rate (the 11th column) went up from roughly 8k to 11k per second.

avikivity commented 6 years ago

Testing gives me results that are inconsistent with the spec. To get 1000 reads/sec, I need to set the time byte to 36, which corresponds to units of ~30 usec, not 100 usec. Doubling it to 72 gives me 500 reads/sec.

$ sudo nvme set-feature /dev/nvme0n1 --feature-id 8 --value 0x24aa
set-feature:08 (Interrupt Coalescing), value:0x0024aa
$ sudo nvme set-feature /dev/nvme0n1 --feature-id 8 --value 0x48aa
set-feature:08 (Interrupt Coalescing), value:0x0048aa
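The unit estimate above can be reproduced with a little arithmetic. This is only a sketch: `effective_unit_ns` is a hypothetical helper, and it assumes reads are issued one at a time, so that each read waits out one full coalescing period.

```shell
# Hypothetical helper: estimate the device's timer unit in nanoseconds.
#   $1 = time byte written to the feature
#   $2 = observed reads/sec
# Assumes one completion (one interrupt) per coalescing period.
effective_unit_ns() {
    echo $(( 1000000000 / ($2 * $1) ))
}

effective_unit_ns 36 1000   # time byte 36 at 1000 reads/sec
effective_unit_ns 72 500    # time byte 72 at 500 reads/sec
```

Both calls give the same unit, a little under 28 usec, which is where the "~30 usec" figure comes from; a device following the 100 usec unit would give 100000 ns.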

avikivity commented 6 years ago

On another machine, with an enterprise disk, it works as expected. The value of 10 for coalescing time gives me 1000 reads/sec. So my desktop disk probably implements the spec incorrectly.

avikivity commented 5 years ago

An alternative (better) approach is #536, but it requires a new kernel.