Open lukepalmer opened 2 months ago
A variation of this reproduces with 8.0.0-rc2: Unexpected CPU usage is observed with any io-threads setting other than '1', and does not go away if you set io-threads to a large value.
I think I've convinced myself that this is just the io-threads polling in a busy loop under light but non-zero load and is to be expected. I'll plan to make a documentation contribution for that unless someone thinks this is a real problem.
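To illustrate why a small trickle of traffic could keep polling threads busy, here is a toy model. It is not Valkey's actual io-threads code, only a sketch under stated assumptions. A hypothetical worker handles one request per arrival, then spins for up to `spin_budget` ticks looking for more work before it parks. The function name, the parameters, and the tick-based timing are all made up for illustration.

```python
def busy_fraction(total_ticks, arrival_interval, spin_budget):
    """Return the fraction of ticks a toy polling worker spends awake.

    Assumptions (hypothetical, not taken from Valkey's source):
    - one request arrives every `arrival_interval` ticks
    - handling a request costs one tick
    - after work, the worker spins for up to `spin_budget` idle ticks
      before parking, and an arrival wakes it again
    """
    busy = 0
    idle_spins = spin_budget  # start parked: spin budget already used up
    for tick in range(total_ticks):
        if tick % arrival_interval == 0:
            # A request arrived: do the work and reset the spin counter.
            idle_spins = 0
            busy += 1
        elif idle_spins < spin_budget:
            # No work, but still inside the spin window: burn a tick polling.
            idle_spins += 1
            busy += 1
        # Otherwise the worker is parked and uses no CPU on this tick.
    return busy / total_ticks


if __name__ == "__main__":
    # Light but steady traffic with a long spin window: never parks.
    print(busy_fraction(1000, 10, 100))
    # Same traffic with no spinning: awake only while handling requests.
    print(busy_fraction(1000, 10, 0))
```

In this model, whenever the gap between arrivals is shorter than the spin window, the worker never parks and looks fully busy even though it does almost no real work. That would be consistent with the symptom described here, if the real threads behave similarly.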
Describe the bug
Running a higher number of sentinels than io-threads causes significant CPU usage on a leader with no application load: in some cases most of a core.
To reproduce
I can trigger this with:
It's not a subtle difference: in the above scenarios, if I stop one of the sentinels, the leader's CPU usage drops to near 0 as expected.
How much CPU is being used by the leader? It depends on the number of IO threads. Rough numbers (it's pretty jittery) on an average virtualized machine as percentage of 1 core, for the 6 sentinel case:
In some of my tests the dropoff back to idle CPU usage happened at io-threads >= 5 instead of >= 7, and I haven't quite nailed down why yet. However, there is some number of io-threads above which idle usage drops to 0 as expected.
What is the leader doing? Perf shows that the busyness is attributed entirely to (io-threads - 1) threads doing this:
Another odd data point: counterintuitively, increasing the value of 'hz' to 50 or above makes the CPU usage go down significantly, but not to 0 where it should be.
Expected behavior
A leader being followed by any sane number of sentinels and 0 application load should have near-0 CPU usage.
Additional information
MONITOR shows me the normal PING and PUBLISH traffic that I would expect from sentinels. INFO shows io_threads_active:0 while the unexpected CPU usage is happening.
Valkey 7.2.6, kernel 6.1.99-1.
Happy to collect anything else or to do further debugging with some guidance.
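For anyone who wants to compare numbers, here is a minimal sketch of deriving an idle CPU percentage from two INFO snapshots taken some seconds apart. It assumes the `used_cpu_sys` and `used_cpu_user` fields from the CPU section of INFO. The helper names `parse_info` and `cpu_percent` are hypothetical, and the sample text is invented for illustration, not captured from the server described in this issue.

```python
def parse_info(text):
    """Parse INFO-style 'key:value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '# Section' headers, and anything without a colon.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def cpu_percent(before, after, interval_seconds):
    """CPU used between two snapshots, as a percentage of one core."""
    def total(info):
        return float(info["used_cpu_sys"]) + float(info["used_cpu_user"])
    return 100.0 * (total(after) - total(before)) / interval_seconds


if __name__ == "__main__":
    # Invented sample snapshots, 2 seconds apart (illustrative values only).
    first = "# Server\nio_threads_active:0\n# CPU\nused_cpu_sys:1.0\nused_cpu_user:2.0\n"
    second = "# Server\nio_threads_active:0\n# CPU\nused_cpu_sys:1.5\nused_cpu_user:2.5\n"
    a, b = parse_info(first), parse_info(second)
    print(a["io_threads_active"], cpu_percent(a, b, 2.0))
```

Sampling over a longer interval should smooth out some of the jitter mentioned above.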