Currently, the thread pool allows too many threads to spin-wait simultaneously, which in some scenarios causes higher CPU usage when similar performance can be achieved with fewer spin-waiters and less CPU. There are several scenarios where the `ThreadPool_UnfairSemaphoreSpinLimit` config var is used to disable spin-waiting as a result. Part of the problem in some cases is too many thread requests (https://github.com/dotnet/runtime/issues/93028), but there are also other cases where there is too much spin-waiting.
On larger machines, it may be possible to find a reasonably high limit that does not regress performance when the thread pool is heavily loaded, and that allows similar performance at lower loads with lower CPU usage.
On smaller machines and containers, hill climbing may substantially increase the active number of worker threads beyond the processor count. Spin-waiting on more threads than the processor count is unlikely to be beneficial.
The limit of simultaneous spin-waiters should be made configurable. There are many scenarios where spin-waiting helps but either the machine or the thread pool is not fully loaded, and tuning for that with the existing `ThreadPool_UnfairSemaphoreSpinLimit` config var does not yield the best results.
There may be opportunities to tune spin-waiting automatically using some types of feedback; for instance, spin-waits that fail very frequently may indicate that spin-waiting is less beneficial. This kind of heuristic can be challenging to balance, though: there can be tradeoffs, and auto-tuning by default may regress some scenarios. One option is to keep the heuristic simpler and more conservative to limit the chance of regressions, along with an opt-out. Another is to explore a less conservative solution and make it opt-in for folks to try when experiencing high CPU usage due to spin-waiting.
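To make the idea concrete, here is a minimal Python sketch of a cap on simultaneous spin-waiters combined with a conservative failure-rate heuristic. It is an illustration only and not how the runtime implements or would implement this: `SpinWaiterGate`, `SpinFeedback`, the window size, and the failure threshold are all hypothetical names and values chosen for the example.

```python
import os
import threading


class SpinWaiterGate:
    """Caps how many threads may spin-wait at once (hypothetical helper)."""

    def __init__(self, configured_limit, processor_count=None):
        if processor_count is None:
            processor_count = os.cpu_count() or 1
        # Spin-waiting on more threads than processors is unlikely to help,
        # so the effective limit never exceeds the processor count.
        self.limit = max(0, min(configured_limit, processor_count))
        self._spinners = 0
        self._lock = threading.Lock()

    def try_enter(self):
        """Return True if the caller may spin; False means block immediately."""
        with self._lock:
            if self._spinners >= self.limit:
                return False
            self._spinners += 1
            return True

    def exit(self):
        """Called when a thread stops spinning (found work or gave up)."""
        with self._lock:
            self._spinners -= 1

    def set_limit(self, new_limit):
        with self._lock:
            self.limit = new_limit


class SpinFeedback:
    """Lowers the gate's limit when spin-waits mostly fail (assumed heuristic)."""

    def __init__(self, gate, window=100, fail_threshold=0.9):
        self.gate = gate
        self.max_limit = gate.limit
        # Never re-enable spinning if it was configured off (limit 0).
        self.min_limit = min(1, self.max_limit)
        self.window = window
        self.fail_threshold = fail_threshold
        self._attempts = 0
        self._failures = 0
        self._lock = threading.Lock()

    def record(self, found_work):
        """Record one finished spin-wait; re-tune once per full window."""
        with self._lock:
            self._attempts += 1
            if not found_work:
                self._failures += 1
            if self._attempts < self.window:
                return
            fail_ratio = self._failures / self._attempts
            if fail_ratio >= self.fail_threshold:
                # Mostly wasted spinning: back off quickly.
                new_limit = max(self.min_limit, self.gate.limit // 2)
            else:
                # Spinning is paying off: recover slowly toward the cap.
                new_limit = min(self.max_limit, self.gate.limit + 1)
            self.gate.set_limit(new_limit)
            self._attempts = 0
            self._failures = 0
```

A worker would call `try_enter()` before spinning, fall back to a blocking wait when it returns `False`, and report the outcome through `record()` after calling `exit()`. Halving on failure and recovering one step at a time is just one conservative choice; the thresholds would need tuning against real workloads.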
Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.
Issue Details
- Currently, the thread pool allows too many threads to spin-wait simultaneously, which in some scenarios causes higher CPU usage when similar performance can be achieved with fewer spin-waiters.
- On larger machines, it may be possible to find a reasonably high limit that does not regress performance when the thread pool is heavily loaded.
- On smaller machines and containers, hill climbing may substantially increase the active number of worker threads beyond the processor count. Spin-waiting on more threads than the processor count is unlikely to be beneficial.
- The limit of simultaneous spin-waiters should be made configurable. There are many scenarios where spin-waiting helps but either the machine or the thread pool is not fully loaded, and tuning for that with the existing `ThreadPool_UnfairSemaphoreSpinLimit` config var does not yield the best results.
- There may be opportunities for automatically tuning the spin-waiting using some types of feedback. There can be tradeoffs and auto-tuning by default may regress some scenarios. Making an opt-in version available in the future may be useful for folks to try when experiencing high CPU usage due to spin-waiting.