dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.16k stars 4.71k forks

Reduce tendency of the threadpool to consume extra CPU resources when it is not helpful. #80983

Open VSadov opened 1 year ago

VSadov commented 1 year ago

We tend to saturate the machine and take 100% CPU even in scenarios where 20% CPU would be sufficient.

The CPU is consumed by spinning, either as part of spinwaiting or as part of conflict resolution backoffs. The most noticeable case is the spinning done in the threadpool to guarantee that any incoming task will be picked up by a worker.

Because worker threads can block, we cannot assume that workers already executing tasks will reliably pick up new ones, so the threadpool ensures that there is an outstanding thread request after every task is enqueued. In scenarios that do not require 100% CPU, such a request is quickly satisfied. In a steady state, however, the threadpool's queue is nearly empty (since workers are keeping up with tasks), so many threads find no work and leave, only to be invited back again. We end up with a few lucky workers doing the work and the rest bouncing between the task queue and the threadpool semaphore. This constant churn of threads between the task queue and the semaphore is at best wasteful, and we often find ourselves at 100% utilization even when incoming tasks require much less.
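To make the churn concrete, here is a toy, single-threaded simulation. It is only a sketch and does not reflect the actual threadpool code: the tick-based model, the `simulate` function, the `Stats` fields, and the `wake_on_every_enqueue` switch are all hypothetical names made up for illustration. Each enqueue optionally issues a thread request that wakes one parked worker, and every awake worker then makes one trip to the queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Stats:
    tasks_run: int = 0    # tasks dequeued and executed
    wakeups: int = 0      # parked workers woken by a thread request
    empty_trips: int = 0  # trips to the queue that found nothing


def simulate(ticks, workers, arrivals_per_tick, wake_on_every_enqueue=True):
    """Toy model (an assumption, not the real threadpool algorithm).

    Each tick: tasks are enqueued, then every awake worker makes one trip
    to the queue. A worker that finds a task stays awake for the next tick;
    a worker that finds nothing goes back to park on the semaphore.
    """
    queue = deque()
    active = 0  # workers currently awake (not parked)
    stats = Stats()
    for tick in range(ticks):
        for _ in range(arrivals_per_tick):
            queue.append(tick)
            # Policy described in the issue: one thread request per enqueue.
            # The alternative (wake only when nobody is awake) is a naive,
            # hypothetical policy that ignores blocked workers entirely.
            want_wake = wake_on_every_enqueue or active == 0
            if want_wake and active < workers:
                active += 1
                stats.wakeups += 1
        found_work = 0
        for _ in range(active):
            if queue:
                queue.popleft()
                stats.tasks_run += 1
                found_work += 1
            else:
                stats.empty_trips += 1  # nothing to do, park again
        active = found_work
    return stats


if __name__ == "__main__":
    print(simulate(ticks=100, workers=8, arrivals_per_tick=1))
    print(simulate(ticks=100, workers=8, arrivals_per_tick=1,
                   wake_on_every_enqueue=False))
```

With one task arriving per tick, the first run executes 100 tasks but performs 100 wakeups, 99 of which find an empty queue, even though a single worker keeps up with the load. The second run needs one wakeup, but only because no worker in this model ever blocks; that is exactly the assumption the real threadpool cannot make.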

We need something more intelligent here, but this is a nontrivial problem. There are many concerns that would need to be considered: worker blocking, starvation, work item latency, …

Better detection of blocked threads could be the key to having more options for thread wakeups, as it could relax the requirement to ensure a thread wakeup for every incoming task.

ghost commented 1 year ago

Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.

Issue Details
Author: VSadov
Assignees: -
Labels: `area-System.Threading`, `untriaged`
Milestone: -