Closed Msiavashi closed 2 years ago
Hi,
From chapter 17.19.7.3 "Memory Bandwidth Allocation Usage Considerations" of the Intel Software Developer's Manual (volume 3):
As control is provided per processor core (the max of the delay values of the per-thread CLOS applied to the core) care should be taken in scheduling threads so as to not inadvertently place a high-priority thread (with zero intended MBA throttling) next to a low-priority thread (with MBA throttling intended), which would lead to inadvertent throttling of the high-priority thread.
As stated, both sibling threads will be throttled so this case should be avoided if possible.
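To make the "max of the delay values" rule concrete, here is a minimal sketch of how the effective throttling on one physical core could be estimated. It assumes that the MBA percentage shown by pqos is the available bandwidth and that the delay value is 100 minus that percentage; the function name `effective_mba_percent` is hypothetical and not part of any Intel tool.

```python
def effective_mba_percent(sibling_mba_percents):
    """Estimate the bandwidth percentage that applies to a whole core.

    Assumption: each sibling thread's CLOS has an MBA setting expressed as
    available bandwidth in percent, and the delay value is 100 - percent.
    The core applies the largest delay among its threads.
    """
    delays = [100 - percent for percent in sibling_mba_percents]
    return 100 - max(delays)


# One sibling throttled to 10%, the other left at 100% (no throttling intended).
print(effective_mba_percent([10, 100]))
```

Under these assumptions both siblings end up at the lower setting, which matches the behaviour described in the manual excerpt.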
Regards, Marcel
Thanks, Marcel.
I was missing that indeed.
So the solution is either to avoid scheduling a high-priority and a low-priority task on sibling hyper-threads or to disable hyper-threading entirely, am I correct?
Is there such a limitation with CAT too?
So the solution is either to avoid scheduling a high-priority and a low-priority task on sibling hyper-threads or to disable hyper-threading entirely, am I correct?
Yes, that is correct.
Is there such a limitation with CAT too?
No, CAT works on a per-thread basis, so it can be used with sibling threads on the same core.
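For anyone who wants to check whether two logical CPUs are siblings before assigning MBA classes, the following is a rough sketch for Linux. It assumes the kernel exposes `thread_siblings_list` under `/sys/devices/system/cpu/cpuN/topology/`; the helper names `parse_cpu_list`, `thread_siblings` and `share_core` are made up for this example.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as "1,33" or "0-3,8" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def thread_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Read the logical CPUs that share a physical core with `cpu` (Linux only)."""
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return parse_cpu_list(path.read_text())


def share_core(cpu_a, cpu_b, siblings_of=thread_siblings):
    """Return True if cpu_b is listed among the thread siblings of cpu_a."""
    return cpu_b in siblings_of(cpu_a)
```

On a machine like the one in this issue, `share_core(1, 33)` would be expected to return True, meaning a throttled task on one of them also slows down the other.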
I have a two-socket Xeon Gold 6142 on which core IDs 1 and 33 are sibling hyper-threads sharing the same physical core.
When I apply the maximum throttling (10% bandwidth) on core 1 and run a benchmark on both hyper-threads, the execution time of the benchmark running on core 33 (the sibling hyper-thread) increases drastically (~9x). The same benchmark on core 33 runs ~9x faster when core 1 is throttled but idle.
Why is there such a significant performance degradation?
Please note that I'm also setting CAT but it doesn't affect the performance as much as throttling does.
Here is the output of pqos -s: