Open AdelKS opened 4 years ago
Just wanna say that the first patch (amd_cpufreq_patch.diff) is very recommended and should have no drawbacks, but on its own isn't game-changing. The second patch (no_cache_cluster.diff) is extremely hacky, has a lot of performance regressions, and shouldn't be taken seriously.
A perfect patch would add CCX priority support to the scheduler so that the CCX with the fastest core is chosen preferentially if all else is equal. Otherwise, all the patch achieves is picking the best core on a CCX after that CCX has been chosen essentially at random.
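To make the CCX point concrete, here is a minimal userspace sketch in C. It is not kernel code and not part of either patch. The layout (8 CPUs, 4 cores per CCX), the priority values, and all function names are assumptions made up for illustration. It contrasts picking the best core inside a given CCX with first picking the CCX that holds the fastest idle core.

```c
#include <assert.h>

#define NR_CPUS 8
#define CORES_PER_CCX 4 /* assumption: 4 cores per CCX */
#define NR_CCX (NR_CPUS / CORES_PER_CCX)

/* Best idle core inside one CCX, or -1 if that CCX has no idle core.
 * Higher prio[] value means a faster core (hypothetical ranking). */
int best_idle_in_ccx(const int prio[NR_CPUS], const int idle[NR_CPUS], int ccx)
{
    assert(ccx >= 0 && ccx < NR_CCX);
    int best = -1;
    for (int cpu = ccx * CORES_PER_CCX; cpu < (ccx + 1) * CORES_PER_CCX; cpu++) {
        if (idle[cpu] && (best < 0 || prio[cpu] > prio[best]))
            best = cpu;
    }
    return best;
}

/* Pick the CCX whose best idle core has the highest priority, or -1. */
int pick_ccx(const int prio[NR_CPUS], const int idle[NR_CPUS])
{
    int best_ccx = -1;
    int best_cpu = -1;
    for (int ccx = 0; ccx < NR_CCX; ccx++) {
        int cand = best_idle_in_ccx(prio, idle, ccx);
        if (cand >= 0 && (best_cpu < 0 || prio[cand] > prio[best_cpu])) {
            best_cpu = cand;
            best_ccx = ccx;
        }
    }
    return best_ccx;
}

/* CCX-aware placement: choose the CCX first, then the best core in it. */
int pick_cpu(const int prio[NR_CPUS], const int idle[NR_CPUS])
{
    int ccx = pick_ccx(prio, idle);
    return ccx < 0 ? -1 : best_idle_in_ccx(prio, idle, ccx);
}
```

With priorities {10, 20, 15, 5, 30, 8, 7, 6} and everything idle, a scheduler that happened to land on CCX 0 would settle for CPU 1, while the CCX-aware version goes to CPU 4.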
A few things discovered after posting it:
tristate "AMD CPUFreq driver"
needs to be switched to bool, as amd-cpufreq doesn't compile when built as a module.
Thank you for the information @syldrathecat, and thanks again for sharing the patches with us!
Just wanna say that the first patch (amd_cpufreq_patch.diff) is very recommended and should have no drawbacks, but on its own isn't game-changing.
If I understand correctly (and I have almost no knowledge of the kernel's inner workings), the first patch adds the knobs for schedulers to use. I now want to know whether Project C/BMQ can use it and thus do without the second patch. You have more knowledge than me there. If you think it's possible, I will try to get in touch with Alfred Chen to get his thoughts on the matter.
Support for core priorities in the standard Linux scheduler (CFS) was already added by Intel: https://lore.kernel.org/patchwork/cover/737840/
All the amd_cpufreq addition does is pass the core performance information through that interface (arch/x86/kernel/itmt.c), the exact same way that the intel_pstate driver already does for Intel CPUs.
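As a rough picture of that hand-off, here is a userspace mock in C. It is only a sketch: the `mock_*` names are hypothetical stand-ins and not the kernel's functions, and the per-core performance numbers are invented. The idea is just that a driver stores one priority per core at init time and the scheduler side reads it back later.

```c
#include <assert.h>

#define NR_CPUS 4

/* Mock of the per-CPU priority storage that sits between a cpufreq
 * driver and the scheduler (hypothetical, userspace only). */
static int core_prio[NR_CPUS];

/* Driver side: record the priority of one core. */
void mock_set_core_prio(int prio, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    core_prio[cpu] = prio;
}

/* Scheduler side: read the numeric priority of one core. */
int mock_asym_cpu_priority(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return core_prio[cpu];
}

/* What a driver's init path could look like: pass each core's
 * performance ranking (assumed to come from firmware) through. */
void mock_driver_init(const int perf[NR_CPUS])
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        mock_set_core_prio(perf[cpu], cpu);
}
```

The driver itself makes no scheduling decision here. It only publishes numbers, and what happens with them is up to the scheduler.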
(Unfortunately, since Intel doesn't have the same cache hierarchy as AMD, they had no reason to add anything that attempts to prioritize CCXs based on which has the fastest core available, which really gimps the potential of the patch until an expert can figure out how to adjust CFS, or any other scheduler, to behave appropriately)
I haven't heard of alternative schedulers doing anything with this information, but I haven't really checked. Since this feature has been available in the kernel on Intel CPUs since 2017, it's likely that they are at least aware that "preferred cores" or "core priority" are a thing. Key words to look for are:
SD_ASYM_PACKING, which is the flag that x86 platform code uses to inform the scheduler about a domain which has asymmetric core performance. This triggers CFS to more aggressively attempt to relocate tasks from lower priority to higher priority CPUs.
arch_asym_cpu_priority, which is the platform interface function which returns a numeric CPU priority. CFS compares cores with each other as a tie breaker when it would otherwise be assigning work to one randomly.
The subtle ways in which schedulers operate are kind of beyond me, so someone who is already an expert in how their scheduler assigns work would need to decide what can be done with the information that "Core A could potentially run this task 5% faster than Core B". At the very least, it should pass the basic test that if I run a single-threaded process on an idle system, it finds its way onto the fastest core very quickly and does a good job of staying there; and that the "ld" process left as the final piece of work at the end of a large parallel build job should likewise do the same.
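The two behaviours described above (the tie breaker, and pulling a task towards a higher priority CPU) can be sketched in plain userspace C. This is a toy model under my own assumptions, not CFS code: the function names and the priority numbers are made up, and real load balancing considers far more than this.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Tie breaker: of two otherwise equal candidates, prefer the one
 * with the higher priority (a stays chosen when they are equal). */
int prefer_cpu(const int prio[NR_CPUS], int a, int b)
{
    return prio[b] > prio[a] ? b : a;
}

/* Asymmetric-packing style pull: if an idle CPU has a higher priority
 * than the CPU the task currently runs on, return the best such CPU.
 * Otherwise the task stays where it is. */
int asym_pack_target(const int prio[NR_CPUS], const bool idle[NR_CPUS], int cur)
{
    assert(cur >= 0 && cur < NR_CPUS);
    int target = cur;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu != cur && idle[cpu] && prio[cpu] > prio[target])
            target = cpu;
    }
    return target;
}
```

This mirrors the "basic test": a lone task that starts on the slowest core gets moved to the fastest idle one, and once it is there no further move is suggested.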
Thank you for the additional information. I reached out to the maintainer of Project C / BMQ by mail, explained the situation and redirected him here. Maybe he has already implemented the use of the "fastest" core and needs only a few extra tweaks to use the first patch. Otherwise I feel like you two can help each other make something that works :D
I unfortunately don't have any knowledge on the matter but I'd be happy to learn new things and help out!
I am Alfred Chen. Thanks to @AdelKS for the email and for bringing this topic to me.
I did some ITMT-related testing in BMQ about 2 years ago, but at that time none of my machines seemed to support it. So ITMT is currently not supported in Project C.
Going into the implementation details: unlike CFS, which examines CPUs one by one, the common scheduler core of Project C finds a CPU by searching cpumasks level by level. That means it can't use arch_asym_cpu_priority directly, even though that function is already available. To support ITMT in Project C, a new "prefer cpumask" level or dimension should be added as common infrastructure. It would also be used for Intel's upcoming "little-Big" CPU architecture, if it can be adapted to it. The platform interface function arch_asym_cpu_priority can then be used to build the "prefer cpumask" level or dimension.
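One way to picture a "prefer cpumask" level is the userspace sketch below. It is only an illustration under assumptions, not Project C code: the 8-CPU bitmask type, the function names, and the idea of making each level a cumulative mask are all my own. The masks are built once from per-CPU priorities, and the search then walks the levels and intersects each with the idle mask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

typedef uint32_t mask_t; /* one bit per CPU, bit 0 = CPU 0 */

/* Lowest-numbered CPU set in m, or -1 if m is empty. */
static int first_cpu(mask_t m)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (m & (1u << cpu))
            return cpu;
    return -1;
}

/* Build cumulative prefer masks: level 0 holds the highest priority
 * CPUs, each later level adds the next priority tier.
 * Returns the number of levels written (at most NR_CPUS). */
int build_prefer_masks(const int prio[NR_CPUS], mask_t levels[NR_CPUS])
{
    const mask_t all = (1u << NR_CPUS) - 1u;
    mask_t covered = 0;
    int n = 0;
    while (covered != all) {
        int best = -1;
        /* Find the highest priority among CPUs not covered yet. */
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            if (!(covered & (1u << cpu)) && (best < 0 || prio[cpu] > prio[best]))
                best = cpu;
        /* Add every CPU that shares that priority. */
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
            if (prio[cpu] == prio[best])
                covered |= 1u << cpu;
        levels[n++] = covered;
    }
    return n;
}

/* Search level by level: the first level that intersects the idle
 * mask yields the CPU. Returns -1 if no CPU is idle. */
int pick_cpu(const mask_t levels[NR_CPUS], int nr_levels, mask_t idle)
{
    for (int i = 0; i < nr_levels; i++) {
        mask_t hit = levels[i] & idle;
        if (hit)
            return first_cpu(hit);
    }
    return -1;
}
```

The per-CPU priority function is only needed while building the masks. The hot path stays a handful of mask intersections, which is the property the level-by-level design is after.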
You can open a ticket in Project C as a feature request and I will put it on my TODO list, but it won't be done in weeks, more likely in months, as PDS porting and enhancement will take priority. Besides that, I am not sure I have hardware that supports ITMT; the best candidate is probably the nuc8i7bek. I can implement feature code that just works without real hardware, but for performance tuning, real hardware is a must.
Hello Alfred Chen @cchalpha, thank you very much for the additional information about Project C regarding ITMT.
You can open a ticket in Project C as a feature request and I will put it on my TODO list, but it won't be done in weeks, more likely in months.
I will do that, and nobody is in a rush :P I would also love to learn more about kernel scheduling, maybe I can actually help out with coding.
PDS porting and enhancement will take priority.
Absolutely, we are all impatiently waiting for your new improved PDS! It has become the reference scheduler for gamers running custom kernels.
I can implement feature code that just works without real hardware, but for performance tuning, real hardware is a must.
Well, that would already be very good! I have a Zen 2 and I am more than willing to help out, and I think a lot of people will be willing to help out by running synthetic/real-world benchmarks and reporting back.
After three years, AMD finally got to it xD https://www.phoronix.com/news/AMD-Preferred-Core-Linux-v2
I just came across this Reddit post about a patch that prioritizes the fastest cores on AMD Zen 2 CPUs. Does Project C already have it? If not, I will give it a try and report back. I think it would be a very nice addition :D