Open xiaotongnii opened 11 months ago
Nobody reply?OK Now,Cpuinfo should use logical cpu cluster device node to parse cpu cluster rather than use physical cpu cluster!!! We nedd to add a new cpuinfo device node to parse logical cpu cluster,Such as package_leader_id and package_leader_id_list or logical_core_siblings_list 。 https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/src/arm/linux/api.h#L189
We can not use a physical cpu cluster device node to parse directly ,because cpu arch is changing all the time. BR. Xiaotong
I think some of that is being added as part of RISC-V bringup, but I wonder if it could be landed separately...
I think some of that is being added as part of RISC-V bringup, but I wonder if it could be landed separately...
Hi,malfet, what is means about " I wonder if it could be landed separately". For Arm CPUs,I am willing to slove the issue.And RISV-V CPUS also has a similar programe. How to make the issue become a PR?
I suspect we're referring to PR #190 here, where we added the concept of core_leader_id
and cluster_leader_id
to the RISC-V structure definition.
In the RISC-V implementation, we tie the processor's uarch to the core_leader. This seems to be essentially the problem you're flagging on ARM - logical processors on the same physical core are guaranteed to have one uarch, but multiple physical cores tied to the same cluster don't necessarily have to be the same uarch.
I think what malfet@ meant by 'it could be landed separately' is whether #190 might somehow block you, because 2 weeks ago at the comment, it was not yet landed. However, these paths are completely independent, so you could put up a fix for this in ARM without worrying about RISC-V.
In regards to how to make this issue a PR, just create a PR separately and follow this guide to link your PR to this issue.
Application: uint32_t uarch_index = cpuinfo_get_current_uarch_index();-> const struct cpuinfo_uarch_info* cpuinfo_uarch_info = cpuinfo_get_uarch(uarch_index); (cpuinfo_uarchs,init->fill uarchs in it)-> cpuinfo_uarch_info->uarch ;(Type of CPU microarchitecture )
Cpuinfo: sibling_package_leader_id ->package_leader_id -> processors[sibling].package_leader_id
cpuinfo_arm_linux_init->cpuinfo_linux_detect_core_siblings->cluster_siblings_parser
static bool cluster_siblings_parser(
uint32_t processor, uint32_t siblings_start, uint32_t siblings_end,
struct cpuinfo_arm_linux_processor* processors)
{
processors[processor].flags |= CPUINFO_LINUX_FLAG_PACKAGE_CLUSTER;
uint32_t package_leader_id = processors[processor].package_leader_id;
for (uint32_t sibling = siblings_start; sibling < siblings_end; sibling++) {
if (!bitmask_all(processors[sibling].flags, CPUINFO_LINUX_FLAG_VALID)) {
continue;
}
const uint32_t sibling_package_leader_id = processors[sibling].package_leader_id;
if (sibling_package_leader_id < package_leader_id) {
// package_leader_id = sibling_package_leader_id;
}
// processors[sibling].package_leader_id = package_leader_id;
processors[sibling].flags |= CPUINFO_LINUX_FLAG_PACKAGE_CLUSTER;
}
processors[processor].package_leader_id = package_leader_id;
return true;
}
Platform:android_arm64-android13 SOC: cortex A 4a55 and 4a76 Cpuinfo:core_siblings_list:0-7 HW:design 4a55 and 4a76 as a cluster on SOC and configure dtsi cpu 8core as a cluster.
Problem:Use benchmark tools test CPU performance,find cpuinfo parser cpu a76 core as a55 result in serious performance problem on Arm CPU
The cpuinfo call the main function as the following: https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/src/arm/linux/init.c#L34C37-L34C37
Use processors[0].package_leader_id update processors[sibling].package_leader_id(siblings value form 0 to 7). Result in processors[i].package_leader_id value is 0.(i form 0 to 7)
Due to processors[processor].package_leader_id (processor from 0 to 7) is 0, cluster_leader value is 0 all the time. https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/src/arm/linux/init.c#L368
https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/src/arm/linux/init.c#L383 arm_linux_processors[i].uarch = arm_linux_processors[cluster_leader].uarch;
Use arm_linux_processors[0].uarch update arm_linux_processors[i].uarch(i from o0 to 7),Reulst in
arm_linux_processors[i].uarch is the same as update arm_linux_processors[0].uarch which is a55(0x00300355 in the cpuinfo.h)
https://github.com/pytorch/cpuinfo/blob/76d5e8f5b563daa65340a60fce0e9aec73a936df/include/cpuinfo.h#L409
Please give me some advices. BR. Xiaotong