Open nihui opened 6 months ago
Just that it cannot tell apart haswell from zen
Thanks, interesting project for sure. (Though we tend to use cpuinfo and similar only for direct identification of the CPU model. I'm not sure whether instruction trapping offers an advantage over querying CPU capability registers for instruction set extensions?)
https://github.com/nihui/ruapu?tab=readme-ov-file#features
ruapu is not intended to replace cpuinfo or the register-based method of obtaining information; it is a complementary detection method. Its main purpose is to provide a unified way of detecting features where conventional methods such as cpuinfo are not available, for example on the Windows Arm platform or when detecting RISC-V vendor extensions.
ruapu currently cannot identify the CPU core architecture, such as Skylake, Zen 3, or Cortex-A75. I plan to complete the CPU ISA extension detection first, and then add other information as needed.
You always need CPUID bits. https://en.wikipedia.org/wiki/FMA_instruction_set#CPUs_with_FMA4
I must admit I am not aware of the situation around Windows on Arm; I am currently waiting for a CI solution to become available for that platform. But from what I've seen, it would probably be sufficient for OpenBLAS to support a generic ARMV8 target, and possibly detect SVE availability (later). Finding out RISC-V extensions, in particular the presence (and version) of vector support, would indeed be a valuable feature, since there appears to be only sketchy support depending on device and Linux kernel version.
Hello
OpenBLAS uses operating-system-specific methods (parsing /proc/cpuinfo) and architecture-specific methods (x86 cpuid) to obtain the ISA extension information of the CPU at runtime and dynamically select the optimized code path.
In the neural network acceleration library ncnn ( https://github.com/Tencent/ncnn ), related strategies are also used, but these alone may not be enough to be compatible with more systems and architectures.
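For readers unfamiliar with the /proc/cpuinfo approach mentioned above, here is a minimal sketch of the string-matching step, assuming a Linux-style `flags : ...` line. It is not taken from OpenBLAS or ncnn; the function name `cpuinfo_line_has_flag` is hypothetical. The only point it illustrates is that a flag has to be matched as a whole token, otherwise `avx` would also match `avx2`.

```c
#include <string.h>

/*
 * Hypothetical helper (not from OpenBLAS or ncnn): returns 1 if `flag`
 * appears as a whole whitespace-separated token in a /proc/cpuinfo-style
 * line such as "flags : fpu sse sse2 avx", and 0 otherwise.
 */
int cpuinfo_line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *colon = strchr(line, ':');
    const char *p;

    if (n == 0)
        return 0;

    /* Only look at the value part after "key :" when a colon is present. */
    if (colon != NULL)
        line = colon + 1;

    p = line;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start and end on a token boundary. */
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1; /* keep searching after a partial match such as "avx" in "avx2" */
    }
    return 0;
}
```

A caller would read /proc/cpuinfo line by line and pass each `flags` (or, on Arm, `Features`) line to this helper. The approach depends entirely on the operating system exposing such a file and on the kernel knowing the extension's name.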
Therefore, I recommend integrating ruapu ( https://github.com/nihui/ruapu ) into OpenBLAS. ruapu is implemented as a single C header. It detects CPU ISA extension support by executing an instruction from the extension and catching the resulting SIGILL if the instruction is not supported. This works on many operating systems, such as Linux, Windows, and macOS, and can detect support more directly and accurately. Sometimes /proc/cpuinfo or x86 cpuid may lie to us ;)
Comments are welcome on whether ruapu is suitable for the project, as are any other suggestions.
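To make the SIGILL-capture idea concrete, here is a minimal POSIX sketch of the general technique. It is not ruapu's actual code, and all names in it (`runs_without_sigill`, `probe_nop`, `probe_fake_unsupported`) are hypothetical. A real probe would execute one instruction from the extension being tested, typically via inline assembly or a raw opcode; to stay architecture-neutral, the "unsupported" probe below simulates that with `raise(SIGILL)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used to escape from the SIGILL handler. */
static sigjmp_buf probe_env;

static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/*
 * Hypothetical helper: runs probe() with a temporary SIGILL handler.
 * Returns 1 if probe() completed, 0 if it raised SIGILL (or if the
 * handler could not be installed).
 */
int runs_without_sigill(void (*probe)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        ok = 1;
    }

    /* Put the previous SIGILL disposition back. */
    sigaction(SIGILL, &old, NULL);
    return ok;
}

/* Stand-in for an instruction the CPU supports. */
void probe_nop(void)
{
}

/* Stand-in for an unsupported instruction: simulated, not a real opcode. */
void probe_fake_unsupported(void)
{
    raise(SIGILL);
}
```

Calling `runs_without_sigill(probe_nop)` yields 1 and `runs_without_sigill(probe_fake_unsupported)` yields 0. Note that this only tells whether an instruction executes, not whether the vendor officially supports it, which is the concern raised in the FMA4 comment earlier in the thread. The sketch is also not thread-safe, since it uses a single global jump buffer and a process-wide signal handler.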