zhihaoy closed this issue 8 months ago
Ref #3736, which is objectively different but spiritually related in the sense that /proc
diverges on CPU core/thread details (I gather you are having the "opposite" problem).
This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.
Thank you!
Windows build number (output of ver at a Windows Command Prompt): Microsoft Windows [Version 10.0.17134.590]
My machine has an Intel i7-8750H, a 6-core CPU, but NumPy and simple OpenBLAS programs attempt to use 12 threads, and the performance is horrible.
12 is the number of hyper-threads, but hyper-threads are of no use to numeric software, so such libraries avoid scheduling threads on the same physical core. In WSL, however, OpenBLAS does not do that because WSL's
/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_map
is missing; the cpu folders are incomplete, and OpenBLAS checks for this file: https://github.com/xianyi/OpenBLAS/blob/ce3651516f12079f3ca2418aa85b9ad571c3a391/driver/others/init.c#L112
Running setenv OPENBLAS_NUM_THREADS 6 (or a lower value) gives reasonable performance.
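For anyone who wants to confirm the same symptom and apply the workaround from Python, here is a minimal sketch. It is not taken from OpenBLAS. The helper names cpus_missing_shared_cpu_map and cap_openblas_threads are made up for illustration, and the limit of 6 is just the physical core count of the i7-8750H reported above. It assumes the variable is set before NumPy is imported, since OpenBLAS picks up its thread count when the library loads.

```python
import os
import re
from pathlib import Path


def cpus_missing_shared_cpu_map(sys_cpu_root="/sys/devices/system/cpu"):
    """Return the cpuN directory names that have no cache/index*/shared_cpu_map file.

    Hypothetical helper: it only looks for the sysfs file named in this report.
    """
    root = Path(sys_cpu_root)
    missing = []
    if not root.is_dir():
        return missing
    for cpu_dir in sorted(root.iterdir()):
        # Skip siblings such as cpufreq or cpuidle; keep only cpu0, cpu1, ...
        if not re.fullmatch(r"cpu\d+", cpu_dir.name):
            continue
        if not any(cpu_dir.glob("cache/index*/shared_cpu_map")):
            missing.append(cpu_dir.name)
    return missing


def cap_openblas_threads(limit, env=os.environ):
    """Set OPENBLAS_NUM_THREADS to limit unless it is already set; return the value in effect."""
    env.setdefault("OPENBLAS_NUM_THREADS", str(limit))
    return env["OPENBLAS_NUM_THREADS"]


if __name__ == "__main__":
    print("cpu dirs without shared_cpu_map:", cpus_missing_shared_cpu_map())
    # 6 = physical cores on the machine in this report; adjust for other CPUs.
    print("OPENBLAS_NUM_THREADS =", cap_openblas_threads(6))
    # Import NumPy only after this point so the cap is seen when OpenBLAS loads.
```

Because cap_openblas_threads uses setdefault, a value already exported in the shell is left alone, so the script will not override a deliberate setting.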