flame / blis

BLAS-like Library Instantiation Software Framework

DLL drop-in performance 10x lower than openBlas on Octave #675

Open tangjinchuan opened 2 years ago

tangjinchuan commented 2 years ago

Dear developers,

I am using the DLL from AMD (https://developer.amd.com/amd-optimizing-cpu-libraries_eula/) to test its compatibility and performance on Windows. The DLL is named AOCL-LibBlis-Win-dll.dll and is located in C:\Program Files\AMD\AOCL-Windows\amd-blis\lib\LP64. I renamed it to libblas.dll and placed it in C:\Program Files\GNU Octave\Octave-7.2.0\mingw64\bin, replacing the OpenBLAS build that was there under the same name. On an AMD 5950X, the following Octave commands run about 10x slower than with OpenBLAS.

a2 = ones(2000, 'single');
tic
for i = 1:100
  a2*a2;
end
toc
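To compare runs of this benchmark across BLAS builds, it can help to convert the elapsed time reported by toc into an approximate throughput. The following is a minimal Python sketch, assuming the usual estimate of about 2*n^3 floating-point operations per n-by-n matrix product. The function name gemm_gflops and the timings in the usage section are hypothetical and are not measurements from this thread.

```python
def gemm_gflops(n: int, repeats: int, elapsed_seconds: float) -> float:
    """Approximate GFLOP/s for `repeats` products of two n-by-n matrices."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Roughly n^3 multiplications plus n^3 additions per product.
    flops = 2.0 * n**3 * repeats
    return flops / elapsed_seconds / 1e9


if __name__ == "__main__":
    # Hypothetical elapsed times in seconds, for illustration only.
    for label, seconds in [("run A", 10.0), ("run B", 1.0)]:
        print(f"{label}: {gemm_gflops(2000, 100, seconds):.1f} GFLOP/s")
```

With n = 2000 and 100 repetitions the loop performs about 1.6e12 operations, so a 10x difference in elapsed time corresponds directly to a 10x difference in GFLOP/s.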

Thank you very much! Best wishes, Jinchuan

tangjinchuan commented 2 years ago

I just found the problem: the multithreaded version of the DLL (AOCL-LibBlis-Win-MT-dll.dll) should be used. With it, the result is indeed faster than OpenBLAS.

Reference: https://uk.mathworks.com/matlabcentral/answers/1672304-how-can-i-use-the-blas-and-lapack-implementations-included-in-amd-optimizing-cpu-libraries-aocl-wi

libiomp5: https://www.dll-files.com/download/3a7902626cddec83a3da541a96118b46/libiomp5md.dll.html?c=dFZMSXhUYkJWbGEyMEN3bXN0b
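The drop-in replacement described in this thread (copy a BLIS DLL into Octave's bin directory under the name libblas.dll) can be scripted so that the original file is kept as a backup. This is a rough Python sketch using only the standard library. The function name install_blas_dll and the .bak suffix are hypothetical, and the location of the multithreaded DLL is an assumption, since the thread only gives the directory of the single-threaded one.

```python
import shutil
from pathlib import Path


def install_blas_dll(source_dll, octave_bin, target_name="libblas.dll"):
    """Copy source_dll into octave_bin as target_name, backing up any existing file."""
    source_dll = Path(source_dll)
    target = Path(octave_bin) / target_name
    if not source_dll.is_file():
        raise FileNotFoundError(source_dll)
    backup = target.with_name(target.name + ".bak")
    # Keep the first original (for example the OpenBLAS build); never overwrite it.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(source_dll, target)
    return target


# Example call (paths are assumptions; adjust them to the local installation):
# install_blas_dll(
#     r"C:\Program Files\AMD\AOCL-Windows\amd-blis\lib\LP64\AOCL-LibBlis-Win-MT-dll.dll",
#     r"C:\Program Files\GNU Octave\Octave-7.2.0\mingw64\bin",
# )
```

Writing under C:\Program Files normally requires an elevated prompt. The libiomp5md.dll runtime linked above presumably also has to be somewhere Octave can load it, which this sketch does not handle.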