OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

Link to which library confusion #2736

Closed johnduffymsc closed 4 years ago

johnduffymsc commented 4 years ago

Hi

In my Ubuntu 20.04 /usr/lib/aarch64-linux-gnu/openblas-serial directory I have...

-rw-r--r-- 1 root root  8435720 Jun  2 10:55 libblas.so.3
-rw-r--r-- 1 root root 13712616 Jun  2 10:55 liblapack.so.3
-rw-r--r-- 1 root root 14269784 Jun  2 10:55 libopenblas-r0.3.8.so
lrwxrwxrwx 1 root root       21 Jun  2 10:55 libopenblas.so.0 -> libopenblas-r0.3.8.so

What is the difference between the libblas.so.3 and libopenblas-r0.3.8.so? I note they are not the same size.

I have a horrible suspicion that, through the update-alternatives mechanism, I have been linking to a reference BLAS library (the libblas.so.3) and not to OpenBLAS (the libopenblas.so.0 -> libopenblas-r0.3.8.so).

Kind regards

martin-frbg commented 4 years ago

Quite possible - update-alternatives --list blas (and the same for "lapack") should tell. There is nothing horrible about the reference implementation, by the way, except that performance may be a bit worse.
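Since update-alternatives works by managing symlinks, another way to see what a library name ends up pointing at is to follow the link chain. The following is a minimal sketch, not part of the thread: the helper names are hypothetical, the candidate path is an assumed typical Ubuntu arm64 location, and treating "openblas" in the resolved path as evidence of OpenBLAS is only a heuristic.

```python
import os


def resolve_library(path):
    """Follow every symlink in `path` and return the file it finally points at."""
    return os.path.realpath(path)


def looks_like_openblas(path):
    """Heuristic (an assumption): the resolved path mentions 'openblas'."""
    return "openblas" in resolve_library(path).lower()


if __name__ == "__main__":
    # Assumed location of the alternatives-managed link; adjust for your system.
    candidate = "/usr/lib/aarch64-linux-gnu/libblas.so.3"
    print(candidate, "->", resolve_library(candidate))
    print("looks like OpenBLAS:", looks_like_openblas(candidate))
```

If the resolved path does not mention OpenBLAS at all, that would be consistent with the suspicion that the reference implementation is in use.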

johnduffymsc commented 4 years ago

Thank you Martin

I made a small tweak to cpuid_arm64.c, increasing L2_SIZE from 524,288 to 1,048,576 to reflect the 1MB of L2 cache on my Raspberry Pi 4, rebuilt OpenBLAS, and it made no difference to performance at all. This would make sense if I have been linking to reference BLAS.

Out of interest, in your opinion, would a change in L2 size from 0.5MB to 1MB make a noticeable performance improvement in OpenBLAS for benchmarks such as Linpack?

Kind regards

brada4 commented 4 years ago

All cache parameters are per-core. You have 1MB of cache for 4 cores, so to avoid spills to main memory you have to halve the parameter to get any improvement, or go even lower if you have other processes, such as a GUI, on the same cores.
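The arithmetic behind "halve the parameter" can be sketched as follows. This is one reading of the suggestion, not something the thread confirms: the function name is hypothetical, and the numbers are the ones quoted in this thread (1MB of L2, 4 cores, and the original L2_SIZE of 524,288).

```python
def per_core_share(shared_cache_bytes, cores):
    """Split a shared cache evenly across cores (integer bytes)."""
    return shared_cache_bytes // cores


PI4_L2_BYTES = 1_048_576   # 1MB of L2 on the Raspberry Pi 4, as stated above
PI4_CORES = 4              # number of cores sharing that L2
ORIGINAL_L2_SIZE = 524_288  # value in cpuid_arm64.c before the tweak

share = per_core_share(PI4_L2_BYTES, PI4_CORES)
print(share)                           # 262144
print(share == ORIGINAL_L2_SIZE // 2)  # True: the even share is half the original value
```

Under this reading, raising L2_SIZE to 1,048,576 moves the value away from the per-core share instead of towards it.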

You can determine which libraries are actually loaded with the ldd command, or with the more stylish lddtree wrapper.
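A small sketch of how the ldd output could be filtered down to the BLAS and LAPACK entries. The function names are hypothetical, and the parsing assumes the usual "name => path (address)" line format printed by ldd on Linux.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to the path it resolves to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # e.g. linux-vdso or the dynamic loader line
        name, _, rest = line.partition("=>")
        # Drop the trailing load address, e.g. " (0x0000ffff9a000000)".
        resolved[name.strip()] = rest.strip().split(" (")[0].strip()
    return resolved


def blas_entries(output):
    """Keep only entries whose library name mentions blas or lapack."""
    return {
        name: path
        for name, path in parse_ldd(output).items()
        if "blas" in name.lower() or "lapack" in name.lower()
    }


def run_ldd(binary):
    """Run ldd on a binary and return its text output."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, blas_entries(run_ldd("./xhpl")) would list the BLAS libraries of a binary named ./xhpl (a placeholder name). The path ldd prints may itself be a symlink, so it still needs to be followed to its final target.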

johnduffymsc commented 4 years ago

Thanks, I sort of understand that...

L1_SIZE is per core, OK. L2_SIZE is per core? Are you saying this should be read as "L2_SIZE_DIVIDED_BY_CORE_COUNT"?

Kind regards

brada4 commented 4 years ago

The best place to look for that is the x86 tuning, where the specs are easy to find on the web.

There was once a discussion here about tuning down the assumed cache size, either to the most popular SoC or to the minimal silicon configuration available. Please try tuning it down, and then tune it down again by 10% from what is in the spec (to accommodate context switches behind the scenes), and see if there is any improvement. If you get a steady gain of even a few percent, please tell.

It is not divided by anything; it pre-dates multi-core CPUs. There is no universally divisible cache size, e.g. for Haswell you have 2-22 cores with 0.5 - 2.5 MB per-core caches. So naturally we start a thread assuming that, with the help of the kernel scheduler, it lands on a core with particular characteristics. This is a somewhat simplified description, but it explains a great part of what happens when you call dgemm_ or the like.
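As a rough illustration of the "tune down 10% from spec" step, here is a minimal sketch that produces a candidate value to try. The function name is hypothetical, and rounding down to a multiple of 4096 bytes is purely an assumption made for tidy numbers; the thread does not say any alignment is needed.

```python
def tuned_down(size_bytes, percent=10, align=4096):
    """Reduce size_bytes by `percent`, then round down to a multiple of `align`."""
    reduced = size_bytes * (100 - percent) // 100
    return (reduced // align) * align


if __name__ == "__main__":
    spec_per_core = 262_144  # assumed starting point: 1MB shared by 4 cores
    print(tuned_down(spec_per_core))           # 233472
    print(tuned_down(spec_per_core, align=1))  # 235929 (no alignment)
```

Each candidate would still have to be benchmarked; the sketch only produces numbers to try, not a recommendation.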

johnduffymsc commented 4 years ago

Thank you, I will do some further reading as you suggest.

Kind regards

brada4 commented 4 years ago

If you manage to get some measurements with numbers for tuning the cache parameters, please share; even a pull request would be welcome.

johnduffymsc commented 4 years ago

Will do!

I'm running some Linpack benchmarks with the Ubuntu packaged OpenBLAS at the moment. These take some time... When they are complete I'm going to do the same with the L2 cache tweak. I will report back in a week or so.


