Open joschu opened 9 years ago
(Question from an outsider:) Can't you just link against whatever BLAS library is installed on the system (which will most often be the one that numpy also uses), and recommend that users install OpenBLAS?
That's what's typically done, e.g. by Numpy and Theano. And maybe CGT should do that, at least for now. A few reasons for the choice to download OpenBLAS:
VECLIB_MAXIMUM_THREADS=1 on Mac when CGT is using parallelism. It's easier for the user if we just use a bundled BLAS and let CGT handle the threading configuration.
Julia adds a suffix to the names of its BLAS routines (so that 32-bit and 64-bit BLAS routines don't get confused). A similar patch could be used to fix the problem for CGT, too:
https://github.com/JuliaLang/julia/commit/066825ebb3d450ccd1315122d1fd0e473f91798e
Cool, nice find! It definitely might make sense to customize the makefile, also to make sure we're not building functions we don't need (most of level 3)
A few reasons for the choice to download OpenBLAS:
Agreed, all of these are valid reasons to make OpenBLAS be the default for novice users. If it's not too difficult, advanced users could still be given a way to use an alternative BLAS library, taking care of disabling multi-threading themselves.
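To make "disabling multi-threading themselves" concrete, here is a minimal sketch of what an advanced user might do before importing anything BLAS-backed. `VECLIB_MAXIMUM_THREADS` is the variable mentioned in this thread; the other three are the usual OpenBLAS, MKL and OpenMP counterparts. The helper name `limit_blas_threads` is made up for illustration and is not part of CGT.

```python
import os

# Environment variables commonly read by BLAS backends to cap their threads.
# VECLIB_MAXIMUM_THREADS comes from this thread (Accelerate/vecLib on Mac);
# the rest are the usual OpenBLAS, MKL and OpenMP equivalents.
BLAS_THREAD_VARS = (
    "VECLIB_MAXIMUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OMP_NUM_THREADS",
)


def limit_blas_threads(n=1, env=None, overwrite=False):
    """Set the thread-count variables in `env` (default: os.environ).

    Returns the names of the variables that were actually changed.
    Existing values are kept unless overwrite=True.
    """
    env = os.environ if env is None else env
    changed = []
    for name in BLAS_THREAD_VARS:
        if overwrite or name not in env:
            env[name] = str(n)
            changed.append(name)
    return changed


# Hypothetical usage: call this before importing numpy or CGT, since these
# variables are typically read when the BLAS library is loaded.
# limit_blas_threads(1)
# import numpy
```

By default the sketch leaves any value the user has already set untouched, so an explicit choice is not silently overridden.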
Properly configuring a library to find and use the right BLAS is surprisingly hard. NumPy still doesn't do a great job at it: it's rather painful to get NumPy to use your chosen BLAS, and lots of people are using the fallback built-in one.
Regarding the last point, at least on Ubuntu (and probably Debian) it's quite simple. When you're using packages from the Ubuntu repository, you can even use update-alternatives to switch between different installed BLAS libraries to be used by numpy and others, without recompiling anything. But sure, as a library developer it's hard to automatically find the correct BLAS library across multiple platforms.
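Since the alternatives system works through symlinks, one quick way to see which BLAS is currently selected is to follow the link to the real file. A minimal sketch; the library path in the comment is an assumption and varies with distribution and architecture, and `active_blas` is just an illustrative name.

```python
import os


def active_blas(path):
    """Return the real file that `path` points to after following symlinks."""
    return os.path.realpath(path)


# Assumed location on a 64-bit Ubuntu system; adjust for your setup.
# print(active_blas("/usr/lib/x86_64-linux-gnu/libblas.so.3"))
```

This only shows what the system-wide link resolves to. It says nothing about a BLAS that a Python package bundles or links against statically.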
It definitely might make sense to customize the makefile
What if I want to or need to customize the OpenBLAS build myself, e.g., because the architecture auto-detection does not work for my CPU? Even if you decide to only support OpenBLAS, you may want to allow users to point CGT at an OpenBLAS build somewhere on their system. (This means if you add a function name suffix, that one should be configurable as well.)
(Note that I'm just depicting what I'd like as a user here, without understanding the implications on the development side.)
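As a sketch of what that request could look like from the user's side: the library path and the symbol suffix are both read from the environment, with the bundled OpenBLAS as the fallback. The variable names `CGT_BLAS_LIB` and `CGT_BLAS_SUFFIX` and all three helpers are hypothetical, and the sketch assumes the suffix is simply appended to the exported routine name.

```python
import ctypes
import os


def blas_config(env=None):
    """Read the (hypothetical) CGT_BLAS_LIB / CGT_BLAS_SUFFIX settings."""
    env = os.environ if env is None else env
    return {
        "lib": env.get("CGT_BLAS_LIB") or None,  # None: use the bundled OpenBLAS
        "suffix": env.get("CGT_BLAS_SUFFIX", ""),
    }


def symbol_name(base, suffix=""):
    """Name under which `base` is exported by a build that uses `suffix`."""
    return base + suffix


def load_symbol(base, config):
    """Load the configured library and look up the (suffixed) routine."""
    if config["lib"] is None:
        raise RuntimeError("no external BLAS configured")
    lib = ctypes.CDLL(config["lib"])
    return getattr(lib, symbol_name(base, config["suffix"]))
```

With `CGT_BLAS_SUFFIX=_cgt`, for example, `symbol_name("cblas_dgemm", "_cgt")` gives `cblas_dgemm_cgt`, which would no longer collide with the `cblas_dgemm` that numpy's BLAS exports.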
CGT downloads and installs OpenBLAS. But then when numpy gets imported, the functions from the BLAS it is linked against (e.g. cblas_dgemm) override some of the functions from OpenBLAS. I noticed this when I found that setting VECLIB_MAXIMUM_THREADS changes the behavior of CGT's matrix multiplication. This behavior doesn't seem to cause any serious bugs, but it partly defeats the purpose of using OpenBLAS, which is to obtain consistent behavior with regard to multithreading and so forth.