Closed nouiz closed 11 years ago
Good to know, thanks for sharing. I just spent quite a bit of time getting the new GPU AMI out using ATLAS (see https://github.com/jtriley/StarCluster/issues#issue/9) so I'll probably wait on this. I'll give it a shot in my spare time and will consider replacing ATLAS with GOTO in the future if everything goes well.
just curious, have you tried linking numpy/scipy with gotoblas/lapack?
No,
But when I do, I will give you the recipe I received.
@nouiz did you ever try linking numpy/scipy with gotoblas/lapack? I'm going to be creating new 11.04 AMIs soon and am curious if gotoblas is the way to go this time? No worries if not.
Hi,
There is now OpenBLAS[1], which is one continuation of the BSD-licensed GotoBLAS. It includes a few installation and code bug fixes, so I would recommend this version.
To my knowledge there is no release file; you must check out the git repo. Releases are tagged, so just update to the latest release tag to get one.
I tried quickly to use it, but failed... What you could try is a modified version of this in the site.cfg file:
echo "
[atlas]
library_dirs = /opt/lisa/byhost/atlas/lib
include_dirs = /opt/lisa/byhost/atlas/include
atlas_libs = f77blas, cblas, atlas" > site.cfg
When I manually added the numpy/scipy blas/lapack sections to site.cfg, they seemed to have lower priority than the automatically detected ATLAS, which I didn't want to use... So using the atlas section instead could solve this, but I haven't tried it.
[1] https://github.com/xianyi/OpenBLAS
If you succeed, it would be great to send a short email to the numpy/scipy mailing list... I think many people would love this.
Last time I experimented with this ages ago it did not go well. I have not tried the new OpenBLAS and do not plan to given that Ubuntu now makes it easy to recompile atlas for optimizations and create a nice deb from it. Not to mention that ATLAS actually works and is tested with NumPy/SciPy. Closing unless someone else tries it, tests it, and can confirm this is worthwhile doing.
I believe @nouiz has done more recent testing of OpenBLAS on Ubuntu, given the discussion on Theano. Fred, is that correct?
There has been an openblas package since Ubuntu 11.10, so you don't need to compile it; you just need to link against it.
I think it is built like MKL: the dynamic library contains code optimized for many CPUs and selects the best one at load time.
Here are the Theano installation instructions for it: https://deeplearning.net/software/theano/install_ubuntu.html#install-ubuntu
For openblas only: sudo apt-get install libopenblas-dev
After receiving your notes I decided to give NumPy w/ OpenBLAS another shot and I got it to work! I used the following site.cfg:
[blas]
include_dirs = /openblas/prefix/include
library_dirs = /openblas/prefix/lib
blas_libs = openblas
[lapack]
include_dirs = /openblas/prefix/include
library_dirs = /openblas/prefix/lib
lapack_libs = openblas
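For anyone scripting this step, here is a minimal Python sketch that writes the same two sections for a given OpenBLAS prefix. The helper name render_site_cfg is made up for illustration; only the section and key names come from the site.cfg above.

```python
import configparser
import io


def render_site_cfg(prefix):
    """Return site.cfg text with [blas] and [lapack] both pointing at one OpenBLAS prefix."""
    cfg = configparser.ConfigParser()
    for section, libs_key in (("blas", "blas_libs"), ("lapack", "lapack_libs")):
        cfg[section] = {
            "include_dirs": prefix + "/include",
            "library_dirs": prefix + "/lib",
            libs_key: "openblas",
        }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # /openblas/prefix is the placeholder used above; substitute the real install prefix.
    print(render_site_cfg("/openblas/prefix"))
```

Redirecting the output into site.cfg in the NumPy source directory should give the same file as the hand-written version.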
Unfortunately I also needed one other hack in order to get NumPy's _dotblas.so module to build which I luckily discovered from looking at the Gentoo Linux ebuild for NumPy 1.6.2 (Thank youuuu Gentoo :D):
# make sure _dotblas.so gets built
$ cd $NUMPY_SRC
$ sed -i -e '/NO_ATLAS_INFO/,+1d' numpy/core/setup.py
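In case the sed expression is not obvious: '/NO_ATLAS_INFO/,+1d' deletes every line containing NO_ATLAS_INFO together with the line right after it. Below is a rough Python equivalent, just to illustrate the semantics. The function name and the sample lines are invented for the example and are not the real contents of numpy/core/setup.py.

```python
def drop_match_and_next(lines, needle="NO_ATLAS_INFO"):
    """Mimic sed '/needle/,+1d': drop each matching line and the line that follows it."""
    kept = []
    skip_next = False
    for line in lines:
        if skip_next:
            # This is the "+1" line of the range; it is dropped even if it also matches.
            skip_next = False
            continue
        if needle in line:
            skip_next = True
            continue
        kept.append(line)
    return kept


# Hypothetical input, only to show which lines survive.
sample = [
    "before",
    "if something: # NO_ATLAS_INFO check",
    "    return None",
    "after",
]
print(drop_match_and_next(sample))
```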
After installing the site.cfg and running the sed magic from Gentoo I was able to use numpy.dot with all cores utilized (square matrices, N=6000) and numpy.test() passed successfully. I haven't tried with SciPy but this looks very promising.
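A rough sketch of the kind of check described above, assuming NumPy is importable. The function name time_dot and the default size are my own choices; the test above used N=6000, which needs close to 1 GB of RAM for the three matrices, so a smaller default is used here. Watching top or htop while it runs shows whether more than one core is busy.

```python
import time

import numpy as np


def time_dot(n=2000, repeats=3):
    """Return the best wall-clock time in seconds for an n x n numpy.dot."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Prints the BLAS/LAPACK build information NumPy was configured with.
    np.show_config()
    print("best time for n=2000: %.3f s" % time_dot())
```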
Next I tried launching an Ubuntu 12.04 AMI, installing libopenblas-dev and python-numpy, and using update-alternatives to switch the blas implementation to OpenBLAS. This worked; however, the OpenBLAS install is limited to only two threads, likely because the machine the package was originally built on only had 2 cores. So, unfortunately, the Ubuntu OpenBLAS package still needs rebuilding. It needs to have max threads set to something higher (64?) and also use the dynamic target in order to accommodate all supported platforms and take advantage of all cores on the system.
I'll do some more testing with SciPy and report back. If all goes well I'm seriously considering swapping ATLAS with OpenBLAS unless others can provide valid reasons not to.
If I could figure out a way to build ATLAS so that it can run on multiple platforms and still take advantage of multiple cores then I'd likely just stick with ATLAS. OpenBLAS was incredibly easy to configure and build to run on multiple platforms and utilize multiple cores. That combined with comparable performance is why I'm leaning towards OpenBLAS.
Scratch that, the Ubuntu build does support all platforms but still has the max threads (2) limitation. Unfortunately setting both OPENBLAS_NUM_THREADS and OMP_NUM_THREADS doesn't have any effect either...
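When debugging this sort of thing it helps to record what was actually requested versus what the machine has. A small stdlib-only sketch follows; the helper name blas_thread_settings is invented here. It only reports the settings and does not query OpenBLAS itself, so it cannot show a thread cap that was compiled into the library.

```python
import os


def blas_thread_settings(environ=None):
    """Collect the CPU count and the thread-related variables mentioned above."""
    environ = os.environ if environ is None else environ
    return {
        "cpu_count": os.cpu_count(),
        "OPENBLAS_NUM_THREADS": environ.get("OPENBLAS_NUM_THREADS"),
        "OMP_NUM_THREADS": environ.get("OMP_NUM_THREADS"),
    }


if __name__ == "__main__":
    for name, value in blas_thread_settings().items():
        print(name, "=", value)
```

If both variables are set and only two cores are ever busy, that would be consistent with the limit coming from how the package was built rather than from the environment.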
I was also able to build SciPy successfully against OpenBLAS and tested it using the hessenberg function. Looks like all but one or two tests pass but this seems normal for every SciPy build I've ever encountered :P
For reference, here is a related post from sklearn: https://github.com/scikit-learn/scikit-learn/pull/766
@npinto Fortunately in the end I was able to use the Ubuntu packages for NumPy/SciPy without needing to rebuild them with the NO_ATLAS_INFO magic so I don't think that post is a problem here (although I had to use that magic when building OpenBLAS/NumPy in a virtualenv on Gentoo). I simply rebuilt Ubuntu's OpenBLAS package and updated the blas implementation to point to OpenBLAS via update-alternatives. The details are here (see main()):
https://github.com/jtriley/StarCluster/blob/ubuntu-12.04-sc-ami-builder/utils/scimage.py
Do you see any issues? (btw still need to automate the update-alternatives commands in scimage.py)
OpenBLAS is now included in the latest 12.04 StarCluster AMIs and replaces ATLAS. See:
$ starcluster listpublic
I just learned that GotoBLAS has been released under a BSD license[1]. It is faster than ATLAS in many cases[2].
So if you have time you could replace ATLAS with it.
[1] http://www.tacc.utexas.edu/tacc-projects/gotoblas2/
[2] http://dirk.eddelbuettel.com/blog/2010/09/15/#gcbd_0.2.2
p.s. I don't use starcluster for now, but when I have time, I will look more into it. So this is not a real feature request, but more a suggestion.