haskell-numerics / hmatrix

Linear algebra and numerical computation

option for linking to Intel MKL (and perhaps NVBLAS) #196

Open mmaz opened 8 years ago

mmaz commented 8 years ago

Hi, I am on OSX 10.11 and I would like to be able to link to Intel MKL, which is now available with no fees or restrictions, for the performance benefits it offers. For instance, as a small test I compared the results of 52 calls to eig on 52 distinct complex matrices of size 500x500 from one of my datasets, and after linking to MKL, my 2013 laptop (2.8 GHz quad-core i7) computed all the results in 39 seconds, vs 1.4 minutes for the same example when linked to OpenBLAS.

Would you advise adding flags to hmatrix.cabal that support linking to MKL? Similarly, though I have not tested it, what about NVBLAS? The slides at http://bobkonf.de/2015/slides/thielemann.pdf suggest exploring a link to NVBLAS:

NVBLAS even moves Hmatrix computations to GPU

NVBLAS implements a subset of BLAS, so linking against both is necessary. From http://docs.nvidia.com/cuda/nvblas/index.html#Usage:

To use the NVBLAS Library, the user application must be relinked against NVBLAS in addition to the original CPU Blas (technically only NVBLAS is needed unless some BLAS routines not supported by NVBLAS are used by the application). To be sure that the linker links against the exposed symbols of NVBLAS and not the ones from the CPU Blas, the NVBLAS Library needs to be put before the CPU Blas on the linkage command line.
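I have not tried this, but based on the NVIDIA documentation above, the required link order could perhaps be expressed in hmatrix.cabal roughly like this (the library names and the CUDA path here are assumptions on my part, not something I have tested):

```
    extra-libraries: nvblas openblas
    extra-lib-dirs:  /usr/local/cuda/lib
```

Listing nvblas first is meant to put it before the CPU BLAS on the link line, as the NVIDIA documentation requires.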

For my MKL test, following the example in hmatrix-sparse.cabal, I cloned base/hmatrix.cabal and modified the relevant linking section to:

    if os(OSX)
      extra-libraries: mkl_intel mkl_sequential mkl_core
      extra-lib-dirs:  /opt/intel/mkl/lib/
      include-dirs:    /opt/intel/mkl/include/

Running otool (similar to ldd) on the binary shows that the MKL libraries were linked:

$ otool -L $(stack path --local-install-root)/bin/myexecutable
...
    @rpath/libmkl_intel_lp64.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libmkl_sequential.dylib (compatibility version 0.0.0, current version 0.0.0)
    @rpath/libmkl_core.dylib (compatibility version 0.0.0, current version 0.0.0)

But I also had to export DYLD_LIBRARY_PATH=/opt/intel/mkl/lib/ to resolve those paths, so I am not sure what to change in hmatrix.cabal so that the MKL libraries are referenced by absolute paths rather than @rpath-relative ones. As a newcomer to MKL, I am also uncertain how to correctly configure MKL's threading options (via environment variables or from within hmatrix), and which other libraries may need to be linked (e.g., ilp64 vs lp64, mkl_intel_thread, etc.).
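For now my workaround looks like this (a sketch; the threading variables are assumptions on my part, and as far as I understand they only matter if mkl_intel_thread is linked instead of mkl_sequential):

```shell
# Workaround: let dyld find the MKL dylibs at run time.
export DYLD_LIBRARY_PATH=/opt/intel/mkl/lib

# Only meaningful when linking mkl_intel_thread instead of mkl_sequential:
export MKL_NUM_THREADS=4   # cap MKL's worker threads
export MKL_DYNAMIC=FALSE   # forbid MKL from lowering the thread count itself
```

A cleaner alternative on OS X might be to bake the search path into the executable with `install_name_tool -add_rpath /opt/intel/mkl/lib <binary>`, so the `@rpath/libmkl_*.dylib` references resolve without any environment variable, but I have not verified this.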

amigalemming commented 8 years ago

MKL needs 39 seconds and OpenBLAS only 1.4 seconds for the same computations? Why support MKL then?

mmaz commented 8 years ago

@amigalemming I apologize for my typo! I have edited the issue: the difference is 39 seconds for MKL vs 1.4 minutes for OpenBLAS :)

amigalemming commented 8 years ago

I see. How does ATLAS perform?

TsumiNa commented 7 years ago

@amigalemming I agree with what @mmaz said. MKL is now free to use (I'm on an academic license). In fact, other languages such as Python and R have also chosen MKL as their default numerical computation backend because of its performance:

Performance Comparison of OpenBLAS* and Intel® Math Kernel Library in R http://www.parallelr.com/r-hpac-benchmark-analysis/