Closed marcelo-alvarez closed 3 years ago
Thanks for highlighting "assuming that the Intel environment and compilers will be used on cori". For portability we should not tie ourselves to a specific compiler suite. Additionally, I think our desiconda default is gcc-based.
Additionally, with Perlmutter AMD CPUs (and Apple M1 laptops...) on the horizon with uncertain MKL support, can we construct this in a way where specex is calling BLAS / LAPACK directly, regardless of whether it is MKL or something else that is providing those routines?
The original specex Makefile got around trying to figure out these details by using the harpconfig
tool installed with HARP. Basically it was using the configure information from HARP to set compilers, lapack linking, boost linking, etc.
With the new cmake build system, you might consider looking for MKL explicitly first (using it if found) and if not found then look for lapack/blas. Here is how we are doing that in another project:
Then, when building with Intel compilers we let it find MKL and use it:
https://github.com/hpc4cmb/toast/blob/master/platforms/cori-intel.sh
But when building with gcc, we pass -DMKL_DISABLED=TRUE
to disable MKL checks and force it to find our other blas / lapack:
https://github.com/hpc4cmb/toast/blob/master/platforms/cori-gcc.sh
The TOAST example cited above indicates that it is possible to have different options for the LAPACK/BLAS library at compile time. I suggest we wait until this branch works with the default desiconda compiler to merge.
The code now compiles as before using the gcc/cray compilers that are the default for desiconda on cori, by linking and including mkl libraries and headers in the standard location, i.e. /opt/intel/mkl.
In addition, the MKLBLAS macro has been added to indicate whether MKL BLAS is being used, i.e. if it is defined then mkl.h
is included. Currently MKL is the only option, but the code is now written in a way where specex is calling BLAS directly, agnostic with respect to the specific BLAS implementation being used.
Thanks. When testing a build at KPNO for generic linux boxes without MKL, I get
python setup.py build_ext --inplace
running build_ext
cmake /n/home/datasystems/users/sjbailey/code/specex -DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/n/home/datasystems/users/sjbailey/code/specex/py/specex -DCMAKE_BUILD_TYPE=Release
-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Python: /software/datasystems/desiconda/20200924/conda/bin/python3.8 (found suitable exact version "3.8.3") found components: Interpreter
-- Found PythonInterp: /software/datasystems/desiconda/20200924/conda/bin/python3.8 (found suitable exact version "3.8.3")
-- Found PythonLibs: /software/datasystems/desiconda/20200924/conda/lib/libpython3.8.so (found suitable exact version "3.8.3")
-- Found PythonInterp: /software/datasystems/desiconda/20200924/conda/bin/python3.8 (found version "3.8.3")
-- Found PythonLibs: /software/datasystems/desiconda/20200924/conda/lib/libpython3.8.so
-- Performing Test HAS_CPP14_FLAG
-- Performing Test HAS_CPP14_FLAG - Failed
-- Performing Test HAS_CPP11_FLAG
-- Performing Test HAS_CPP11_FLAG - Success
-- pybind11 v2.2.0
-- Configuring done
-- Generating done
-- Build files have been written to: /n/home/datasystems/users/sjbailey/code/specex/build/temp.linux-x86_64-3.8
cmake --build . --config Release -- -j8
Scanning dependencies of target _libspecex
[ 15%] Building CXX object CMakeFiles/_libspecex.dir/src/specex_psfpy.cc.o
[ 15%] Building CXX object CMakeFiles/_libspecex.dir/src/specex_pyio.cc.o
...
[ 93%] Building CXX object CMakeFiles/_libspecex.dir/src/deboosted/specex_tokens.cc.o
[ 96%] Building C object CMakeFiles/_libspecex.dir/src/specex_blas.c.o
/n/home/datasystems/users/sjbailey/code/specex/src/specex_blas.c:4:17: fatal error: mkl.h: No such file or directory
#include <mkl.h>
^
compilation terminated.
Is that a leftover that should be automagically swapped into place by the cmake system now? I got a similar error when trying to compile on Dirac. Please check and test on a non-NERSC system. Thanks.
It should now build without requiring special options and produce bitwise identical results to the master branch on cori and dirac (following the directions at https://desi.lbl.gov/trac/wiki/Pipeline/GettingStarted/Dirac for the latter). Please check that it builds on cori and dirac before merging, thanks.
Thanks! In the end this PR was more about the build system than the dot product itself, but it looks good. I confirmed that I can compile it on both dirac and cori, and I confirmed that running it on cori produces bitwise identical answers as 0.7.0 used in the Cascades run.
Intel MKL is called for the dot product (via
cblas_ddot
inspecex_mkllinalg.c
as called byspecex_dot
inspecex_linalg.cc
) instead ofboost::numeric::bindings::blas::dot
.Moreover, the compiler flag
-mkl
has been added toCMakeLists.txt
. On cori, this flag is only recognized by the Intel compilers, i.e.icc/icpc
.Local testing, comparing both the
master
andmkldot
branches -- built with theCC=icc
andCXX=icpc
variables set followed with the commandpython setup.py install --prefix ./code
as well as$PWD/code/lib/python3.8/site-packages/specex-0.7.0.dev637-py3.8-linux-x86_64.egg
added toPYTHONPATH
-- gives identical bitwise results for the output fits file.This branch is ready to be merged, assuming that the Intel environment and compilers will be used on cori, e.g. by setting
CC=icc and CXX=icpc
at compile time and having thePrgEnv-intel
environment loaded both at compilation and running.