Open hborchert opened 1 year ago
As far as I recall, it works when the development version of MKL is installed (at least on Linux):
conda install mkl-devel
What is the error message when the application crashes? Something like: "Intel MKL FATAL ERROR: Cannot load libmkl_avx512.so.1 or libmkl_def.so.1."?
This was also a problem on SeaWulf, since the main Python module loads all of Anaconda, including that annoying MPI.
On Mon, Jun 12, 2023 at 9:51 AM J. S. Kottmann @.***> wrote:
-- Robert J. Harrison
No error message at all (on macOS); the application just crashes before reading input, for example with "zsh: abort moldft".
I also confused MPI with MKL ... sorry for that. However, some of the issues might be related. I think I don't get the MPI problems in conda envs because I usually deactivate them. For MKL there are the following fixes, which might work for MPI as well:
Export the MKLROOT variable before the cmake configuration (it needs to be exported at runtime as well):
export MKLROOT=/opt/intel/mkl
The path needs to be adapted, depending on where MKL is installed. On clusters it often works to reload the MKL module after loading the Anaconda module; this often resets the MKLROOT variable.
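A quick way to confirm that MKLROOT really is exported and points somewhere sensible, before configuring and again before running, could look like the sketch below. The function name `check_mklroot` is made up for illustration; it only checks that the variable is set and names an existing directory, not that the MKL libraries inside are usable.

```shell
# Hypothetical helper: verify MKLROOT before configuring or running.
check_mklroot() {
    if [ -z "${MKLROOT:-}" ]; then
        echo "MKLROOT is not set" >&2
        return 1
    fi
    if [ ! -d "$MKLROOT" ]; then
        echo "MKLROOT=$MKLROOT is not a directory" >&2
        return 1
    fi
    # Report the value that cmake and the application will see.
    echo "MKLROOT=$MKLROOT"
}
```

It can be chained in front of the configure step, e.g. `check_mklroot && cmake ....`, so a value reset by a module load is noticed early.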
There is also MPI_ROOT (with an underscore here, as far as I know), which might do the same trick in this case -- although I assume MPI is trickier. On clusters I would try reloading the MPI module and hope for the best.
Another way is to set the paths to mpicxx and mpicc manually:
cmake -D MPI_C_COMPILER=/path/to/bin/mpicc -D MPI_CXX_COMPILER=/path/to/bin/mpicxx ....
And when running, call the matching mpirun explicitly:
/path/to/bin/mpirun -n 1 moldft
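To see which MPI wrappers the shell would pick up before passing explicit paths to cmake, something like the following might help. This is only a sketch: the function names are hypothetical, and the check assumes that a conda install lives under a directory whose name contains "conda" (e.g. anaconda3 or miniconda3), which will not hold for every setup.

```shell
# Heuristic (assumption): conda installs live under a directory whose
# name contains "conda", e.g. ~/anaconda3 or ~/miniconda3.
is_conda_path() {
    case "$1" in
        *conda*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Print where each MPI tool resolves on PATH and flag conda-provided ones.
report_mpi_tools() {
    for tool in mpicc mpicxx mpirun; do
        tool_path=$(command -v "$tool" 2>/dev/null) || { echo "$tool: not found"; continue; }
        if is_conda_path "$tool_path"; then
            echo "$tool: $tool_path (looks like conda)"
        else
            echo "$tool: $tool_path"
        fi
    done
}
```

Running `report_mpi_tools` with and without the conda environment active shows whether cmake is likely to find the Anaconda copies first.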
Installing MADNESS on a machine with Anaconda installed leads to cmake finding MPI_C/MPI_CXX in the Anaconda directory, and the resulting applications crash upon start. Anaconda needs to be deactivated (conda deactivate) before installing/running MADNESS applications.
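A small guard along these lines could be put in front of the build or run commands so that an active conda environment is caught early. It is a sketch only: the function names are made up, and it relies on CONDA_PREFIX, which conda sets on activation and, as far as I know, unsets once no environment (including base) is active.

```shell
# True if a conda environment (including base) appears to be active.
conda_env_active() {
    [ -n "${CONDA_PREFIX:-}" ]
}

# Hypothetical guard: refuse to continue while a conda environment is active.
require_no_conda() {
    if conda_env_active; then
        echo "conda environment active: $CONDA_PREFIX (run 'conda deactivate' first)" >&2
        return 1
    fi
}
```

For example, `require_no_conda && cmake ....` stops before configuring if Anaconda is still active. Note that deactivating a named environment may drop back to base, so `conda deactivate` may need to be run more than once.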