Closed lucasem1 closed 1 year ago
Hi @lucasem1 , how did you compile ELPA? You should use the same toolchain (e.g. oneAPI) for ELPA as for ABACUS.
Hi caic99
Here is the head of my config.log for ELPA:

```
../configure --prefix=/LIBS/INTEL_2020/install/elpa-2021.05.002 \
  FCFLAGS="-O3 -xCORE-AVX512" CFLAGS="-O3 -xCORE-AVX512" \
  --enable-option-checking=fatal \
  SCALAPACK_LDFLAGS="-L/exe_kosmos/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_intelmpi_lp64 -lpthread" \
  SCALAPACK_FCFLAGS="-I/exe_kosmos/intel/compilers_and_libraries_2020.2.254/linux/mkl/include/intel64/lp64" \
  --enable-avx2 --enable-avx512 FC=mpiifort CC=mpiicc
```
Do I need to recompile? If yes, how?
Thanks Luca Sementa
Hi @lucasem1 ,
Would you try

```
CC=mpiicc CXX=mpiicpc FC=mpiifort ../configure FCFLAGS="-mkl=cluster"
```

?
IMHO, ifort has built-in support for MKL via the `-mkl` flag, so we don't need to pass MKL explicitly as the ScaLAPACK implementation.
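For concreteness, a full reconfigure along those lines might look like the following sketch; the install prefix is a placeholder, and the optimization flags are carried over from the earlier command, so adjust both to your setup:

```shell
# Sketch: configure ELPA letting ifort pull in MKL (including
# ScaLAPACK/BLACS) via -mkl=cluster, instead of spelling out the
# MKL link line by hand. The prefix below is a placeholder.
CC=mpiicc CXX=mpiicpc FC=mpiifort ../configure \
  --prefix=$HOME/opt/elpa-2021.05.002 \
  FCFLAGS="-mkl=cluster -O3 -xCORE-AVX512" \
  CFLAGS="-O3 -xCORE-AVX512" \
  --enable-avx2 --enable-avx512
```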
See the wiki FYI. Installing ELPA is quite tricky; we are planning to make it optional.
Dear caic99
I did what you said, without success. Do you have any other suggestions? The failure always appears at the linking step.
Hi @lucasem1 , Would you share your building command for ABACUS?
```
export CC=mpiicc
export CXX=mpiicpc
export FC=mpiifort
cmake -DCMAKE_INSTALL_PREFIX=/home/.../CODES/INTEL/INTEL_2020/ABACUS/abacus300 \
  -DCEREAL_INCLUDE_DIR=/home/.../CODES/INTEL/INTEL_2020/ABACUS/cereal/include \
  -DELPA_LIBRARY=/home/.../LIB/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a \
  -DELPA_INCLUDE_DIR=/home/.../LIBS/INTEL_2020/install/elpa-2021.05.002/include/elpa-2021.05.002 ..
```
@lucasem1 Would you try replacing

```
-DELPA_LIBRARY=/home/.../LIB/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a
```

with

```
-DELPA_DIR=/home/.../LIB/INTEL_2020/install/elpa-2021.05.002/
```

?
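Put differently, the whole cmake call would become something like this sketch (all paths below are placeholders, since the originals are truncated above):

```shell
# Sketch: pass the ELPA install root via -DELPA_DIR and let the build
# system derive the library and include paths itself. Paths here are
# placeholders, not taken from the machine above.
cmake -DCMAKE_INSTALL_PREFIX=$HOME/opt/abacus300 \
      -DCEREAL_INCLUDE_DIR=$HOME/opt/cereal/include \
      -DELPA_DIR=$HOME/opt/elpa-2021.05.002 ..
```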
It gets stuck at the config stage:

```
-- The CXX compiler identification is Intel 19.1.2.20200623
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /exe_kosmos/intel/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/mpiicpc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Cereal: /home/.../CODES/INTEL/INTEL_2020/ABACUS/cereal/include
CMake Error at /home.../LIBS/GCC_10.3/install/cmake3232/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find ELPA (missing: ELPA_LIBRARY)
Call Stack (most recent call first):
  /home/.../LIBS/GCC_10.3/install/cmake3232/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
  modules/FindELPA.cmake:24 (find_package_handle_standard_args)
  CMakeLists.txt:58 (find_package)
```
@lucasem1
We haven't tried the static library `libelpa.a`. Would you replace it in your original script with `libelpa.so`?
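If only the static archive was installed, ELPA may need to be rebuilt with shared objects enabled. A quick check plus a rebuild sketch follows; the `$ELPA_PREFIX` variable and the job count are placeholders, and you would append your original configure arguments:

```shell
# Check whether a shared ELPA library was installed at all.
ls "$ELPA_PREFIX"/lib/libelpa.so*

# If not, reconfigure with shared libraries on (standard libtool
# flags) and reinstall. Add your original configure arguments too.
../configure --prefix="$ELPA_PREFIX" --enable-shared --disable-static
make -j8 && make install
```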
Currently we are working on a conda distribution. We will notify you as soon as installing via conda is ready.
Good, dynamic linking works. Thank you for the advice. I hope you will keep ELPA as the preferred library for diagonalization; I can confirm that using it with the Intel compilers and MKL gives a huge speed-up.
Just another question: is it possible to link ABACUS against the ELPA library compiled with OpenMP support (`libelpa_openmp.so`)?
@lucasem1 Feel free to reach us if you encounter further problems.
> I can confirm that when using it with Intel compilers and mkl one can get a huge speed-up.
How large is the speed-up? I have been told that the latest version of MKL actually uses ELPA's algorithm, yielding comparable speed, but I have not tried it myself.
> Is it possible to link abacus to the ELPA library compiled with the OPENMP support?
ELPA does not ship CMake support, so we implement the detection ourselves. Please change the library name at this line:
https://github.com/deepmodeling/abacus-develop/blob/17796957b8676f0dddc6c25a6fd0747c4e16c70e/modules/FindELPA.cmake#L16
to `elpa_openmp`.
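For instance, assuming the find module looks the library up by the plain name `elpa` (a hypothetical guess at that file's contents — check the linked line first), the change could be scripted as:

```shell
# Hypothetical: swap the serial library name for the OpenMP one in the
# custom find module, then re-run cmake from a clean build directory.
# Verify the actual spelling in modules/FindELPA.cmake before running.
sed -i 's/\belpa\b/elpa_openmp/' modules/FindELPA.cmake
```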
This is what I get from my tests on the P105_si512_lcao system on 64 Xeon procs:

```
SCALAPACK_GVX
ITER  ETOT(eV)        EDIFF(eV)       DRHO       TIME(s)
GV1   -5.484310e+04    0.000000e+00   1.588e-01  8.266e+01
GV2   -5.482309e+04    2.000490e+01   1.054e-01  6.376e+01
GV3   -5.482686e+04   -3.768838e+00   8.453e-03  5.912e+01

GENELPA
ITER  ETOT(eV)        EDIFF(eV)       DRHO       TIME(s)
GE1   -5.484310e+04    0.000000e+00   1.588e-01  4.584e+01
GE2   -5.482309e+04    2.000490e+01   1.054e-01  4.553e+01
GE3   -5.482686e+04   -3.768838e+00   8.453e-03  4.569e+01
```
The speed-up is strongly reduced on 144 procs. Unfortunately, ELPA shows no performance improvement with OMP_NUM_THREADS=2 (is it possible to fix this?), in contrast with SCALAPACK_GVX, whose performance improves by about 10%.
Did you plan to link the code to the ELSI library too? This way one can exploit other algorithms like PEXSI, SLEPC-Sips, BSEPACK, EIGENEXA and so on.
On 144 procs I get this warning: `Grid_Technique::init_atoms_on_grid warning : No atom on this sub-FFT-mesh`. Does this unbalanced loading affect the performance? If yes, how can I fix it?
@lucasem1 Thank you for your sharing!
> Unfortunately, ELPA does not show performance improvement when using OMP_NUM_THREADS=2 (is it possible to fix this?)
Is hyperthreading enabled on your platform? It may not further improve performance.
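One thing worth checking is thread placement: if the two OpenMP threads land on the same physical core (hyperthreads), little speed-up is expected. With Intel MPI, pinning can be sketched as follows; the rank count is illustrative and depends on your node:

```shell
# Sketch: size each MPI rank's pinning domain to its OpenMP threads,
# so the two threads of a rank get separate cores.
# I_MPI_PIN_DOMAIN=omp is an Intel MPI setting; adjust -np to your nodes.
export OMP_NUM_THREADS=2
export I_MPI_PIN_DOMAIN=omp
mpirun -np 32 ./abacus
```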
> Did you plan to link the code to the ELSI library too?
Yes, but maybe not so soon. Currently we are evaluating the workload and our schedule.
> Does this unbalanced loading affect the performance?
@wenfei-li Could you take a look at this?
Yes, hyperthreading is on (I got 10% speedup with SCALAPACK_GVX)
> @wenfei-li Could you take a look at this?

The link points to a web page containing many things. Could you please be a bit more precise about where I have to look?
The unbalanced workload definitely affects the performance.
In grid integration, we parallelize the grid points along z axis, so if your system is not uniformly distributed along z axis, the workload will be unbalanced.
There is no easy fix for this problem, unless you want to rewrite the parallelization scheme for grid integration.
Details

Dear Developers, I am trying to install ABACUS 3.0.0 using the Intel MPI compilers. I tested both the 2018 and 2020 versions, with GCC 10.3 support. The installation fails at the linking step: what am I doing wrong? Below is part of what I get on the screen at the end of the compilation.
```
[ 99%] Linking CXX executable abacus
/LIBS/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a(libelpa_public_la-elpa_api.o): In function `elpa_api_mp_elpa_c_string_':
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x2d): undefined reference to `c_f_pointer_set_scalar'
/LIBS/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a(libelpa_public_la-elpa_api.o): In function `elpa_api_mp_elpa_int_value_to_string_':
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0xae): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x106): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x145): undefined reference to `iso_c_binding_mp_c_associatedptr'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x1bf): undefined reference to `for_write_seq_fmt'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x1de): undefined reference to `for_write_seq_fmt_xmit'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x28b): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x2ad): undefined reference to `for_write_seq_fmt_xmit'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x2c9): undefined reference to `c_f_pointer_set_scalar'
/LIBS/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a(libelpa_public_la-elpa_api.o): In function `elpa_api_mp_elpa_int_string_to_value_':
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x353): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x397): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x4b6): undefined reference to `for_concat'
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x4f4): undefined reference to `for_write_seq_fmt'
/LIBS/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a(libelpa_public_la-elpa_api.o): In function `elpa_api_mp_elpa_option_cardinality_':
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x560): undefined reference to `for_concat'
/LIBS/INTEL_2020/install/elpa-2021.05.002/lib/libelpa.a(libelpa_public_la-elpa_api.o): In function `elpa_api_mp_elpa_option_enumerate_':
manually_preprocessed_.._src_elpa_api.F90-src_.libs_libelpa_public_la-elpa_api.o.F90:(.text+0x5d5): undefined reference to `for_concat'
...
/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `__kmpc_atomic_float4_add'
/kosmos/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `__kmpc_atomic_float8_max'
/kosmos/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `__kmpc_atomic_fixed4_rd'
/kosmos/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64/libmkl_intel_thread.so: undefined reference to `__kmpc_reduce'
collect2: error: ld returned 1 exit status
make[2]: *** [abacus] Error 1
make[1]: *** [CMakeFiles/abacus.dir/all] Error 2
make: *** [all] Error 2
```