Closed MES-physics closed 1 year ago
Sorry, we need to get some instructions into the documentation.
You have to compile it with MPI (probably just choose the right QUIP_ARCH) and enable ScaLAPACK in make config. You still have to provide the sparse points in extra files, which can be done running a quick serial run with sparsify_only_no_fit=T first, using the $QUIP_ROOT/bin/gap_prepare_sparsex_input.py to create the *.input files, and adjusting the gap strings in the config to point to the created files. Then it's just a matter of using srun or mpirun -n.
You probably don't have to touch the mpi_blocksize* options anymore, except if you want to experiment. (Note then that the column blocksize has a huge impact on the size of the working array for ScaLAPACK).
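The steps described above can be sketched as a small dry-run script. This is only an illustration under stated assumptions: the file names train.xyz and gp.xml, the argument handed to gap_prepare_sparsex_input.py, and the helper names run and mpi_gap_fit_plan are hypothetical and not taken from this thread. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Print a command in dry-run mode (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: mpi_gap_fit_plan NRANKS GAP_SERIAL GAP_MPI [other gap_fit options...]
#   GAP_SERIAL  gap string used for the quick serial sparsification run
#   GAP_MPI     the same string, edited to point at the created *.input files
mpi_gap_fit_plan() {
    nranks=$1; gap_serial=$2; gap_mpi=$3
    shift 3

    # 1. Quick serial run that only selects the sparse points.
    run gap_fit at_file=train.xyz gap="$gap_serial" gp_file=gp.xml "$@" sparsify_only_no_fit=T

    # 2. Create the *.input files (assumption: the script takes the XML file as its argument).
    run "${QUIP_ROOT:-.}/bin/gap_prepare_sparsex_input.py" gp.xml

    # 3. The actual fit under MPI; on a Slurm cluster srun would replace mpirun -n.
    run mpirun -n "$nranks" gap_fit at_file=train.xyz gap="$gap_mpi" gp_file=gp.xml "$@"
}
```

Calling mpi_gap_fit_plan 8 "$GAP_SERIAL" "$GAP_MPI" followed by your usual gap_fit options prints the three commands in order, so they can be checked before anything is run.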
Oh Oh.. Now I get errors after "make", after enabling scaLAPACK in the config questions. What do you think?
Making GAP programs
rm -f /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmpi+openmp/Makefile
cp /home/m/QUIPMPI/QUIP/src/GAP/Makefile /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmpi+openmp/Makefile
make -C /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmpi+openmp QUIP_ROOT=/home/m/QUIPMPI/QUIP VPATH=/home/m/QUIPMPI/QUIP/src/GAP -I/home/m/QUIPMPI/QUIP -I/home/m/QUIPMPI/QUIP/arch Programs
make[1]: Entering directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmpi+openmp'
mpif90 -x f95-cpp-input -ffree-line-length-none -ffree-form -fno-second-underscore -fPIC -fno-realloc-lhs -fopenmp -I/home/m/QUIPMPI/QUIP/src/libAtoms -I/home/m/QUIPMPI/QUIP/src/fox/objs.linux_x86_64_gfortran_openmpi+openmp/finclude -O3 -DGETARG_F2003 -DGETENV_F2003 -DGFORTRAN -DFORTRAN_UNDERSCORE -D_MPI -D'GIT_VERSION="https://github.com/libAtoms/QUIP.git,v0.9.10-1-g2643d91d7-dirty"' -D'GAP_VERSION=1663247896' -D'QUIP_ARCH="linux_x86_64_gfortran_openmpi+openmp"' -D'SIZEOF_FORTRAN_T=2' -DHAVE_GAP -DHAVE_TB -DHAVE_PRECON -DHAVE_QR -DSCALAPACK -DHAVE_CP2K -DDESCRIPTORS_NONCOMMERCIAL -c /home/m/QUIPMPI/QUIP/src/GAP/gap_fit.f95 -o gap_fit.o
mpif90 -o gap_fit gap_fit.o libgapfit.a -L. -lquiputils -lquip_core -lgap -latoms -fopenmp -O3 -L/home/m/QUIPMPI/QUIP/src/fox/objs.linux_x86_64_gfortran_openmpi+openmp/lib -lFoX_sax -lFoX_wxml -lFoX_utils -lFoX_common -lFoX_fsys -llapack -lblas
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_get_lwork_pdormqr_i32o64':
ScaLAPACK.f95:(.text+0x522): undefined reference to `indxg2p_'
ScaLAPACK.f95:(.text+0x549): undefined reference to `indxg2p_'
ScaLAPACK.f95:(.text+0x57a): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x5b1): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x64e): undefined reference to `indxg2p_'
ScaLAPACK.f95:(.text+0x678): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x68b): undefined reference to `ilcm_'
ScaLAPACK.f95:(.text+0x6bc): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x6df): undefined reference to `numroc_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_get_lwork_pdgeqrf_i32o64':
ScaLAPACK.f95:(.text+0x80d): undefined reference to `indxg2p_'
ScaLAPACK.f95:(.text+0x82f): undefined reference to `indxg2p_'
ScaLAPACK.f95:(.text+0x866): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x88e): undefined reference to `numroc_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_to_array2d':
ScaLAPACK.f95:(.text+0x9ff): undefined reference to `descinit_'
ScaLAPACK.f95:(.text+0xb7e): undefined reference to `pdgeadd_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_to_array1d':
ScaLAPACK.f95:(.text+0xd3d): undefined reference to `descinit_'
ScaLAPACK.f95:(.text+0xe99): undefined reference to `pdgeadd_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_pdtrtrs_wrapper':
ScaLAPACK.f95:(.text+0x1182): undefined reference to `pdtrtrs_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_pdormqr_wrapper':
ScaLAPACK.f95:(.text+0x1512): undefined reference to `pdormqr_'
ScaLAPACK.f95:(.text+0x1754): undefined reference to `pdormqr_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_pdgeqrf_wrapper':
ScaLAPACK.f95:(.text+0x1964): undefined reference to `pdgeqrf_'
ScaLAPACK.f95:(.text+0x1a7d): undefined reference to `pdgeqrf_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_matrix_product_sub_zzz':
ScaLAPACK.f95:(.text+0x2397): undefined reference to `pzgemm_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_matrix_product_sub_ddd':
ScaLAPACK.f95:(.text+0x29c4): undefined reference to `pdgemm_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_diagonalise_gen_c':
ScaLAPACK.f95:(.text+0x3379): undefined reference to `pzhegvx_'
ScaLAPACK.f95:(.text+0x3735): undefined reference to `pzhegvx_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_diagonalise_gen_r':
ScaLAPACK.f95:(.text+0x4ad6): undefined reference to `pdsygvx_'
ScaLAPACK.f95:(.text+0x4e08): undefined reference to `pdsygvx_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_diagonalise_c':
ScaLAPACK.f95:(.text+0x5ecc): undefined reference to `pzheevx_'
ScaLAPACK.f95:(.text+0x6271): undefined reference to `pzheevx_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_diagonalise_r':
ScaLAPACK.f95:(.text+0x724f): undefined reference to `pdsyevx_'
ScaLAPACK.f95:(.text+0x7576): undefined reference to `pdsyevx_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_inverse_c':
ScaLAPACK.f95:(.text+0x81a1): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x81c4): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x820f): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x8259): undefined reference to `pzgetrf_'
ScaLAPACK.f95:(.text+0x8343): undefined reference to `pzgetri_'
ScaLAPACK.f95:(.text+0x845c): undefined reference to `pzgetri_'
ScaLAPACK.f95:(.text+0x85b4): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x860f): undefined reference to `numroc_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_inverse_r':
ScaLAPACK.f95:(.text+0x8c2f): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x8c52): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x8c9d): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x8ce7): undefined reference to `pdgetrf_'
ScaLAPACK.f95:(.text+0x8dcf): undefined reference to `pdgetri_'
ScaLAPACK.f95:(.text+0x8ee8): undefined reference to `pdgetri_'
ScaLAPACK.f95:(.text+0x9044): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x909f): undefined reference to `numroc_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_init_matrix_desc':
ScaLAPACK.f95:(.text+0x9b53): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x9b76): undefined reference to `numroc_'
ScaLAPACK.f95:(.text+0x9bc1): undefined reference to `descinit_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_matrix_scalapack_info_coords_local_to_global':
ScaLAPACK.f95:(.text+0x9c2c): undefined reference to `indxl2g_'
ScaLAPACK.f95:(.text+0x9c4b): undefined reference to `indxl2g_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_matrix_scalapack_info_coords_global_to_local':
ScaLAPACK.f95:(.text+0xb97a): undefined reference to `infog2l_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_finalise':
ScaLAPACK.f95:(.text+0xc9cd): undefined reference to `blacs_gridexit_'
./libatoms.a(ScaLAPACK.o): In function `__scalapack_module_MOD_scalapack_initialise':
ScaLAPACK.f95:(.text+0xcbf5): undefined reference to `blacs_gridinit_'
ScaLAPACK.f95:(.text+0xcc19): undefined reference to `blacs_gridinfo_'
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:96: gap_fit] Error 1
make[1]: Leaving directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmpi+openmp'
make: *** [Makefile:197: gap_programs] Error 2
You need to actually link to the scalapack libraries. You can add the relevant flags to MATH_LINKOPTS or EXTRA_LINKOPTS in build/${QUIP_ARCH}/Makefile.inc.
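To see at a glance which routines the linker could not resolve, a build log like the one above can be reduced to its unique symbol names. This is a minimal sketch, assuming the log uses the `symbol' quoting that ld prints in this thread; the function name missing_symbols and the file name build.log are hypothetical.

```shell
# Read a build log on stdin and print every unresolved symbol once, sorted.
missing_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}

# Usage (assumes the make output was saved first, e.g. make 2>&1 | tee build.log):
#   missing_symbols < build.log
```

If the list consists only of ScaLAPACK and BLACS routines such as numroc_ or blacs_gridinit_, the code itself compiled and only the libraries are missing from the link line.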
Can you please tell me what the relevant flags are? I see -DSCALAPACK above. Are there others? Thanks, I'm just beginning.
ScaLAPACK is its own library. You need a copy (compatible with your MPI version), and you'll typically end up with something like -lscalapack -lblacs, but the precise names of the libraries will depend on where you get ScaLAPACK. I mostly use MKL, which includes ScaLAPACK in addition to LAPACK and BLAS. Your link line looks like some Linux-provided LAPACK and BLAS (perhaps OpenBLAS, which lots of distributions use), so you need some version of ScaLAPACK compiled and installed.
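As a rough sketch of adding such flags, the helper below appends them to the MATH_LINKOPTS line of a Makefile.inc and keeps a backup. It assumes the file contains a single line of the form MATH_LINKOPTS=...; the function name add_math_linkopts is hypothetical, and -lscalapack in the usage line is only a placeholder for whatever your installation actually provides.

```shell
# Hypothetical helper: add_math_linkopts FILE FLAG...
# Appends the given flags to the existing MATH_LINKOPTS assignment in FILE.
add_math_linkopts() {
    makefile_inc=$1
    shift
    flags="$*"
    # Keep the original as FILE.bak and rewrite FILE from it.
    cp "$makefile_inc" "$makefile_inc.bak"
    sed "s|^\(MATH_LINKOPTS *=.*\)|\1 $flags|" "$makefile_inc.bak" > "$makefile_inc"
}

# Usage (run from the QUIP source tree, after make config has created the file):
#   add_math_linkopts "build/${QUIP_ARCH}/Makefile.inc" -lscalapack
```

Editing the line by hand in a text editor does the same thing; the point is only that the flags end up on the final link line.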
Is this on your own computer or on a cluster (which one)? What operating system do you use?
yes, the single atom energies look fine.
This is on a cluster at my school. ScaLapack is a module I can load, with a specific mpi version to go with it. So now I’m asking them how to link it to my Quip installation. I will try soon.
OK, now the admin told me to do this:
"If you use the Intel OneAPI module, it will add all these to your path. After logging in, run:
module load OneAPI
source $SETVARS
This will change your compiler to intel and add tools such as mkl and impi to your environment."
Then I put the flags -lscalapack -lblacs into Makefile.inc. Then I tried to run make config, and it says
Makefile:34: *** "You need to define the architecture using the QUIP_ARCH variable. Check out the arch/ subdirectory.". Stop.
I assume this is the architecture I want:
Makefile.linux_x86_64_gfortran_openmpi+openmp
So, how to define it, and where do I put it?
Thanks for any clues.
Looks like that module will change gfortran to intel fortran and (probably) openmpi to intelmpi. You'll need to see if there's some QUIP_ARCH with a corresponding QUIP/arch/Makefile.{QUIP_ARCH} for that combination, or you can make your own new Makefile.{QUIP_ARCH} if there is not. Presumably you'll want to use something like QUIP_ARCH=linux_x86_64_ifort_icc_intelmpi+openmp. There looks like there might be one with avon in the name that's at least similar to that.
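To find out which architectures are available and then define QUIP_ARCH, something like the following can be used. It is a small sketch: the function name list_quip_archs is hypothetical, and it only assumes that the arch files are named arch/Makefile.<QUIP_ARCH> as described above.

```shell
# List the QUIP_ARCH names that have an arch/Makefile.<name> under QUIP_ROOT,
# optionally keeping only those that contain a substring such as "intelmpi".
list_quip_archs() {
    quip_root=$1
    pattern=${2:-}
    for f in "$quip_root"/arch/Makefile.*; do
        [ -e "$f" ] || continue          # skip the literal pattern if nothing matched
        name=${f##*/Makefile.}           # strip the directory and the "Makefile." prefix
        case $name in
            *"$pattern"*) echo "$name" ;;
        esac
    done
}

# Usage (run from the QUIP source tree):
#   list_quip_archs . intelmpi
#   export QUIP_ARCH=<one of the names printed>
#   make config
#   make
```

QUIP_ARCH is an ordinary environment variable, so it has to be exported in the same shell (or job script) in which make config and make are run.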
Yes ! When I used the one with "avon" and set the environment as they said above, the make config and make installation worked. Thanks very much. My mistake was trying to load a scalapack module separately, which wasn't a version compatible with the OneAPI intel compiler. Scalapack seems to be included with this one.
BUT ... right after that all seemed to work, I tried to "make install quippy" and got this problem, after it seemed to be doing things for awhile. How to get the correct file? Was something missing? It looks like quippy is not compatible with something? Thanks for advice.
mv _quippy.cpython-39-x86_64-linux-gnu.so quippy/
mv: cannot stat '_quippy.cpython-39-x86_64-linux-gnu.so': No such file or directory
make[1]: *** [Makefile:119: quippy/_quippy.cpython-39-x86_64-linux-gnu.so] Error 1
make[1]: Leaving directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_ifort_icc_avon_intelmpi'
make: *** [Makefile:230: quippy] Error 2
Before that, a lot of these showed up:
Generating possibly empty wrappers"
Maybe empty "_quippy-f2pywrappers.f"
Constructing wrapper function "f90wrap_dictionary_add_array_i_a"...
f90wrap_dictionary_add_array_i_a(this,key,value,len_bn,[overwrite])
And these types of warnings:
WARNING:f90wrap.transform:removing optional argument mpi_obj due to unsupported derived type type(mpi_context)
WARNING:f90wrap.transform:removing optional argument mpi_obj due to unsupported derived type type(mpi_context)
WARNING:f90wrap.transform:removing callback routine potential_simple_set_callback
WARNING:f90wrap.transform:removing tb_type.tbsys as type type(tbsystem) unsupported
WARNING:f90wrap.transform:removing tb_type.evals as type type(tbvector) unsupported
WARNING:f90wrap.transform:removing tb_type.e_fillings as type type(tbvector) unsupported
WARNING:f90wrap.transform:removing tb_type.f_fillings as type type(tbvector) unsupported
WARNING:f90wrap.transform:removing tb_type.eval_f_fillings as type type(tbvector) unsupported
WARNING:f90wrap.transform:removing tb_type.evecs as type type(tbmatrix) unsupported
WARNING:f90wrap.transform:removing tb_type.dm as type type(tbmatrix) unsupported
WARNING:f90wrap.transform:removing tb_type.hdm as type type(tbmatrix) unsupported
WARNING:f90wrap.transform:removing tb_type.scaled_evecs as type type(tbmatrix) unsupported
WARNING:f90wrap.transform:removing tb_type.mpi as type type(mpi_context) unsupported
WARNING:f90wrap.transform:removing tb_type.gf as type type(greensfunctions) unsupported
WARNING:f90wrap.transform:removing optional argument kpoints_obj due to unsupported derived type type(kpoints)
WARNING:f90wrap.transform:removing optional argument mpi_obj due to unsupported derived type type(mpi_context)
More on the quippy build error, lines prior to last one showing error: Thanks for any advice.
running build
running config_cc
INFO: unifing config_cc, config, build_clib, build_ext, build commands --compiler options
running config_fc
INFO: unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options
running build_src
INFO: build_src
INFO: building extension "_quippy" sources
INFO: f2py options: []
INFO: adding './src.linux-x86_64-3.8/./src.linux-x86_64-3.8/fortranobject.c' to sources.
INFO: adding './src.linux-x86_64-3.8/./src.linux-x86_64-3.8' to include_dirs.
INFO: adding './src.linux-x86_64-3.8/_quippy-f2pywrappers.f' to sources.
INFO: build_src: building npy-pkg config files
running build_ext
INFO: customize UnixCCompiler
INFO: customize UnixCCompiler using build_ext
INFO: customize IntelEM64TFCompiler
INFO: Found executable /opt/intel/oneapi/mpi/2021.5.0/bin/mpiifort
INFO: Found executable /opt/intel/oneapi/compiler/2022.0.1/linux/bin/intel64/ifort
INFO: customize IntelEM64TFCompiler using build_ext
mv _quippy.cpython-39-x86_64-linux-gnu.so quippy/
mv: cannot stat '_quippy.cpython-39-x86_64-linux-gnu.so': No such file or directory
make[1]: *** [Makefile:119: quippy/_quippy.cpython-39-x86_64-linux-gnu.so] Error 1
make[1]: Leaving directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_ifort_icc_avon_intelmpi'
make: *** [Makefile:230: quippy] Error 2
You should continue to build quippy without MPI using your previous QUIP_ARCH setting with OpenMP parallelisation only. MPI quippy builds are neither needed to run gap_fit with MPI nor supported (although it should be possible if there’s a good reason).
Am I supposed to do the "make config" over again? Then it can't find the flags -llapack -lblas. By the way, I built a different python environment for the MPI version, trying to put all this in a new directory, so how should these be set up? Thanks for helping!
Making Programs
rm -f /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmp/Makefile
cp /home/m/QUIPMPI/QUIP/src/Programs/Makefile /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmp/Makefile
make -C /home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmp QUIP_ROOT=/home/m/QUIPMPI/QUIP VPATH=/home/m/QUIPMPI/QUIP/src/Programs -I/home/m/QUIPMPI/QUIP -I/home/m/QUIPMPI/QUIP/arch
make[1]: Entering directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmp'
gfortran -o quip quip.o vacancy_map_mod.o -L. -lquiputils -lquip_core -lgap -latoms -fopenmp -O3 -L/home/m/QUIPMPI/QUIP/src/fox/objs.linux_x86_64_gfortran_openmp/lib -lFoX_sax -lFoX_wxml -lFoX_utils -lFoX_common -lFoX_fsys -llapack -lblas
/usr/bin/ld: cannot find -llapack
/usr/bin/ld: cannot find -lblas
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:79: quip] Error 1
make[1]: Leaving directory '/home/m/QUIPMPI/QUIP/build/linux_x86_64_gfortran_openmp'
make: *** [Makefile:155: Programs] Error 2
OK, I tried that. Am I supposed to do "make config" over again? If I try "make install quippy" without it, it tells me to. I have the MPI version in a new python environment, in a new directory. So how does quippy fit into it? When I tried "make install quippy" after the make config, it says it cannot find -llapack and -lblas, even though I loaded the gnu10 and openblas modules on the cluster as I did before for the non-MPI version. How should this quippy be available for the MPI version of the gap_fit program?
I don't know if I'm describing all this correctly.
Thanks for helping!
You don’t need to build or install quippy at all for the MPI version of gap_fit. It’s Fortran only.
Oh, thanks, I'll try to move on now!
Hi, I've been reading the conversations from earlier in the year about running gap_fit with MPI capability. Is it done by simply installing the new version available now and adding "mpi_blocksize=?" to the gap_fit command in my slurm submission (requesting my nodes as usual and loading the MPI module)? Also, what blocksizes are recommended relative to the number of atoms used for training? Is it still necessary to do a serial run for the sparse points first? Thanks!