cp2k / cp2k

Quantum chemistry and solid state physics software package
https://www.cp2k.org
GNU General Public License v2.0
843 stars 385 forks source link

cp2k process not running on GPU #3458

Open HemanthHaridas opened 4 months ago

HemanthHaridas commented 4 months ago

I am trying to run CP2K on a linux cluster with GPU support. I have successfully complied the code with CUDA support. But the utilization of GPU is zero, even though the program is running , meaning that the code is running on cpu cores.

This is the script that I am using to run cp2k

#!/bin/bash
#SBATCH --job-name=LiCl                 ### Job Name
#SBATCH --output=cp2k.out               ### File in which to store job output
#SBATCH --error=cp2k.err                ### File in which to store job error messages
#SBATCH --time=3-00:00:00               ### Wall clock time limit in Days-HH:MM:SS              
#SBATCH --ntasks=64
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=1

module load gcc cuda/11.8.0 openmpi/4.1.6-gpu intel-oneapi-mkl/2022.0.2

source /cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/setup

export OMP_NUM_THREADS=$SLURM_NTASKS

mpirun -n 1 /cp2k_plumed_gpu/cp2k-2024.1/exe/local_cuda/cp2k.psmp -i colvars.inp -o colvars.out

The local_cuda.psmp file in the arch folder

CC          = /gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/bin/mpicc
CXX         = /gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/bin/mpic++
AR          = ar -r
FC          = /gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/bin/mpifort
LD          = /gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/bin/mpifort
#
DFLAGS      = -D__OFFLOAD_CUDA -D__DBCSR_ACC  -D__LIBXSMM  -D__parallel -D__MKL -D__FFTW3  -D__SCALAPACK -D__LIBINT -D__LIBXC -D__LIBGRPP -D__GSL -D__PLUMED2 -D__SPGLIB  -D__OFFLOAD_GEMM  -D__SPLA
#
WFLAGS      = -Werror=aliasing -Werror=ampersand -Werror=c-binding-type -Werror=intrinsic-shadow -Werror=intrinsics-std -Werror=line-truncation -Werror=tabs -Werror=target-lifetime -Werror=underflow -Werror=unused-but-set-variable -Werror=unused-variable -Werror=unused-dummy-argument -Werror=unused-parameter -Werror=unused-label -Werror=conversion -Werror=zerotrip -Wno-maybe-uninitialized -Wuninitialized -Wuse-without-only
#
FCDEBFLAGS  = -fbacktrace -ffree-form -fimplicit-none -std=f2008
CFLAGS      = -fno-omit-frame-pointer -fopenmp -g -mtune=native  -O3 -funroll-loops $(PROFOPT)  -I/gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/include -pthread  -m64 -I/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/include -I/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/include/fftw -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libint-v2.6.0-cp2k-lmax-5/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxc-6.2.2/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libgrpp-main-20231215/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxsmm-1.17/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/gsl-2.7/include' -I/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/spglib-1.16.2/include -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/SpLA-1.5.5/include/spla' -std=c11 -Wall -Wextra -Werror -Wno-vla-parameter -Wno-deprecated-declarations $(DFLAGS) -I/gcc-8.5.0/cuda-11.8.0-3wlxktsbgw2ui4wvdnsy7w7xyxlkkwju/include
FCFLAGS     = -fno-omit-frame-pointer -fopenmp -g -mtune=native  -O3 -funroll-loops $(PROFOPT)  -I/gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/include -pthread  -m64 -I/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/include -I/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/include/fftw -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libint-v2.6.0-cp2k-lmax-5/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxc-6.2.2/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libgrpp-main-20231215/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxsmm-1.17/include' -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/gsl-2.7/include' -I/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/spglib-1.16.2/include -I'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/SpLA-1.5.5/include/spla' $(FCDEBFLAGS) $(WFLAGS) $(DFLAGS)
CXXFLAGS    = -O2 -fPIC -fno-omit-frame-pointer -fopenmp -g -march=native -mtune=native --std=c++14 $(DFLAGS) -Wno-deprecated-declarations -I/gcc-8.5.0/cuda-11.8.0-3wlxktsbgw2ui4wvdnsy7w7xyxlkkwju/include
#
LDFLAGS     =  $(FCFLAGS) -Wl,--enable-new-dtags -pthread -L/gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/lib -L/gcc-8.5.0/ucx-1.14.0-cupo7hrn2exqqwfyatdigeuiiqijaulw/lib -L/gcc-8.5.0/zlib-1.2.13-dcpzngybj4fisn6ojapnels3yfwcxqgk/lib -Wl,-rpath -Wl,/gcc-8.5.0/openmpi-4.1.6-chhfokbbf3fbb2t4uo7ns4ukaripskzj/lib -Wl,-rpath -Wl,/gcc-8.5.0/ucx-1.14.0-cupo7hrn2exqqwfyatdigeuiiqijaulw/lib -Wl,-rpath -Wl,/gcc-8.5.0/zlib-1.2.13-dcpzngybj4fisn6ojapnels3yfwcxqgk/lib    -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libint-v2.6.0-cp2k-lmax-5/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxc-6.2.2/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxc-6.2.2/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libgrpp-main-20231215/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libgrpp-main-20231215/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxsmm-1.17/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/libxsmm-1.17/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/gsl-2.7/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/gsl-2.7/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/plumed-2.9.0/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/plumed-2.9.0/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/spglib-1.16.2/lib' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/spglib-1.16.2/lib' -L'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/SpLA-1.5.5/lib/cuda' -Wl,-rpath,'/cp2k_plumed_gpu/cp2k-2024.1/tools/toolchain/install/SpLA-1.5.5/lib/cuda' -L'/gcc-8.5.0/cuda-11.8.0-3wlxktsbgw2ui4wvdnsy7w7xyxlkkwju/targets/x86_64-linux/lib' -Wl,-rpath,'/gcc-8.5.0/cuda-11.8.0-3wlxktsbgw2ui4wvdnsy7w7xyxlkkwju/targets/x86_64-linux/lib' -L'/usr/lib64' -Wl,-rpath,'/usr/lib64'
LIBS        = -lspla -lsymspg -l:libplumed.a -ldl -lstdc++ -lz -ldl -lgsl -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2   -lmpi_cxx -lmpi  -L/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/lib/intel64 -Wl,-rpath,/gcc-8.5.0/intel-oneapi-mkl-2023.2.0-etwucm5d3s2qu7eiuaaxastbiukj2ori/mkl/2023.2.0/lib/intel64 -lmkl_scalapack_lp64 -Wl,--start-group -lmkl_gf_lp64 -lmkl_sequential -lmkl_core -lmkl_blacs_openmpi_lp64 -Wl,--end-group -lpthread -lm -ldl -lstdc++ -lcudart -lnvrtc -lcuda -lcufft -lcublas -lrt
#
GPUVER        = A100
OFFLOAD_CC    = nvcc
OFFLOAD_FLAGS = -g -arch sm_80 -O3 -allow-unsupported-compiler -Xcompiler='-fopenmp -Wall -Wextra -Werror' --std=c++11 $(DFLAGS)
OFFLOAD_TARGET = cuda
#
FYPPFLAGS   = -n --line-marker-format=gfortran5

Are there any additional flags that I need to use to run the code on GPUs?

fstein93 commented 3 months ago

Please excuse the late answer. Are you sure that 1) you can use mpirun directly or do you have to use something like srun on your system 2) you do not have to set additional environment variables to ask for GPUs?