gms-bbg / gamess-issues

GAMESS issue tracking

GPU code is not called in HF calculations with latest versions of GAMESS #26

Open ivr900 opened 4 years ago

ivr900 commented 4 years ago

Hi there!

I want to bring to the developers' attention that something is not quite right in the LIBCCHEM part of GAMESS after the 2018-09-30 R2 version: the performance of HF, CC and MP2 calculations using a GAMESS binary built with LIBCCHEM has dropped significantly. Moreover, when monitoring jobs via 'nvidia-smi', I noticed that the GPU is not actually used in HF at all even when it was requested. The jobs finish and give correct results, but the elapsed time is terrible for anything but trivially small systems, as if only CPUs were used in the calculation. Only the RI-MP2 part of the GPU code seems to be working as expected.
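
For reference, a minimal way to watch the card during such a run (a sketch using standard nvidia-smi query options), which for the affected HF, CC and MP2 runs shows the GPU sitting idle:

# poll GPU utilization and memory every 5 s while the GAMESS job runs
nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv -l 5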

I was building GAMESS-LIBCCHEM on my Linux workstation (linux64, Ubuntu 18.04 LTS) as follows:

module load intel-ps/2020.1.217
module load intel-mpi/2020.1.217
module load cmake/3.16.2
module load hdf5/1.10.6-intel
module load boost/1.72.0-intel
module load ga/5.7.1-icc-impi
module load cuda/10.2

CC=gcc
CXX=g++

# stage 1: build the rysq integral library with CUDA support
cd $GMS_ROOT/libcchem/rysq
mkdir build; cd $_
cmake -DBOOST_ROOT=$BOOST_ROOT -DBUILD_HF=1 -DBUILD_RI=1 -DHAVE_CUDA=ON -DGPU_BOARD=VOLTA .. --debug-output
make -j8 VERBOSE=1
cd ../..

# stage 2: build and install libcchem itself
mkdir build; cd $_
cmake -DBOOST_ROOT=$BOOST_ROOT -DEIGEN_ROOT=/apps/eigen/3.3.7/include/eigen3 -DHDF5_ROOT=$HDF5_ROOT -DMATHLIB=MKL -DGA_ROOT=$GA_ROOT -DMPI_ROOT=$I_MPI_ROOT/intel64 -DBUILD_ALL=1 -DGPU_BOARD=VOLTA .. --debug-output
make -j8 VERBOSE=1
make install
cd ../..

# stage 3: build GAMESS against the installed libcchem
make -j8
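
As a quick check that the GPU code was actually linked in, something like the following should list the CUDA libraries (a sketch; the gamess.00.x name assumes GMS_VERSION = 00 from the Makefile below):

# hypothetical check: the final executable should pull in libcchem's CUDA dependencies
ldd $GMS_PATH/gamess.00.x | grep -Ei 'cchem|cuda|cublas'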

My "Makefile" was

GMS_PATH = /apps/gamess/2020-06-R1/gamess.intel-impi
GMS_VERSION = 00
GMS_BUILD_PATH = $(GMS_PATH)
GMS_OPENMP = false
GMS_MSUCC = false
GMS_LIBCCHEM = true
GMS_LIBCCHEM_GPU_SUPPORT = true
GMS_LIBXC = false
GMS_CC = icc
GMS_FC = ifort
GMS_DDI_COMM = mpi
include $(GMS_PATH)/Makefile.in

My "install.info" was

#!/bin/csh

#   Compilation configuration for GAMESS
#   Generated on jake
#   Generated at Mon  6 Jul 10:44:52 AEST 2020

#                 GAMESS Paths                     #
setenv GMS_PATH              /apps/gamess/2020-06-R1/gamess.intel-impi
setenv GMS_BUILD_DIR       $GMS_PATH

#                  Machine Type                    #
setenv GMS_TARGET            linux64
setenv GMS_HPC_SYSTEM_TARGET generic

#              FORTRAN Compiler Setup              #
setenv GMS_FORTRAN           ifort
setenv GMS_IFORT_VERNO       19
#         Mathematical Library Setup               #
setenv GMS_MATHLIB           mkl
setenv GMS_MATHLIB_PATH    $MKLROOT/lib/intel64
setenv GMS_MKL_VERNO         12-so
#      Parallel Message Passing Model Setup        #
setenv GMS_MPI_LIB            impi
setenv GMS_MPI_PATH           $I_MPI_ROOT/intel64

#     Michigan State University Coupled Cluster    #
#setenv GMS_MSUCC             false

# Please match any manual changes to the           #
# GMS_MSUCC flag in ./Makefile         
# before running make                              #

#         LIBCCHEM CPU/GPU Code Interface          #
#setenv GMS_LIBCCHEM          true
#setenv GMS_LIBCCHEM_GPU_SUPPORT      true
setenv GMS_CCHEM_HF        true
setenv GMS_CCHEM_MP2        true
setenv GMS_CCHEM_RI        true
setenv GMS_CCHEM_CC        true
setenv GMS_CUDA_BOARD      volta
setenv GMS_CUDA_PATH       $CUDA_ROOT
setenv GMS_HDF5_PATH       $HDF5_ROOT
setenv GMS_LIBCCHEM_LIBINT     false
#setenv GMS_LIBINT_PATH     $LIBINT_ROOT
setenv GMS_LIBCCHEM_LIBS   ""
setenv GMS_EIGEN_PATH     /apps/eigen/3.3.7/include/eigen3
setenv GMS_BOOST_PATH      $BOOST_ROOT
setenv GMS_GA_PATH         $GA_ROOT

#      Intel Xeon Phi Build: none/knc/knl          #
setenv GMS_PHI             none

#         Shared Memory Type: sysv/posix           #
setenv GMS_SHMTYPE         posix

#      GAMESS OpenMP support: true/false           #
#setenv GMS_OPENMP          false

# Please match any manual changes to the           #
# GMS_OPENMP flag in /apps/gamess/2020-06-R1/gamess.intel-impi/Makefile       #
# before running make                              #

#      GAMESS LibXC library: true/false            #
#setenv GMS_LIBXC           true

# Please match any manual changes to the           #
# GMS_LIBXC flag in /apps/gamess/2020-06-R1/gamess.intel-impi/Makefile        #
# before running make                              #

####################################################
# Added any additional environmental variables or  #
# module loads below if needed.                    #
####################################################
# Capture floating-point exceptions                #
setenv GMS_FPE_FLAGS       ""
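
A quick way to list which LIBCCHEM features this configuration actually enables (a sketch; note that the GMS_LIBCCHEM switches themselves are commented out here and are set in the Makefile instead):

grep -E '^setenv GMS_(LIBCCHEM|CCHEM|CUDA)' install.info

This should report GMS_CCHEM_HF, GMS_CCHEM_MP2, GMS_CCHEM_RI and GMS_CCHEM_CC all true, with GMS_CUDA_BOARD set to volta.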

With version 2018-09-30-R3 I tried to use the same versions of compilers, MPI, GA, Boost and other third-party dependencies, except for CUDA: since -lcublas_device was deprecated in CUDA >= 10, I used cuda/9.2 there. The 2018-09-30-R3 version of GAMESS works as expected. But all 2019 and 2020 versions get something wrong in the LIBCCHEM code for HF, CC and MP2 calculations, as the GPU is not actually used where it ought to be.

Kind regards,
Ivan Rostov, NCI Australia, Canberra

JorgeG94 commented 4 years ago

Hello, can I get the specific commit you are using?

ivr900 commented 4 years ago

I use the public versions. Can I get access to the development git repository?

saromleang commented 4 years ago

@ivr900 I will send out an invite and an email.

@JorgeG94 latest GAMESS public release commit info:

# Commit ID
45b4b41d774952a53dc86ce851faae85e580e87e
# Version
v2020.1
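
In a git checkout this can be confirmed with (a sketch):

# hypothetical check inside a GAMESS source checkout
git log -1 --format='%H'   # should print 45b4b41d... for v2020.1
git describe --tags        # prints the nearest tag, if tags are present
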
ivr900 commented 4 years ago

Sarom, thank you for that. I shall try the development version, though from a first preliminary look after downloading it, there are no differences in the LIBCCHEM part between the development and latest public versions. I may be wrong though. I am compiling the development version now; then we'll see.

samcom12 commented 4 months ago

Hi @saromleang ,

I recently started with GAMESS and ran into the same issue with the latest GAMESS release.

I'm using the following versions of the dependencies:

spack load gcc@10.3.0
spack load globalarrays@5.8.2 ^mpich +cuda +slurm
spack load boost@1.85.0 +math+thread %gcc@10
spack load hdf5@1.14.3 +cxx+fortran +hl ^mpich +cuda +slurm %gcc@10
spack load  eigen@3.4.0 %gcc@10
spack load intel-oneapi-mkl@2024.0.0

I appreciate the help.

JorgeG94 commented 4 months ago

Hi @samcom12, the libcchem code has been effectively deprecated in favour of an upcoming replacement. At the moment you'd be better off using the CPU-only code or the OpenMP GPU offloading capabilities. The libcchem GPU code has seen little to no maintenance in many years and has fallen behind in quality and portability.
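
For what it's worth, a minimal way to verify that a gcc+nvptx toolchain can offload at all, before involving GAMESS (a hypothetical test, assuming gcc was built with the nvptx offload backend):

cat > omp_dev.c <<'EOF'
#include <omp.h>
#include <stdio.h>
int main(void) {
    /* non-host devices visible to the OpenMP runtime; 0 means no offload */
    printf("OpenMP devices: %d\n", omp_get_num_devices());
    return 0;
}
EOF
gcc -fopenmp -foffload=nvptx-none omp_dev.c -o omp_dev && ./omp_dev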

samcom12 commented 4 months ago

Thanks @JorgeG94 for the clarification.

Today I tried a build compiled with OpenMP offload, but I still don't see any GPUs being used during the simulation.

Can you help me choose a sample test case that uses OpenMP offload effectively? I'm using the SLURM script below.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40 ## Max. Cores = 48
#SBATCH --cpus-per-task=1
#SBATCH -p gpu ## gpu/standard
#SBATCH --time=07:00:00
#SBATCH --exclusive
#SBATCH --gres=gpu:2

ulimit -s unlimited
ulimit -c unlimited

source /home/apps/spack/share/spack/setup-env.sh
spack load cuda@11.4 %gcc@10
spack load intel-oneapi-mkl /b2a4omb
spack load mpich +cuda +slurm
spack load  gcc +nvptx

export CUDA_VISIBLE_DEVICES='0,1'
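
# hypothetical sanity check: confirm the allocation actually exposes both GPUs
nvidia-smi -L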

export LD_LIBRARY_PATH=/home/apps/spack/opt/spack/linux-centos7-cascadelake/gcc-10.3.0/mpich-4.2.1-d5fsc72unjuz3byuzzmkjqfm5qbmf4sv/lib:$LD_LIBRARY_PATH


time /scratch/samir/gamess_offload/rungms-dev /home/samir/gamess/tests/libcchem/paper/paper-cocaine.inp 00 $SLURM_NTASKS $SLURM_NTASKS_PER_NODE

The install.info is attached for your reference: install.info.txt