shankar1729 / jdftx

JDFTx: software for joint density functional theory
http://jdftx.org

Has anyone compiled JDFTx on Polaris with GPU successfully? #326

Closed. ColinBundschu closed this issue 4 months ago.

ColinBundschu commented 4 months ago

I have spent several days unsuccessfully trying to get JDFTx to work with the Cray environment and nvc++ compilers on ALCF's Polaris supercomputer. The problems include CMake not recognizing GSL 2.7 and my having to modify the CMakeLists.txt file directly (for example, changing the cmake directory from the hard-coded path to the one Cray uses). I have been able to build it, but all tests fail. I am at a loss as to how to proceed. Has anyone made this work?

shankar1729 commented 4 months ago

I'd suggest following the NERSC template: the modules shown on http://jdftx.org/Supercomputers.html should lead you to the shared build directories there. I'd suggest trying the GNU compilers through the cray wrappers first, before using nvc++. I'll say though that building on Cray systems has always been a pain.

ColinBundschu commented 4 months ago

I tried this, but the module does not seem to exist:

cbu@x3112c0s31b1n0:~> module use /global/cfs/cdirs/m4025/Software/Perlmutter/modules
cbu@x3112c0s31b1n0:~> module load jdftx/gpu
Lmod has detected the following error: The following module(s) are unknown: "jdftx/gpu"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "jdftx/gpu"

Also make sure that all modulefiles written in TCL start with the string #%Module
cbu@x3112c0s31b1n0:~>

Also, the GNU compilers on Polaris do not support GPU: https://docs.alcf.anl.gov/polaris/compiling-and-linking/gnu-compilers-polaris/

shankar1729 commented 4 months ago

I meant doing that on NERSC, if you also had access to that machine. In any case, if the Polaris GNU stack does not support GPU, that's irrelevant.

How are you trying the compilation with the nvidia compilers? Can you post the errors you are encountering?

ColinBundschu commented 4 months ago

OK, so with the environment variables CC=cc and CXX=CC, I run the following, which is the minimum set of changes required to get cmake to configure. This requires no modification of CMakeLists.txt, but it will not compile. To get it to compile, I have to start manually changing the names of compilers in CMakeLists.txt:

cbu@polaris-login-04:~> cd jdftx/build/
cbu@polaris-login-04:~/jdftx/build> rm -rf *; cmake -D EnableCUDA=yes -D CudaAwareMPI=yes -D PinnedHostMemory=yes\
>  -D GSL_PATH=/home/cbu/gsl\
>  -D FFTW3_PATH=/opt/cray/pe/fftw/3.3.10.6/x86_milan\
>  -D CMAKE_PREFIX_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64\
>  -D CBLAS_PATH=/home/cbu/local/lib\
>  -D CUDA_CODE=sm_80 \
>  -D CUDA_ARCH=compute_80 \
>  ../jdftx-git/jdftx
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

-- The C compiler identification is NVHPC 23.9.0
-- The CXX compiler identification is NVHPC 23.9.0
-- Cray Programming Environment 2.7.30 C
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/cray/pe/craype/2.7.30/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Cray Programming Environment 2.7.30 CXX
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/cray/pe/craype/2.7.30/bin/CC - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.35.3")
-- Git revision hash: 19235885
-- Looking for gsl_integration_glfixed_point
-- Looking for gsl_integration_glfixed_point - found
-- Found GSL: /home/cbu/gsl/lib/libgsl.so
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found FFTW3:  /opt/cray/pe/fftw/3.3.10.6/x86_milan/lib/libfftw3_threads.so /opt/cray/pe/fftw/3.3.10.6/x86_milan/lib/libfftw3.so
-- Looking for sgemm_
-- Looking for sgemm_ - not found
-- Looking for sgemm_
-- Looking for sgemm_ - found
-- Found BLAS: /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib/libblas.so
-- Looking for cheev_
-- Looking for cheev_ - not found
-- Looking for cheev_
-- Looking for cheev_ - found
-- Found LAPACK: /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib/liblapack.so;/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib/libblas.so;-fortranlibs
-- Found CBLAS: /home/cbu/local/lib/libcblas.so
-- Found MPI_C: /opt/cray/pe/craype/2.7.30/bin/cc (found version "3.1")
-- Found MPI_CXX: /opt/cray/pe/craype/2.7.30/bin/CC (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- Performing Test HAS_NO_UNUSED_RESULT
-- Performing Test HAS_NO_UNUSED_RESULT - Failed
-- Performing Test HAS_TEMPLATE_DEPTH
-- Performing Test HAS_TEMPLATE_DEPTH - Failed
CMake Warning (dev) at CMakeLists.txt:259 (find_package):
  Policy CMP0146 is not set: The FindCUDA module is removed.  Run "cmake
  --help-policy CMP0146" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found CUDA: /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2 (found version "12.2")
-- CUDA_LIBRARIES = /opt/nvidia/hpc_sdk/Linux_x86_64/23.9/cuda/12.2/lib64/libcudart_static.a;Threads::Threads;dl;/usr/lib64/librt.so;/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcublas.so;/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcufft.so;/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64/libcublasLt.so
-- CUDA_NVCC_FLAGS = -D_FORCE_INLINES;;-arch=compute_80;-code=sm_80;-DGPU_ENABLED;--compiler-options;-fpic
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- Configuring done (9.6s)
-- Generating done (0.8s)
-- Build files have been written to: /home/cbu/jdftx/build
cbu@polaris-login-04:~/jdftx/build> make -j8
[  0%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/fluid/gpukernels_generated_TranslationOperator.cu.o
[  0%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/core/gpukernels_generated_Coulomb.cu.o
[  0%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/core/gpukernels_generated_BlasExtra.cu.o
[  0%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/core/gpukernels_generated_Operators.cu.o
[  1%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/core/gpukernels_generated_matrixOperators.cu.o
[  1%] Building NVCC (Device) object CMakeFiles/gpukernels.dir/electronic/gpukernels_generated_ColumnBundleOperators.cu.o
[  2%] Building CXX object CMakeFiles/jdftxlib.dir/commands/command.cpp.o
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
CMake Error at gpukernels_generated_TranslationOperator.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/fluid/./gpukernels_generated_TranslationOperator.cu.o

CMake Error at gpukernels_generated_ColumnBundleOperators.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/electronic/./gpukernels_generated_ColumnBundleOperators.cu.o

make[2]: *** [CMakeFiles/gpukernels.dir/build.make:154: CMakeFiles/gpukernels.dir/fluid/gpukernels_generated_TranslationOperator.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [CMakeFiles/gpukernels.dir/build.make:105: CMakeFiles/gpukernels.dir/electronic/gpukernels_generated_ColumnBundleOperators.cu.o] Error 1
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
CMake Error at gpukernels_generated_BlasExtra.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/core/./gpukernels_generated_BlasExtra.cu.o

make[2]: *** [CMakeFiles/gpukernels.dir/build.make:77: CMakeFiles/gpukernels.dir/core/gpukernels_generated_BlasExtra.cu.o] Error 1
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
CMake Error at gpukernels_generated_Coulomb.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/core/./gpukernels_generated_Coulomb.cu.o

CMake Error at gpukernels_generated_Operators.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/core/./gpukernels_generated_Operators.cu.o

make[2]: *** [CMakeFiles/gpukernels.dir/build.make:84: CMakeFiles/gpukernels.dir/core/gpukernels_generated_Coulomb.cu.o] Error 1
make[2]: *** [CMakeFiles/gpukernels.dir/build.make:91: CMakeFiles/gpukernels.dir/core/gpukernels_generated_Operators.cu.o] Error 1
nvcc fatal   : Unsupported NVHPC compiler found. nvc++ is the only NVHPC compiler that is supported.
[  2%] Building CXX object CMakeFiles/jdftxlib.dir/commands/coulomb_interaction.cpp.o
[  2%] Building CXX object CMakeFiles/jdftxlib.dir/commands/debug.cpp.o
CMake Error at gpukernels_generated_matrixOperators.cu.o.cmake:220 (message):
  Error generating
  /home/cbu/jdftx/build/CMakeFiles/gpukernels.dir/core/./gpukernels_generated_matrixOperators.cu.o

make[2]: *** [CMakeFiles/gpukernels.dir/build.make:98: CMakeFiles/gpukernels.dir/core/gpukernels_generated_matrixOperators.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:204: CMakeFiles/gpukernels.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[  2%] Building CXX object CMakeFiles/jdftxlib.dir/commands/density_of_states.cpp.o
[  3%] Building CXX object CMakeFiles/jdftxlib.dir/commands/dump.cpp.o
[  3%] Building CXX object CMakeFiles/jdftxlib.dir/commands/elec_ex_corr.cpp.o
[  3%] Building CXX object CMakeFiles/jdftxlib.dir/commands/elec_fillings.cpp.o
^C
Compilation terminated.
cleaning up after signal(2)...

cleaning up after signal(2)...
cleaning up after signal(2)...

cleaning up after signal(2)...

cleaning up after signal(2)...
cleaning up after signal(2)...
Compilation terminated.
cleaning up after signal(2)...

Compilation terminated.
Compilation terminated.
Compilation terminated.
Compilation terminated.
Compilation terminated.

Compilation terminated.
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:76: CMakeFiles/jdftxlib.dir/commands/command.cpp.o] Interrupt
make[2]: *** [pseudopotentials/CMakeFiles/PseudopotentialLibrary.dir/build.make:70: pseudopotentials/CMakeFiles/PseudopotentialLibrary] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:90: CMakeFiles/jdftxlib.dir/commands/coulomb_interaction.cpp.o] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:104: CMakeFiles/jdftxlib.dir/commands/debug.cpp.o] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:118: CMakeFiles/jdftxlib.dir/commands/density_of_states.cpp.o] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:132: CMakeFiles/jdftxlib.dir/commands/dump.cpp.o] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:146: CMakeFiles/jdftxlib.dir/commands/elec_ex_corr.cpp.o] Interrupt
make[2]: *** [CMakeFiles/jdftxlib.dir/build.make:160: CMakeFiles/jdftxlib.dir/commands/elec_fillings.cpp.o] Interrupt
make[1]: *** [CMakeFiles/Makefile2:1195: pseudopotentials/CMakeFiles/PseudopotentialLibrary.dir/all] Interrupt
make[1]: *** [CMakeFiles/Makefile2:179: CMakeFiles/jdftxlib.dir/all] Interrupt
make: *** [Makefile:146: all] Interrupt

cbu@polaris-login-04:~/jdftx/build>

ColinBundschu commented 4 months ago

It is worth noting that nvcc has been officially deprecated on Polaris and will not be supported anymore, and users are required to switch to nvc++ and nvc. https://docs.alcf.anl.gov/polaris/compiling-and-linking/nvidia-compiler-polaris/

ColinBundschu commented 4 months ago

If I make the following changes, the build completes, but the tests do not run properly. (Note that I am setting CC and CXX in my .bashrc.)

diff --git a/jdftx/CMakeLists.txt b/jdftx/CMakeLists.txt
index 03044be6..dbab03bb 100644
--- a/jdftx/CMakeLists.txt
+++ b/jdftx/CMakeLists.txt
@@ -3,6 +3,10 @@ cmake_policy(VERSION 3.12)

 project(JDFTx)

+find_package(CUDA REQUIRED)
+set(CUDA_HOST_COMPILER "/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/bin/nvc++")
+set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS};-ccbin=${CUDA_HOST_COMPILER}")
+
 set(CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/CMake-Modules/")

 #Package configuration:
@@ -257,8 +261,8 @@ option(EnableCuSolver "Whether to use cuSolver GPU LAPACK (Requires CUDA >= 9)"

 if(EnableCUDA)
        find_package(CUDA REQUIRED)
-       set(CUDA_ARCH "compute_35" CACHE STRING "CUDA virtual architecture to compile for")
-       set(CUDA_CODE "sm_35" CACHE STRING "CUDA gpu feature set version (sm_*) to compile for")
+       set(CUDA_ARCH "compute_80" CACHE STRING "CUDA virtual architecture to compile for")
+       set(CUDA_CODE "sm_80" CACHE STRING "CUDA gpu feature set version (sm_*) to compile for")
        set(CUDA_AUX_LIBRARIES ${CUDA_CUBLAS_LIBRARIES} ${CUDA_CUFFT_LIBRARIES})

        #remove libcuda.so from CUDA_LIBRARIES and save to CUDART_LIBRARY

shankar1729 commented 4 months ago

It may be that the FindCUDA.cmake mechanism is now deprecated in CMake: https://cmake.org/cmake/help/latest/module/FindCUDA.html. I suspect that we'd need to port the CMake files to use the newer CMake support for CUDA as a language instead of a library.

ColinBundschu commented 4 months ago

How long do you think that would take? Right now I am working with a team at LANL to put together a proposal to use JDFTx in a massive study. We need benchmarks in the next couple of weeks to put the proposal together for June, and we are targeting 20% of the Polaris supercomputer. I think it could be a big win for both us and JDFTx if we can make it happen on that timeline so we can submit the proposal this cycle.

ColinBundschu commented 4 months ago

This is the test output, in case it is of any help:

cbu@x3006c0s25b1n0:~/jdftx/build> make testclean
Built target testclean
cbu@x3006c0s25b1n0:~/jdftx/build> export JDFTX_LAUNCH=""
cbu@x3006c0s25b1n0:~/jdftx/build> export JDFTX_SUFFIX="_gpu"
cbu@x3006c0s25b1n0:~/jdftx/build> make test
Running tests...
Test project /home/cbu/jdftx/build
      Start  1: openShell
 1/10 Test  #1: openShell ........................***Failed    6.93 sec
      Start  2: vibrations
 2/10 Test  #2: vibrations .......................***Failed    7.57 sec
      Start  3: moleculeSolvation
 3/10 Test  #3: moleculeSolvation ................***Failed    3.75 sec
      Start  4: ionSolvation
 4/10 Test  #4: ionSolvation .....................***Failed    2.68 sec
      Start  5: latticeOpt
 5/10 Test  #5: latticeOpt .......................***Failed    3.07 sec
      Start  6: metalBulk
 6/10 Test  #6: metalBulk ........................***Failed    8.16 sec
      Start  7: plusU
 7/10 Test  #7: plusU ............................***Failed    4.13 sec
      Start  8: spinOrbit
 8/10 Test  #8: spinOrbit ........................***Failed    6.95 sec
      Start  9: graphene
 9/10 Test  #9: graphene .........................***Failed    2.62 sec
      Start 10: metalSurface
10/10 Test #10: metalSurface .....................***Failed    2.92 sec

0% tests passed, 10 tests failed out of 10

Total Test time (real) =  48.82 sec

The following tests FAILED:
          1 - openShell (Failed)
          2 - vibrations (Failed)
          3 - moleculeSolvation (Failed)
          4 - ionSolvation (Failed)
          5 - latticeOpt (Failed)
          6 - metalBulk (Failed)
          7 - plusU (Failed)
          8 - spinOrbit (Failed)
          9 - graphene (Failed)
         10 - metalSurface (Failed)
Errors while running CTest
Output from these tests are in: /home/cbu/jdftx/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [Makefile:71: test] Error 8
cbu@x3006c0s25b1n0:~/jdftx/build> cat /home/cbu/jdftx/build/Testing/Temporary/LastTest.log
Start testing: May 08 20:30 UTC
----------------------------------------------------------
1/10 Testing: openShell
1/10 Test: openShell
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "openShell" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"openShell" start time: May 08 20:30 UTC
Output:
----------------------------------------------------------
launch=""
<end of output>
Test time =   6.93 sec
----------------------------------------------------------
Test Failed.
"openShell" end time: May 08 20:30 UTC
"openShell" time elapsed: 00:00:06
----------------------------------------------------------

2/10 Testing: vibrations
2/10 Test: vibrations
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "vibrations" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"vibrations" start time: May 08 20:30 UTC
Output:
----------------------------------------------------------
launch=""

ERROR: Ions H #0 and H #1 are on top of eachother.

Failed.
<end of output>
Test time =   7.57 sec
----------------------------------------------------------
Test Failed.
"vibrations" end time: May 08 20:30 UTC
"vibrations" time elapsed: 00:00:07
----------------------------------------------------------

3/10 Testing: moleculeSolvation
3/10 Test: moleculeSolvation
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "moleculeSolvation" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"moleculeSolvation" start time: May 08 20:30 UTC
Output:
----------------------------------------------------------
launch=""
Atom 2 lies within the margin of 5 bohrs from the truncation boundary.
Expand unit cell, or if absolutely sure, reduce coulomb-truncation-ion-margin.
Failed.
<end of output>
Test time =   3.75 sec
----------------------------------------------------------
Test Failed.
"moleculeSolvation" end time: May 08 20:30 UTC
"moleculeSolvation" time elapsed: 00:00:03
----------------------------------------------------------

4/10 Testing: ionSolvation
4/10 Test: ionSolvation
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "ionSolvation" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"ionSolvation" start time: May 08 20:30 UTC
Output:
----------------------------------------------------------
launch=""
/home/cbu/jdftx/jdftx-git/jdftx/core/RadialFunction.cpp:140: initWeights:
        Assertion 'r[i+1]>r[i]' failedMPICH ERROR [Rank 0] [job id ] [Wed May  8 20:31:01 2024] [x3006c0s25b1n0] - Abort(1) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

<end of output>
Test time =   2.68 sec
----------------------------------------------------------
Test Failed.
"ionSolvation" end time: May 08 20:31 UTC
"ionSolvation" time elapsed: 00:00:02
----------------------------------------------------------

5/10 Testing: latticeOpt
5/10 Test: latticeOpt
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "latticeOpt" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"latticeOpt" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
<end of output>
Test time =   3.07 sec
----------------------------------------------------------
Test Failed.
"latticeOpt" end time: May 08 20:31 UTC
"latticeOpt" time elapsed: 00:00:03
----------------------------------------------------------

6/10 Testing: metalBulk
6/10 Test: metalBulk
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "metalBulk" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"metalBulk" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
<end of output>
Test time =   8.16 sec
----------------------------------------------------------
Test Failed.
"metalBulk" end time: May 08 20:31 UTC
"metalBulk" time elapsed: 00:00:08
----------------------------------------------------------

7/10 Testing: plusU
7/10 Test: plusU
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "plusU" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"plusU" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
<end of output>
Test time =   4.13 sec
----------------------------------------------------------
Test Failed.
"plusU" end time: May 08 20:31 UTC
"plusU" time elapsed: 00:00:04
----------------------------------------------------------

8/10 Testing: spinOrbit
8/10 Test: spinOrbit
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "spinOrbit" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"spinOrbit" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
<end of output>
Test time =   6.95 sec
----------------------------------------------------------
Test Failed.
"spinOrbit" end time: May 08 20:31 UTC
"spinOrbit" time elapsed: 00:00:06
----------------------------------------------------------

9/10 Testing: graphene
9/10 Test: graphene
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "graphene" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"graphene" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
Separation between atoms 1 and 1 lies within the margin of 5 bohrs from the Wigner-Seitz boundary.
Expand unit cell, or if absolutely sure, reduce coulomb-truncation-ion-margin.
Failed.
<end of output>
Test time =   2.62 sec
----------------------------------------------------------
Test Failed.
"graphene" end time: May 08 20:31 UTC
"graphene" time elapsed: 00:00:02
----------------------------------------------------------

10/10 Testing: metalSurface
10/10 Test: metalSurface
Command: "/home/cbu/jdftx/jdftx-git/jdftx/test/runTest.sh" "metalSurface" "/home/cbu/jdftx/jdftx-git/jdftx/test" "/home/cbu/jdftx/build/test" "/home/cbu/jdftx/build"
Directory: /home/cbu/jdftx/build/test
"metalSurface" start time: May 08 20:31 UTC
Output:
----------------------------------------------------------
launch=""
/home/cbu/jdftx/jdftx-git/jdftx/core/RadialFunction.cpp:140: initWeights:
        Assertion 'r[i+1]>r[i]' failedMPICH ERROR [Rank 0] [job id ] [Wed May  8 20:31:28 2024] [x3006c0s25b1n0] - Abort(1) (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

<end of output>
Test time =   2.92 sec
----------------------------------------------------------
Test Failed.
"metalSurface" end time: May 08 20:31 UTC
"metalSurface" time elapsed: 00:00:02
----------------------------------------------------------

End testing: May 08 20:31 UTC
cbu@x3006c0s25b1n0:~/jdftx/build>
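
A log like the one above is quicker to read as one entry per test. The following is a minimal Python sketch, assuming the `LastTest.log` layout shown here (an `N/M Testing: name` line, then the captured output between `Output:` and `<end of output>`). The function name `summarize_ctest_log` is made up for illustration and is not part of CTest.

```python
import re


def summarize_ctest_log(text):
    """Map each test name to the non-empty lines CTest captured as its output."""
    summary = {}
    name, in_output = None, False
    for line in text.splitlines():
        stripped = line.strip()
        match = re.match(r"\s*\d+/\d+ Testing: (\S+)", line)
        if match:
            # A new per-test block starts here.
            name, in_output = match.group(1), False
            summary[name] = []
        elif name and stripped == "Output:":
            in_output = True
        elif stripped == "<end of output>":
            in_output = False
        elif in_output and stripped and not set(stripped) <= {"-"}:
            # Keep real output; skip blank lines and the dashed separator.
            summary[name].append(stripped)
    return summary
```

Applied to the log above, tests such as openShell would map to just the `launch=""` line, which makes it easy to see which tests died without printing any message.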

shankar1729 commented 4 months ago

Look at/post the full log files within the test subdirectory of the build. The real error messages are likely buried there, and may have something to do with the launch of the executable if they all failed.
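
For anyone triaging a similar all-tests-fail result, a minimal Python sketch for pulling the likely error lines out of every log under the build's test subdirectory follows. The function name `scan_test_logs` and the pattern list are assumptions made for illustration (the patterns are only strings that appear in the failures quoted in this thread), and it assumes the logs end in `.out`, as `H2_geometry.out` does.

```python
import re
from pathlib import Path

# Strings seen in the failure messages quoted in this thread; extend as needed.
ERROR_PATTERNS = re.compile(r"ERROR|Assertion|Failed|MPI_Abort|margin")


def scan_test_logs(test_root):
    """Return {log path: [matching lines]} for every *.out file under test_root."""
    hits = {}
    for log in sorted(Path(test_root).rglob("*.out")):
        # errors="replace" keeps the scan going if a log contains stray bytes.
        text = log.read_text(errors="replace")
        lines = [ln.rstrip() for ln in text.splitlines() if ERROR_PATTERNS.search(ln)]
        if lines:
            hits[str(log)] = lines
    return hits
```

Calling `scan_test_logs("/home/cbu/jdftx/build/test")` would return a dictionary of per-file matches. It is only a filter, so the full logs are still worth reading around each hit.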

ColinBundschu commented 4 months ago

Here are the results from the vibrations test. The geometry minimization seems to have simply stacked the two atoms on top of each other, which caused the subsequent test to fail. The energy seems to have been markedly decreased by this stacking, which suggests to me that the code is not properly accounting for Coulombic repulsion of the nuclei.
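
As a rough sanity check on that reasoning, here is a small Python sketch of the bare point-charge repulsion between the two protons, using the starting positions echoed in the `ion` lines of `H2_geometry.out` (Cartesian, in bohr). The helper `point_charge_repulsion` is hypothetical and deliberately ignores everything JDFTx actually does (pseudopotentials, the truncated Coulomb kernel). It only illustrates that a 1/r ion-ion term should grow without bound as the atoms approach, so an energy that drops on stacking is unphysical.

```python
import math


def point_charge_repulsion(pos_a, pos_b, z_a=1.0, z_b=1.0):
    """Coulomb energy in Hartree of two point charges at positions given in bohr."""
    r = math.dist(pos_a, pos_b)
    if r == 0.0:
        return math.inf  # coincident point charges: the energy diverges
    return z_a * z_b / r


# Starting H positions from the `ion` lines echoed in H2_geometry.out (bohr).
h0 = (0.0, 6.1, 0.7)
h1 = (0.0, 5.8, -0.6)
print(f"initial separation: {math.dist(h0, h1):.4f} bohr")
print(f"initial repulsion:  {point_charge_repulsion(h0, h1):.4f} Eh")

# The repulsion rises steeply as the separation shrinks.
for r in (1.0, 0.5, 0.1, 0.01):
    energy = point_charge_repulsion((0.0, 0.0, 0.0), (0.0, 0.0, r))
    print(f"r = {r:5.2f} bohr -> {energy:8.2f} Eh")
```

If the total energy in the log keeps falling while the separation goes to zero, that points at the ion-ion (or ion-electron) terms being evaluated incorrectly in this build rather than at the test input.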

Geometry:

cbu@polaris-login-01:~/jdftx/build/test/vibrations> cat H2_geometry.out

*************** JDFTx 1.7.0 (git hash 19235885) ***************

Start date and time: Wed May  8 20:51:13 2024
Executable /home/cbu/jdftx/build/jdftx_gpu with command-line: -i /home/cbu/jdftx/jdftx-git/jdftx/test/vibrations/H2_geometry.in -d -o H2_geometry.out
Running on hosts (process indices):  x3006c0s25b1n0 (0)
Divided in process groups (process indices):  0 (0)
gpuInit: Found compatible cuda device 0 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 1 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 2 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 3 'NVIDIA A100-SXM4-40GB'
gpuInit: Selected device 0
Resource initialization completed at t[s]:      1.57
Run totals: 1 processes, 32 threads, 1 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Cartesian
core-overlap-check none
coulomb-interaction Isolated
coulomb-truncation-embed 0 5.95 0.05
davidson-band-ratio 1.1
dump End None IonicPositions
dump
dump-name H2.$VAR
elec-cutoff 20 100
elec-eigen-algo Davidson
elec-ex-corr gga-PBE
electronic-minimize  \
        dirUpdateScheme      FletcherReeves \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-08 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
electronic-scf  \
        nIterations     50 \
        energyDiffThreshold     1e-08 \
        residualThreshold       1e-07 \
        mixFraction     0.5 \
        qMetric 0.8 \
        history 10 \
        nEigSteps       2 \
        eigDiffThreshold        1e-08 \
        mixedVariable   Density \
        qKerker 0.8 \
        qKappa  -1 \
        verbose no \
        mixFractionMag  1.5
exchange-regularization None
fluid None
fluid-ex-corr lda-TF lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize  \
        dirUpdateScheme      PolakRibiere \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  0 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
fluid-solvent H2O 55.338 ScalarEOS \
        epsBulk 78.4 \
        pMol 0.92466 \
        epsInf 1.77 \
        Pvap 1.06736e-10 \
        sigmaBulk 4.62e-05 \
        Rvdw 2.61727 \
        Res 1.42 \
        tauNuc 343133 \
        poleEl 15 7 1
forces-output-coords Positions
ion H   0.000000000000000   6.100000000000000   0.700000000000000 1
ion H   0.000000000000000   5.799999999999999  -0.600000000000000 1
ion-species GBRV/$ID_pbe.uspp
ion-width 0
ionic-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          10 \
        history              15 \
        knormThreshold       0.0001 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
kpoint   0.000000000000   0.000000000000   0.000000000000  1.00000000000000
kpoint-folding 1 1 1
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Cubic 12
lattice-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
lcao-params -1 1e-06 0.001
pcm-variant GLSSA13
perturb-minimize  \
        nIterations            0 \
        algorithm              MINRES \
        residualTol            0.0001 \
        residualDiffThreshold  0.0001 \
        CGBypass               no \
        recomputeResidual      no
spintype no-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001

---------- Setting up symmetries ----------

Found 48 point-group symmetries of the bravais lattice
Found 4 space-group symmetries with basis
Applied RMS atom displacement 0 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  56  56  56  ]
Chosen fftbox size, S = [  56  56  56  ]

---------- Initializing tighter grid for wavefunction operations ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  52  52  52  ]
Chosen fftbox size, S = [  54  54  54  ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/home/cbu/jdftx/build/pseudopotentials/GBRV/h_pbe.uspp':
  Title: H.  Created by USPP 7.3.6 on 2-4-15
  Reference state energy: -0.458849.  1 valence electrons in orbitals:
    |100>  occupation: 1  eigenvalue: -0.238595
  lMax: 0  lLocal: 1  QijEcut: 6
  2 projectors sampled on a log grid with 395 points:
    l: 0  eig: -0.238595  rCut: 1.2
    l: 0  eig: 1.000000  rCut: 1.2
  Transforming local potential to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 432 points.
  Transforming density augmentations to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 432 points.
  Core radius for overlap checks: 1.20 bohrs.

Initialized 1 species with 2 total atoms.

Folded 1 k-points by 1x1x1 to 1 k-points.

---------- Setting up k-points, bands, fillings ----------
No reducable k-points.
Computing the number of bands and number of electrons
Calculating initial fillings.
nElectrons:   2.000000   nBands: 1   nStates: 1

----- Setting up reduced wavefunction bases (one per k-point) -----
average nbasis = 7249.000 , ideal nbasis = 7382.148

---------- Setting up coulomb interaction ----------
Setting up double-sized grid for truncated Coulomb potentials:
R =
[           24            0            0  ]
[            0           24            0  ]
[            0            0           24  ]
unit cell volume = 13824
G =
[   0.261799          0          0  ]
[          0   0.261799          0  ]
[          0          0   0.261799  ]
Chosen fftbox size, S = [  112  112  112  ]
Integer grid location selected as the embedding center:
   Grid: [  0  28  0  ]
   Lattice: [  0  0.495833  0.00416667  ]
   Cartesian: [  0  5.95  0.05  ]
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Range-separation parameter for embedded mesh potentials due to point charges: 0.583992 bohrs.
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Gaussian width for range separation: 1.26443 bohrs.
FFT grid for long-range part: [ 112 112 112 ].
Planning fourier transform ... Done.
Computing truncated long-range part in real space ... Done.
Adding short-range part in reciprocal space ... Done.

---------- Allocating electronic variables ----------
Initializing wave functions:  linear combination of atomic orbitals
H pseudo-atom occupations:   s ( 1 )
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   0  Etot: -1.8376329335123189  |grad|_K:  8.335e-05  alpha:  1.000e+00
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   1  Etot: -1.8376329649578773  |grad|_K:  2.278e-07  alpha:  1.006e+00  linmin:  4.589e-02  cgtest: -9.881e-01  t[s]:      1.82
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   2  Etot: -1.8376329649578789  |grad|_K:  2.266e-07  alpha:  7.779e-01  linmin: -3.148e-05  cgtest:  9.947e-01  t[s]:      1.85
LCAOMinimize: Encountered beta<0, resetting CG.
LCAOMinimize: Converged (|Delta Etot|<1.000000e-06 for 2 iters).

---- Citations for features of the code used in this run ----

   Software package:
      R. Sundararaman, K. Letchworth-Weaver, K.A. Schwarz, D. Gunceler, Y. Ozhabes and T.A. Arias, 'JDFTx: software for joint density-functional theory', SoftwareX 6, 278 (2017)

   gga-PBE exchange-correlation functional:
      J.P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)

   Pseudopotentials:
      KF Garrity, JW Bennett, KM Rabe and D Vanderbilt, Comput. Mater. Sci. 81, 446 (2014)

   Truncated Coulomb potentials:
      R. Sundararaman and T.A. Arias, Phys. Rev. B 87, 165122 (2013)

This list may not be complete. Please suggest additional citations or
report any other bugs at https://github.com/shankar1729/jdftx/issues

Initialization completed successfully at t[s]:      1.86

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -1.904249545252813   dEtot: -6.662e-02   |Residual|: 1.684e-01   |deigs|: 5.044e-02  t[s]:      1.90
SCF: Cycle:  1   Etot: -1.909747571461371   dEtot: -5.498e-03   |Residual|: 8.714e-02   |deigs|: 5.164e-02  t[s]:      1.94
SCF: Cycle:  2   Etot: -1.912581025563360   dEtot: -2.833e-03   |Residual|: 1.701e-02   |deigs|: 6.453e-02  t[s]:      1.97
SCF: Cycle:  3   Etot: -1.912539290616327   dEtot: +4.173e-05   |Residual|: 7.439e-03   |deigs|: 2.128e-03  t[s]:      2.00
SCF: Cycle:  4   Etot: -1.912598868672670   dEtot: -5.958e-05   |Residual|: 2.900e-03   |deigs|: 1.232e-02  t[s]:      2.04
SCF: Cycle:  5   Etot: -1.912598103401018   dEtot: +7.653e-07   |Residual|: 2.463e-03   |deigs|: 2.555e-03  t[s]:      2.07
SCF: Cycle:  6   Etot: -1.912602121870508   dEtot: -4.018e-06   |Residual|: 5.919e-03   |deigs|: 5.033e-03  t[s]:      2.11
SCF: Cycle:  7   Etot: -1.912609276485322   dEtot: -7.155e-06   |Residual|: 5.619e-04   |deigs|: 3.031e-03  t[s]:      2.14
SCF: Cycle:  8   Etot: -1.912610297224868   dEtot: -1.021e-06   |Residual|: 2.583e-04   |deigs|: 9.419e-05  t[s]:      2.17
SCF: Cycle:  9   Etot: -1.912610381967555   dEtot: -8.474e-08   |Residual|: 6.159e-05   |deigs|: 3.186e-04  t[s]:      2.21
SCF: Cycle: 10   Etot: -1.912610376206672   dEtot: +5.761e-09   |Residual|: 2.467e-04   |deigs|: 3.076e-04  t[s]:      2.24
SCF: Cycle: 11   Etot: -1.912610407078776   dEtot: -3.087e-08   |Residual|: 1.270e-05   |deigs|: 2.406e-04  t[s]:      2.28
SCF: Cycle: 12   Etot: -1.912610407582477   dEtot: -5.037e-10   |Residual|: 2.026e-05   |deigs|: 2.496e-05  t[s]:      2.31
SCF: Cycle: 13   Etot: -1.912610407627083   dEtot: -4.461e-11   |Residual|: 1.542e-05   |deigs|: 8.762e-06  t[s]:      2.34
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.100000000000000   0.700000000000000 1
ion H   0.000000000000000   5.799999999999999  -0.600000000000000 1

# Forces in Cartesian coordinates:
force H   0.000000000000000  -0.115863210857906  -0.502060515389305 1
force H   0.000000000000000   0.115863210857906   0.502060515389305 1

# Energy components:
   Eewald =        0.0000000000000000
       EH =        1.3330816794631113
     Eloc =       -3.4393520248997009
      Enl =       -0.0591448787908532
      Exc =       -0.6998246806543235
       KE =        0.9526294972546838
-------------------------------------
     Etot =       -1.9126104076270829

IonicMinimize: Iter:   0  Etot: -1.912610407627083  |grad|_K:  2.975e-01  t[s]:      2.36

#--- Lowdin population analysis ---
# oxidation-state H +0.044 +0.044

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -2.021402509970021   dEtot: -8.062e-03   |Residual|: 5.705e-02   |deigs|: 5.280e-03  t[s]:      2.40
SCF: Cycle:  1   Etot: -2.022030814420550   dEtot: -6.283e-04   |Residual|: 2.584e-02   |deigs|: 1.474e-02  t[s]:      2.43
SCF: Cycle:  2   Etot: -2.022184360569111   dEtot: -1.535e-04   |Residual|: 2.953e-03   |deigs|: 1.295e-02  t[s]:      2.46
SCF: Cycle:  3   Etot: -2.022186249459003   dEtot: -1.889e-06   |Residual|: 1.157e-03   |deigs|: 1.385e-03  t[s]:      2.49
SCF: Cycle:  4   Etot: -2.022186889976192   dEtot: -6.405e-07   |Residual|: 4.669e-04   |deigs|: 9.322e-04  t[s]:      2.52
SCF: Cycle:  5   Etot: -2.022187131597323   dEtot: -2.416e-07   |Residual|: 2.357e-04   |deigs|: 6.930e-04  t[s]:      2.56
SCF: Cycle:  6   Etot: -2.022187245916471   dEtot: -1.143e-07   |Residual|: 3.874e-04   |deigs|: 3.832e-04  t[s]:      2.59
SCF: Cycle:  7   Etot: -2.022187291404419   dEtot: -4.549e-08   |Residual|: 7.332e-05   |deigs|: 1.337e-04  t[s]:      2.62
SCF: Cycle:  8   Etot: -2.022187318629253   dEtot: -2.722e-08   |Residual|: 8.700e-05   |deigs|: 7.608e-05  t[s]:      2.66
SCF: Cycle:  9   Etot: -2.022187321248311   dEtot: -2.619e-09   |Residual|: 1.346e-05   |deigs|: 4.562e-05  t[s]:      2.69
SCF: Cycle: 10   Etot: -2.022187322004727   dEtot: -7.564e-10   |Residual|: 1.687e-05   |deigs|: 7.119e-06  t[s]:      2.73
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian
IonicMinimize:  Wolfe criterion not satisfied: alpha: 0.194078  (E-E0)/|gdotd0|: -0.206368  gdotd/gdotd0: 1.12742 (taking cubic step)

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -1.728454648100504   dEtot: -1.912e-02   |Residual|: 8.360e-02   |deigs|: 1.312e-02  t[s]:      2.78
SCF: Cycle:  1   Etot: -1.730209582888631   dEtot: -1.755e-03   |Residual|: 3.847e-02   |deigs|: 2.069e-02  t[s]:      2.81
SCF: Cycle:  2   Etot: -1.730597423783121   dEtot: -3.878e-04   |Residual|: 3.881e-03   |deigs|: 1.813e-02  t[s]:      2.85
SCF: Cycle:  3   Etot: -1.730607966903391   dEtot: -1.054e-05   |Residual|: 2.110e-03   |deigs|: 2.711e-03  t[s]:      2.88
SCF: Cycle:  4   Etot: -1.730613043799341   dEtot: -5.077e-06   |Residual|: 9.040e-04   |deigs|: 1.737e-03  t[s]:      2.92
SCF: Cycle:  5   Etot: -1.730614842954628   dEtot: -1.799e-06   |Residual|: 9.564e-04   |deigs|: 1.836e-03  t[s]:      2.95
SCF: Cycle:  6   Etot: -1.730614922182272   dEtot: -7.923e-08   |Residual|: 5.030e-04   |deigs|: 3.860e-04  t[s]:      2.98
SCF: Cycle:  7   Etot: -1.730615175749016   dEtot: -2.536e-07   |Residual|: 2.033e-04   |deigs|: 2.867e-04  t[s]:      3.02
SCF: Cycle:  8   Etot: -1.730615199876167   dEtot: -2.413e-08   |Residual|: 7.055e-05   |deigs|: 5.504e-05  t[s]:      3.05
SCF: Cycle:  9   Etot: -1.730615203697499   dEtot: -3.821e-09   |Residual|: 2.151e-05   |deigs|: 7.197e-05  t[s]:      3.08
SCF: Cycle: 10   Etot: -1.730615204726150   dEtot: -1.029e-09   |Residual|: 1.556e-05   |deigs|: 8.190e-06  t[s]:      3.11
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian
IonicMinimize:  Wolfe criterion not satisfied: alpha: -0.388156  (E-E0)/|gdotd0|: 0.342755  gdotd/gdotd0: 0.774166 (taking cubic step)

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -2.580426700531026   dEtot: -3.953e-01   |Residual|: 5.178e-01   |deigs|: 2.726e-01  t[s]:      3.16
SCF: Cycle:  1   Etot: -2.602989294729117   dEtot: -2.256e-02   |Residual|: 2.433e-01   |deigs|: 1.437e-01  t[s]:      3.20
SCF: Cycle:  2   Etot: -2.614976600864398   dEtot: -1.199e-02   |Residual|: 4.802e-02   |deigs|: 1.568e-01  t[s]:      3.23
SCF: Cycle:  3   Etot: -2.614222830989691   dEtot: +7.538e-04   |Residual|: 5.768e-02   |deigs|: 2.554e-02  t[s]:      3.26
SCF: Cycle:  4   Etot: -2.614920703385500   dEtot: -6.979e-04   |Residual|: 1.178e-02   |deigs|: 3.140e-02  t[s]:      3.30
SCF: Cycle:  5   Etot: -2.614987764871225   dEtot: -6.706e-05   |Residual|: 8.560e-03   |deigs|: 1.270e-02  t[s]:      3.33
SCF: Cycle:  6   Etot: -2.614990858324562   dEtot: -3.093e-06   |Residual|: 3.485e-03   |deigs|: 1.276e-03  t[s]:      3.36
SCF: Cycle:  7   Etot: -2.614996942311987   dEtot: -6.084e-06   |Residual|: 4.396e-04   |deigs|: 4.568e-04  t[s]:      3.40
SCF: Cycle:  8   Etot: -2.614997600507920   dEtot: -6.582e-07   |Residual|: 4.822e-04   |deigs|: 6.314e-05  t[s]:      3.43
SCF: Cycle:  9   Etot: -2.614997747890890   dEtot: -1.474e-07   |Residual|: 1.246e-04   |deigs|: 4.927e-05  t[s]:      3.46
SCF: Cycle: 10   Etot: -2.614997765113833   dEtot: -1.722e-08   |Residual|: 1.400e-05   |deigs|: 5.347e-05  t[s]:      3.49
SCF: Cycle: 11   Etot: -2.614997766455851   dEtot: -1.342e-09   |Residual|: 7.898e-05   |deigs|: 9.263e-06  t[s]:      3.52
SCF: Cycle: 12   Etot: -2.614997767204158   dEtot: -7.483e-10   |Residual|: 4.473e-06   |deigs|: 3.221e-05  t[s]:      3.56
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   5.942594357177585   0.017927083364497 1
ion H   0.000000000000000   5.957405642822414   0.082072916635503 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.025280163943094   0.109492041713793 1
force H   0.000000000000000  -0.025280163943094  -0.109492041713793 1

# Energy components:
   Eewald =        0.0000000000000002
       EH =        1.8231402172903257
     Eloc =       -4.9113088831071217
      Enl =        0.0338888980186783
      Exc =       -0.9386064694026413
       KE =        1.3778884699966008
-------------------------------------
     Etot =       -2.6149977672041582

IonicMinimize: Iter:   1  Etot: -2.614997767204158  |grad|_K:  6.488e-02  alpha:  1.359e+00  linmin:  1.000e+00  t[s]:      3.57

#--- Lowdin population analysis ---
# oxidation-state H +0.118 +0.118

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -2.603432559287529   dEtot: -2.704e-05   |Residual|: 4.953e-03   |deigs|: 2.032e-05  t[s]:      3.61
SCF: Cycle:  1   Etot: -2.603435978443060   dEtot: -3.419e-06   |Residual|: 1.931e-03   |deigs|: 1.388e-03  t[s]:      3.64
SCF: Cycle:  2   Etot: -2.603436695621731   dEtot: -7.172e-07   |Residual|: 1.584e-04   |deigs|: 9.629e-04  t[s]:      3.67
SCF: Cycle:  3   Etot: -2.603436702843053   dEtot: -7.221e-09   |Residual|: 4.693e-05   |deigs|: 9.212e-05  t[s]:      3.70
SCF: Cycle:  4   Etot: -2.603436708072381   dEtot: -5.229e-09   |Residual|: 3.234e-05   |deigs|: 5.336e-05  t[s]:      3.73
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian
IonicMinimize:  Wolfe criterion not satisfied: alpha: 0.797893  (E-E0)/|gdotd0|: 0.410442  gdotd/gdotd0: -2.0022 (taking cubic step)

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -2.618702421434915   dEtot: -2.623e-05   |Residual|: 5.185e-03   |deigs|: 2.093e-05  t[s]:      3.79
SCF: Cycle:  1   Etot: -2.618706182870123   dEtot: -3.761e-06   |Residual|: 2.016e-03   |deigs|: 1.495e-03  t[s]:      3.82
SCF: Cycle:  2   Etot: -2.618707030074561   dEtot: -8.472e-07   |Residual|: 1.620e-04   |deigs|: 1.051e-03  t[s]:      3.85
SCF: Cycle:  3   Etot: -2.618707034507517   dEtot: -4.433e-09   |Residual|: 1.006e-04   |deigs|: 1.102e-04  t[s]:      3.87
SCF: Cycle:  4   Etot: -2.618707038813709   dEtot: -4.306e-09   |Residual|: 2.653e-05   |deigs|: 4.878e-05  t[s]:      3.91
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   5.949955447375585   0.049811759727932 1
ion H   0.000000000000000   5.950044552624414   0.050188240272068 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.000152946528833   0.000646260431109 1
force H   0.000000000000000  -0.000152946528833  -0.000646260431109 1

# Energy components:
   Eewald =        0.0000000000000292
       EH =        1.8257450960557711
     Eloc =       -4.9190429224746488
      Enl =        0.0348894955211563
      Exc =       -0.9399474879827701
       KE =        1.3796487800667530
-------------------------------------
     Etot =       -2.6187070388137093

IonicMinimize: Iter:   2  Etot: -2.618707038813709  |grad|_K:  3.834e-04  alpha:  2.611e-01  linmin: -1.000e+00  t[s]:      3.92

#--- Lowdin population analysis ---
# oxidation-state H +0.118 +0.118

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -2.618707169440226   dEtot: -2.169e-09   |Residual|: 6.815e-06   |deigs|: 1.095e-09  t[s]:      3.96
SCF: Cycle:  1   Etot: -2.618707169540441   dEtot: -1.002e-10   |Residual|: 3.130e-06   |deigs|: 2.635e-06  t[s]:      3.99
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   5.950000253439525   0.050001071669412 1
ion H   0.000000000000000   5.949999746560474   0.049998928330588 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.000009257702050  -0.000003679220468 1
force H   0.000000000000000  -0.000009257702050   0.000003679220468 1

# Energy components:
   Eewald =        0.0000000000051335
       EH =        1.8257317884947917
     Eloc =       -4.9190194205760012
      Enl =        0.0348884284672219
      Exc =       -0.9399412400986663
       KE =        1.3796332741670800
-------------------------------------
     Etot =       -2.6187071695404409

IonicMinimize: Iter:   3  Etot: -2.618707169540441  |grad|_K:  5.752e-06  alpha:  1.000e+00  linmin:  1.454e-01  t[s]:      4.00
IonicMinimize: Converged (|grad|_K<1.000000e-04).

#--- Lowdin population analysis ---
# oxidation-state H +0.118 +0.118

Dumping 'H2.ionpos' ... done
End date and time: Wed May  8 20:51:17 2024  (Duration: 0-0:00:04.00)
Done!
cbu@polaris-login-01:~/jdftx/build/test/vibrations>

Vibrations:

cbu@polaris-login-01:~/jdftx/build/test/vibrations> cat H2_vibrations.out

*************** JDFTx 1.7.0 (git hash 19235885) ***************

Start date and time: Wed May  8 20:51:18 2024
Executable /home/cbu/jdftx/build/jdftx_gpu with command-line: -i /home/cbu/jdftx/jdftx-git/jdftx/test/vibrations/H2_vibrations.in -d -o H2_vibrations.out
Running on hosts (process indices):  x3006c0s25b1n0 (0)
Divided in process groups (process indices):  0 (0)
gpuInit: Found compatible cuda device 0 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 1 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 2 'NVIDIA A100-SXM4-40GB'
gpuInit: Found compatible cuda device 3 'NVIDIA A100-SXM4-40GB'
gpuInit: Selected device 0
Resource initialization completed at t[s]:      1.59
Run totals: 1 processes, 32 threads, 1 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Cartesian
core-overlap-check none
coulomb-interaction Isolated
coulomb-truncation-embed 0 5.95 0.05
davidson-band-ratio 1.1
dump End None
dump
dump-name H2.$VAR
elec-cutoff 20 100
elec-eigen-algo Davidson
elec-ex-corr gga-PBE
electronic-minimize  \
        dirUpdateScheme      FletcherReeves \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-08 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
electronic-scf  \
        nIterations     50 \
        energyDiffThreshold     1e-08 \
        residualThreshold       1e-07 \
        mixFraction     0.5 \
        qMetric 0.8 \
        history 10 \
        nEigSteps       2 \
        eigDiffThreshold        1e-08 \
        mixedVariable   Density \
        qKerker 0.8 \
        qKappa  -1 \
        verbose no \
        mixFractionMag  1.5
exchange-regularization None
fluid None
fluid-ex-corr lda-TF lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize  \
        dirUpdateScheme      PolakRibiere \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  0 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
fluid-solvent H2O 55.338 ScalarEOS \
        epsBulk 78.4 \
        pMol 0.92466 \
        epsInf 1.77 \
        Pvap 1.06736e-10 \
        sigmaBulk 4.62e-05 \
        Rvdw 2.61727 \
        Res 1.42 \
        tauNuc 343133 \
        poleEl 15 7 1
forces-output-coords Positions
ion H   0.000000000000000   5.950000253439525   0.050001071669412 1
ion H   0.000000000000000   5.949999746560474   0.049998928330588 1
ion-species GBRV/$ID_pbe.uspp
ion-width 0
ionic-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0.0001 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
kpoint   0.000000000000   0.000000000000   0.000000000000  1.00000000000000
kpoint-folding 1 1 1
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Cubic 12
lattice-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
lcao-params -1 1e-06 0.001
pcm-variant GLSSA13
perturb-minimize  \
        nIterations            0 \
        algorithm              MINRES \
        residualTol            0.0001 \
        residualDiffThreshold  0.0001 \
        CGBypass               no \
        recomputeResidual      no
spintype no-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001
vibrations \
        dr 0.01\
        centralDiff yes\
        useConstraints no\
        translationSym yes\
        rotationSym yes\
        omegaMin 0.0002\
        T 298\
        omegaResolution 0.0001

---------- Setting up symmetries ----------

Found 48 point-group symmetries of the bravais lattice
Found 48 space-group symmetries with basis
Applied RMS atom displacement 6.99182e-17 bohrs to make symmetries exact.
Applied RMS atom displacement 1.55737e-06 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  56  56  56  ]
Chosen fftbox size, S = [  56  56  56  ]

---------- Initializing tighter grid for wavefunction operations ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  52  52  52  ]
Chosen fftbox size, S = [  54  54  54  ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/home/cbu/jdftx/build/pseudopotentials/GBRV/h_pbe.uspp':
  Title: H.  Created by USPP 7.3.6 on 2-4-15
  Reference state energy: -0.458849.  1 valence electrons in orbitals:
    |100>  occupation: 1  eigenvalue: -0.238595
  lMax: 0  lLocal: 1  QijEcut: 6
  2 projectors sampled on a log grid with 395 points:
    l: 0  eig: -0.238595  rCut: 1.2
    l: 0  eig: 1.000000  rCut: 1.2
  Transforming local potential to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 432 points.
  Transforming density augmentations to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 432 points.
  Core radius for overlap checks: 1.20 bohrs.

Initialized 1 species with 2 total atoms.

ERROR: Ions H #0 and H #1 are on top of eachother.

End date and time: Wed May  8 20:51:20 2024  (Duration: 0-0:00:01.65)
Failed.
cbu@polaris-login-01:~/jdftx/build/test/vibrations>

shankar1729 commented 4 months ago

No, the geometry optimization actually overlapped the two H atoms! This is the worst sort of error: it is actually running and leading to wrong results.

Can you check whether this build at least passes the tests on the CPU?
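A collapse like this is easy to miss in a long log, so here is a minimal sketch of how it could be checked automatically. It assumes the output format shown in the H2 log above (a `# Ionic positions in cartesian coordinates:` header followed by `ion` lines); the function names and the 0.5 bohr threshold are hypothetical choices for illustration, not part of JDFTx.

```python
import itertools
import math
import re

# Matches lines such as: "ion H   0.000   6.100   0.700 1"
ION_LINE = re.compile(r"^ion\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)")

def geometries(log_text):
    """Return one list of (species, (x, y, z)) per '# Ionic positions' block."""
    blocks = []
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# Ionic positions"):
            current = []
            blocks.append(current)
            continue
        match = ION_LINE.match(stripped)
        if current is not None and match:
            xyz = tuple(float(v) for v in match.group(2, 3, 4))
            current.append((match.group(1), xyz))
        else:
            # Any other line ends the block; 'ion' lines in the echoed
            # input (no header before them) are ignored.
            current = None
    return blocks

def min_distance(block):
    """Smallest pairwise distance (in bohr) between atoms of one geometry."""
    return min(math.dist(a[1], b[1]) for a, b in itertools.combinations(block, 2))

def collapsed(block, threshold=0.5):
    """Hypothetical sanity check: True if any two atoms are closer than threshold bohr."""
    return min_distance(block) < threshold

# First and last geometries copied from the H2 log in this thread.
SAMPLE = """\
# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.100000000000000   0.700000000000000 1
ion H   0.000000000000000   5.799999999999999  -0.600000000000000 1

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   5.950000253439525   0.050001071669412 1
ion H   0.000000000000000   5.949999746560474   0.049998928330588 1
"""

if __name__ == "__main__":
    for index, block in enumerate(geometries(SAMPLE)):
        print(index, min_distance(block), collapsed(block))
```

On the two geometries above, the H-H distance goes from roughly 1.33 bohr at the start to roughly 2e-6 bohr at the end, which is why the subsequent vibrations run aborts with the "on top of eachother" error.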

ColinBundschu commented 4 months ago

Yes, sorry, I updated my comment above after reading the output more carefully. Here are the openShell results from the CPU, which show the same type of error in the O2 electronic energy, even without any geometry minimization. Note the magnitude of the forces on the O2 in the z direction!
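For anyone comparing many test directories, a `results` file like the one pasted in this comment can be summarized with a short script. This is only a sketch under the assumption that every check line has the form `name: obtained expected [Status]`; the helper names are made up for illustration.

```python
import re

# name: obtained expected [Status]; the header line has no colon and is skipped
ROW = re.compile(r"^\s*(.+?):\s+(\S+)\s+(\S+)\s+\[(\w+)\]\s*$")

def parse_results(text):
    """Parse check rows into dictionaries with float values."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, obtained, expected, status = match.groups()
            rows.append({
                "name": name,
                "obtained": float(obtained),
                "expected": float(expected),
                "status": status,
            })
    return rows

def failures(rows):
    """Return (name, absolute deviation) for every row not marked Passed."""
    return [(r["name"], abs(r["obtained"] - r["expected"]))
            for r in rows if r["status"].lower() != "passed"]

# Table copied from the openShell test in this thread.
SAMPLE_RESULTS = """\
                    Check name       Obtained value      Expected value Status
                 H energy [Eh]: -5.009176671362e-01 -5.009200000000e-01 [Passed]
       H magnetic moment [muB]:  9.970000000000e-01  9.970000000000e-01 [Passed]
                O2 energy [Eh]:  3.545032644716e+03 -3.202300000000e+01 [FAILED]
 O magnetic moment in O2 [muB]:  9.950000000000e-01  9.950000000000e-01 [Passed]
"""

if __name__ == "__main__":
    for name, deviation in failures(parse_results(SAMPLE_RESULTS)):
        print(f"{name}: off by {deviation:.3f}")
```

Here the only failing check is the O2 energy, which is off by more than 3500 Eh. A deviation that large points to a badly wrong energy term rather than a tolerance issue.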

cbu@polaris-login-01:~/jdftx/build/test/openShell> cat results
                    Check name       Obtained value      Expected value Status
                 H energy [Eh]: -5.009176671362e-01 -5.009200000000e-01 [Passed]
       H magnetic moment [muB]:  9.970000000000e-01  9.970000000000e-01 [Passed]
                O2 energy [Eh]:  3.545032644716e+03 -3.202300000000e+01 [FAILED]
 O magnetic moment in O2 [muB]:  9.950000000000e-01  9.950000000000e-01 [Passed]
cbu@polaris-login-01:~/jdftx/build/test/openShell> cat O2.out

*************** JDFTx 1.7.0 (git hash 19235885) ***************

Start date and time: Wed May  8 21:05:17 2024
Executable /home/cbu/jdftx/build/jdftx with command-line: -i /home/cbu/jdftx/jdftx-git/jdftx/test/openShell/O2.in -d -o O2.out
Running on hosts (process indices):  x3205c0s37b0n0 (0)
Divided in process groups (process indices):  0 (0)
Resource initialization completed at t[s]:      0.00
Run totals: 1 processes, 32 threads, 0 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Cartesian
core-overlap-check vector
coulomb-interaction Isolated
coulomb-truncation-embed 0 0 0
davidson-band-ratio 1.1
dump End None
dump-name $INPUT.$VAR
elec-cutoff 20 100
elec-eigen-algo Davidson
elec-ex-corr gga-PBE
elec-initial-magnetization 2.000000 yes
electronic-minimize  \
        dirUpdateScheme      FletcherReeves \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-08 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
exchange-regularization None
fluid None
fluid-ex-corr lda-TF lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize  \
        dirUpdateScheme      PolakRibiere \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  0 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
fluid-solvent H2O 55.338 ScalarEOS \
        epsBulk 78.4 \
        pMol 0.92466 \
        epsInf 1.77 \
        Pvap 1.06736e-10 \
        sigmaBulk 4.62e-05 \
        Rvdw 2.61727 \
        Res 1.42 \
        tauNuc 343133 \
        poleEl 15 7 1
forces-output-coords Positions
ion O   0.000000000000000   0.000000000000000   1.140000000000000 1
ion O   0.000000000000000   0.000000000000000  -1.140000000000000 1
ion-species GBRV/$ID_pbe.uspp
ion-width 0
ionic-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0.0001 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
kpoint   0.000000000000   0.000000000000   0.000000000000  1.00000000000000
kpoint-folding 1 1 1
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Cubic 13
lattice-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
lcao-params -1 1e-06 0.001
pcm-variant GLSSA13
perturb-minimize  \
        nIterations            0 \
        algorithm              MINRES \
        residualTol            0.0001 \
        residualDiffThreshold  0.0001 \
        CGBypass               no \
        recomputeResidual      no
spintype z-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001

---------- Setting up symmetries ----------

Found 48 point-group symmetries of the bravais lattice
Found 16 space-group symmetries with basis
Applied RMS atom displacement 0 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[           13            0            0  ]
[            0           13            0  ]
[            0            0           13  ]
unit cell volume = 2197
G =
[   0.483322          0          0  ]
[          0   0.483322          0  ]
[          0          0   0.483322  ]
Minimum fftbox size, Smin = [  60  60  60  ]
Chosen fftbox size, S = [  60  60  60  ]

---------- Initializing tighter grid for wavefunction operations ----------
R =
[           13            0            0  ]
[            0           13            0  ]
[            0            0           13  ]
unit cell volume = 2197
G =
[   0.483322          0          0  ]
[          0   0.483322          0  ]
[          0          0   0.483322  ]
Minimum fftbox size, Smin = [  56  56  56  ]
Chosen fftbox size, S = [  56  56  56  ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/home/cbu/jdftx/build/pseudopotentials/GBRV/o_pbe.uspp':
  Title: O.  Created by USPP 7.3.6 on 3-2-2014
  Reference state energy: -15.894388.  6 valence electrons in orbitals:
    |200>  occupation: 2  eigenvalue: -0.878823
    |210>  occupation: 4  eigenvalue: -0.332131
  lMax: 2  lLocal: 2  QijEcut: 6
  5 projectors sampled on a log grid with 511 points:
    l: 0  eig: -0.878823  rCut: 1.25
    l: 0  eig: 0.000000  rCut: 1.25
    l: 1  eig: -0.332132  rCut: 1.25
    l: 1  eig: 0.000000  rCut: 1.25
    l: 2  eig: 1.000000  rCut: 1.25
  Partial core density with radius 0.7
  Transforming core density to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming local potential to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 432 points.
  Transforming density augmentations to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 432 points.
  Core radius for overlap checks: 1.25 bohrs.

Initialized 1 species with 2 total atoms.

Folded 1 k-points by 1x1x1 to 1 k-points.

---------- Setting up k-points, bands, fillings ----------
No reducable k-points.
Computing the number of bands and number of electrons
Calculating initial fillings.
Turning on subspace rotations due to non-scalar fillings.
nElectrons:  12.000000   nBands: 7   nStates: 2

----- Setting up reduced wavefunction bases (one per k-point) -----
average nbasis = 9435.000 , ideal nbasis = 9385.751

---------- Setting up coulomb interaction ----------
Setting up double-sized grid for truncated Coulomb potentials:
R =
[           26            0            0  ]
[            0           26            0  ]
[            0            0           26  ]
unit cell volume = 17576
G =
[   0.241661          0          0  ]
[          0   0.241661          0  ]
[          0          0   0.241661  ]
Chosen fftbox size, S = [  120  120  120  ]
Integer grid location selected as the embedding center:
   Grid: [  0  0  0  ]
   Lattice: [  0  0  0  ]
   Cartesian: [  0  0  0  ]
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Range-separation parameter for embedded mesh potentials due to point charges: 0.587227 bohrs.
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Gaussian width for range separation: 1.3698 bohrs.
FFT grid for long-range part: [ 120 120 120 ].
Planning fourier transform ... Done.
Computing truncated long-range part in real space ... Done.
Adding short-range part in reciprocal space ... Done.

---------- Allocating electronic variables ----------
Initializing wave functions:  linear combination of atomic orbitals
O pseudo-atom occupations:   s ( 2 )  p ( 4 )
        FillingsUpdate:  mu: -0.115710922  nElectrons: 12.000000  magneticMoment: [ Abs: 2.00193  Tot: +2.00000 ]
LCAOMinimize: Iter:   0  Etot: +3545.1917519864246060  |grad|_K:  2.808e-03  alpha:  1.000e+00
        FillingsUpdate:  mu: -0.119250778  nElectrons: 12.000000  magneticMoment: [ Abs: 2.00197  Tot: +2.00000 ]
LCAOMinimize: Iter:   1  Etot: +3545.1912510776755880  |grad|_K:  3.606e-05  alpha:  7.908e-01  linmin: -3.052e-01  cgtest:  6.035e-01  t[s]:      7.80
LCAOMinimize: Encountered beta<0, resetting CG.
        FillingsUpdate:  mu: -0.119287572  nElectrons: 12.000000  magneticMoment: [ Abs: 2.00198  Tot: +2.00000 ]
LCAOMinimize: Iter:   2  Etot: +3545.1912509748740376  |grad|_K:  5.342e-06  alpha:  9.759e-01  linmin: -1.474e-03  cgtest:  2.014e-03  t[s]:      8.69
        FillingsUpdate:  mu: -0.119281326  nElectrons: 12.000000  magneticMoment: [ Abs: 2.00199  Tot: +2.00000 ]
LCAOMinimize: Iter:   3  Etot: +3545.1912509728172154  |grad|_K:  1.323e-08  alpha:  8.894e-01  linmin:  1.393e-02  cgtest: -1.512e-01  t[s]:      9.83
LCAOMinimize: Converged (|Delta Etot|<1.000000e-06 for 2 iters).

---- Citations for features of the code used in this run ----

   Software package:
      R. Sundararaman, K. Letchworth-Weaver, K.A. Schwarz, D. Gunceler, Y. Ozhabes and T.A. Arias, 'JDFTx: software for joint density-functional theory', SoftwareX 6, 278 (2017)

   gga-PBE exchange-correlation functional:
      J.P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)

   Pseudopotentials:
      KF Garrity, JW Bennett, KM Rabe and D Vanderbilt, Comput. Mater. Sci. 81, 446 (2014)

   Truncated Coulomb potentials:
      R. Sundararaman and T.A. Arias, Phys. Rev. B 87, 165122 (2013)

   Total energy minimization:
      T.A. Arias, M.C. Payne and J.D. Joannopoulos, Phys. Rev. Lett. 69, 1077 (1992)

This list may not be complete. Please suggest additional citations or
report any other bugs at https://github.com/shankar1729/jdftx/issues

Initialization completed successfully at t[s]:     10.38

-------- Electronic minimization -----------
ElecMinimize: Iter:   0  Etot: +3545.191250972817215  |grad|_K:  1.391e-03  alpha:  1.000e+00
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   1  Etot: +3545.061263357338703  |grad|_K:  5.355e-04  alpha:  5.240e-01  linmin:  3.805e-02  t[s]:     14.84
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   2  Etot: +3545.040483737285740  |grad|_K:  2.683e-04  alpha:  5.597e-01  linmin: -5.542e-04  t[s]:     17.60
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   3  Etot: +3545.034393412536247  |grad|_K:  1.276e-04  alpha:  6.399e-01  linmin: -1.828e-03  t[s]:     20.30
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   4  Etot: +3545.033030818591214  |grad|_K:  5.704e-05  alpha:  6.326e-01  linmin:  3.149e-04  t[s]:     23.32
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   5  Etot: +3545.032730452193391  |grad|_K:  2.738e-05  alpha:  6.990e-01  linmin: -1.327e-04  t[s]:     26.29
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   6  Etot: +3545.032667175064489  |grad|_K:  1.436e-05  alpha:  6.391e-01  linmin: -7.682e-05  t[s]:     29.40
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   7  Etot: +3545.032650164574534  |grad|_K:  7.000e-06  alpha:  6.244e-01  linmin:  6.781e-05  t[s]:     32.41
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   8  Etot: +3545.032645948115260  |grad|_K:  3.256e-06  alpha:  6.515e-01  linmin: -7.455e-05  t[s]:     35.68
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:   9  Etot: +3545.032645001147557  |grad|_K:  1.608e-06  alpha:  6.761e-01  linmin: -3.050e-06  t[s]:     39.18
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:  10  Etot: +3545.032644776565121  |grad|_K:  7.671e-07  alpha:  6.574e-01  linmin: -7.679e-06  t[s]:     42.74
        SubspaceRotationAdjust: set factor to 1
ElecMinimize: Iter:  11  Etot: +3545.032644728481955  |grad|_K:  3.548e-07  alpha:  6.185e-01  linmin: -2.177e-06  t[s]:     46.56
        SubspaceRotationAdjust: set factor to 1.01
ElecMinimize: Iter:  12  Etot: +3545.032644718520260  |grad|_K:  1.720e-07  alpha:  5.989e-01  linmin: -5.079e-06  t[s]:     50.20
        SubspaceRotationAdjust: set factor to 1.03
ElecMinimize: Iter:  13  Etot: +3545.032644716182403  |grad|_K:  7.417e-08  alpha:  5.983e-01  linmin: -3.927e-05  t[s]:     53.63
ElecMinimize: Converged (|Delta Etot|<1.000000e-08 for 2 iters).
Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion O   0.000000000000000   0.000000000000000   1.140000000000000 1
ion O   0.000000000000000   0.000000000000000  -1.140000000000000 1

# Forces in Cartesian coordinates:
force O   0.000000000000000   0.000000000000000 1568.910336489458814 1
force O   0.000000000000000   0.000000000000000 -1568.910336489458814 1

# Energy components:
   Eewald =     3592.8451106841325782
       EH =       42.6532281943148064
     Eloc =     -102.5734810423188605
      Enl =        4.6123702922981202
      Exc =       -7.1157115621026517
 Exc_core =        0.1300954080658279
       KE =       14.4810327417921343
-------------------------------------
     Etot =     3545.0326447161824035

IonicMinimize: Iter:   0  Etot: +3545.032644716182403  |grad|_K:  9.058e+02  t[s]:     54.86
IonicMinimize: None of the convergence criteria satisfied after 0 iterations.

#--- Lowdin population analysis ---
# oxidation-state O +0.055 +0.055
# magnetic-moments O +0.995 +0.995

End date and time: Wed May  8 21:06:13 2024  (Duration: 0-0:00:55.55)
Done!
cbu@polaris-login-01:~/jdftx/build/test/openShell>
ColinBundschu commented 4 months ago

Here is the H2 geometry optimization on the CPU (no MPI). It shows the same problem: the H2 nuclei do not repel each other correctly. They may still "feel" each other, since they want to stack (although that might just be a side effect of the electronic minimization pulling them closer). But they appear to suffer no Coulombic penalty for being this close.
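One quick way to sanity-check the nuclear repulsion is to compare the logged `Eewald` against the bare point-charge value Z1*Z2/r. The sketch below assumes that, for `coulomb-interaction Isolated` with `ion-width 0`, the Ewald term should reduce to this pairwise sum in Hartree atomic units (positions in bohr); that is an assumption about what the code reports, not something confirmed in this thread. The helper name `point_charge_repulsion` is made up for illustration; the positions, valence charges, and `Eewald` values are copied from the two logs.

```python
import math


def point_charge_repulsion(z1, z2, r1, r2):
    """Coulomb energy Z1*Z2/|r1 - r2| in Hartree, with positions in bohr."""
    return z1 * z2 / math.dist(r1, r2)


# (valence charge, position 1, position 2, Eewald copied from the log)
cases = {
    "H2, ionic iter 0": (1, (0.0, 6.1, 0.7), (0.0, 5.8, -0.6), 4.9052719092942318),
    "O2 (openShell test)": (6, (0.0, 0.0, 1.14), (0.0, 0.0, -1.14), 3592.8451106841325782),
}

for name, (z, r1, r2, logged) in cases.items():
    # Assumption: the isolated-geometry Ewald energy equals the bare pair sum.
    expected = point_charge_repulsion(z, z, r1, r2)
    print(f"{name}: Z^2/r = {expected:.4f} Eh, logged Eewald = {logged:.4f} Eh")
```

For these geometries the bare pair energies come out to roughly 0.75 Eh for H2 and 15.8 Eh for O2, which can be set against the logged `Eewald` values of 4.905 and 3592.845.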

cbu@polaris-login-01:~/jdftx/build/test/vibrations> cat H2_geometry.out

*************** JDFTx 1.7.0 (git hash 19235885) ***************

Start date and time: Wed May  8 21:06:14 2024
Executable /home/cbu/jdftx/build/jdftx with command-line: -i /home/cbu/jdftx/jdftx-git/jdftx/test/vibrations/H2_geometry.in -d -o H2_geometry.out
Running on hosts (process indices):  x3205c0s37b0n0 (0)
Divided in process groups (process indices):  0 (0)
Resource initialization completed at t[s]:      0.00
Run totals: 1 processes, 32 threads, 0 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Cartesian
core-overlap-check none
coulomb-interaction Isolated
coulomb-truncation-embed 0 5.95 0.05
davidson-band-ratio 1.1
dump End None IonicPositions
dump
dump-name H2.$VAR
elec-cutoff 20 100
elec-eigen-algo Davidson
elec-ex-corr gga-PBE
electronic-minimize  \
        dirUpdateScheme      FletcherReeves \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-08 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
electronic-scf  \
        nIterations     50 \
        energyDiffThreshold     1e-08 \
        residualThreshold       1e-07 \
        mixFraction     0.5 \
        qMetric 0.8 \
        history 10 \
        nEigSteps       2 \
        eigDiffThreshold        1e-08 \
        mixedVariable   Density \
        qKerker 0.8 \
        qKappa  -1 \
        verbose no \
        mixFractionMag  1.5
exchange-regularization None
fluid None
fluid-ex-corr lda-TF lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize  \
        dirUpdateScheme      PolakRibiere \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  0 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
fluid-solvent H2O 55.338 ScalarEOS \
        epsBulk 78.4 \
        pMol 0.92466 \
        epsInf 1.77 \
        Pvap 1.06736e-10 \
        sigmaBulk 4.62e-05 \
        Rvdw 2.61727 \
        Res 1.42 \
        tauNuc 343133 \
        poleEl 15 7 1
forces-output-coords Positions
ion H   0.000000000000000   6.100000000000000   0.700000000000000 1
ion H   0.000000000000000   5.799999999999999  -0.600000000000000 1
ion-species GBRV/$ID_pbe.uspp
ion-width 0
ionic-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          10 \
        history              15 \
        knormThreshold       0.0001 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
kpoint   0.000000000000   0.000000000000   0.000000000000  1.00000000000000
kpoint-folding 1 1 1
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Cubic 12
lattice-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
lcao-params -1 1e-06 0.001
pcm-variant GLSSA13
perturb-minimize  \
        nIterations            0 \
        algorithm              MINRES \
        residualTol            0.0001 \
        residualDiffThreshold  0.0001 \
        CGBypass               no \
        recomputeResidual      no
spintype no-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001

---------- Setting up symmetries ----------

Found 48 point-group symmetries of the bravais lattice
Found 4 space-group symmetries with basis
Applied RMS atom displacement 0 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  56  56  56  ]
Chosen fftbox size, S = [  56  56  56  ]

---------- Initializing tighter grid for wavefunction operations ----------
R =
[           12            0            0  ]
[            0           12            0  ]
[            0            0           12  ]
unit cell volume = 1728
G =
[   0.523599          0          0  ]
[          0   0.523599          0  ]
[          0          0   0.523599  ]
Minimum fftbox size, Smin = [  52  52  52  ]
Chosen fftbox size, S = [  54  54  54  ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/home/cbu/jdftx/build/pseudopotentials/GBRV/h_pbe.uspp':
  Title: H.  Created by USPP 7.3.6 on 2-4-15
  Reference state energy: -0.458849.  1 valence electrons in orbitals:
    |100>  occupation: 1  eigenvalue: -0.238595
  lMax: 0  lLocal: 1  QijEcut: 6
  2 projectors sampled on a log grid with 395 points:
    l: 0  eig: -0.238595  rCut: 1.2
    l: 0  eig: 1.000000  rCut: 1.2
  Transforming local potential to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 432 points.
  Transforming density augmentations to a uniform radial grid of dG=0.02 with 1275 points.
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 432 points.
  Core radius for overlap checks: 1.20 bohrs.

Initialized 1 species with 2 total atoms.

Folded 1 k-points by 1x1x1 to 1 k-points.

---------- Setting up k-points, bands, fillings ----------
No reducable k-points.
Computing the number of bands and number of electrons
Calculating initial fillings.
nElectrons:   2.000000   nBands: 1   nStates: 1

----- Setting up reduced wavefunction bases (one per k-point) -----
average nbasis = 7249.000 , ideal nbasis = 7382.148

---------- Setting up coulomb interaction ----------
Setting up double-sized grid for truncated Coulomb potentials:
R =
[           24            0            0  ]
[            0           24            0  ]
[            0            0           24  ]
unit cell volume = 13824
G =
[   0.261799          0          0  ]
[          0   0.261799          0  ]
[          0          0   0.261799  ]
Chosen fftbox size, S = [  112  112  112  ]
Integer grid location selected as the embedding center:
   Grid: [  0  28  0  ]
   Lattice: [  0  0.495833  0.00416667  ]
   Cartesian: [  0  5.95  0.05  ]
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Range-separation parameter for embedded mesh potentials due to point charges: 0.583992 bohrs.
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Gaussian width for range separation: 1.26443 bohrs.
FFT grid for long-range part: [ 112 112 112 ].
Planning fourier transform ... Done.
Computing truncated long-range part in real space ... Done.
Adding short-range part in reciprocal space ... Done.

---------- Allocating electronic variables ----------
Initializing wave functions:  linear combination of atomic orbitals
H pseudo-atom occupations:   s ( 1 )
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   0  Etot: +3.0676389757835745  |grad|_K:  8.335e-05  alpha:  1.000e+00
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   1  Etot: +3.0676389443363399  |grad|_K:  2.277e-07  alpha:  1.006e+00  linmin:  4.591e-02  cgtest: -9.881e-01  t[s]:      4.11
        FillingsUpdate:  mu: -0.000000000  nElectrons: 2.000000
LCAOMinimize: Iter:   2  Etot: +3.0676389443363359  |grad|_K:  2.265e-07  alpha:  9.242e-01  linmin:  2.414e-04  cgtest:  9.925e-01  t[s]:      4.58
LCAOMinimize: Converged (|Delta Etot|<1.000000e-06 for 2 iters).

---- Citations for features of the code used in this run ----

   Software package:
      R. Sundararaman, K. Letchworth-Weaver, K.A. Schwarz, D. Gunceler, Y. Ozhabes and T.A. Arias, 'JDFTx: software for joint density-functional theory', SoftwareX 6, 278 (2017)

   gga-PBE exchange-correlation functional:
      J.P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)

   Pseudopotentials:
      KF Garrity, JW Bennett, KM Rabe and D Vanderbilt, Comput. Mater. Sci. 81, 446 (2014)

   Truncated Coulomb potentials:
      R. Sundararaman and T.A. Arias, Phys. Rev. B 87, 165122 (2013)

This list may not be complete. Please suggest additional citations or
report any other bugs at https://github.com/shankar1729/jdftx/issues

Initialization completed successfully at t[s]:      4.60

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: +3.001022364021621   dEtot: -6.662e-02   |Residual|: 1.684e-01   |deigs|: 5.044e-02  t[s]:      5.35
SCF: Cycle:  1   Etot: +2.995524337834684   dEtot: -5.498e-03   |Residual|: 8.714e-02   |deigs|: 5.164e-02  t[s]:      6.00
SCF: Cycle:  2   Etot: +2.992690883731126   dEtot: -2.833e-03   |Residual|: 1.701e-02   |deigs|: 6.453e-02  t[s]:      6.68
SCF: Cycle:  3   Etot: +2.992732618678881   dEtot: +4.173e-05   |Residual|: 7.439e-03   |deigs|: 2.128e-03  t[s]:      7.30
SCF: Cycle:  4   Etot: +2.992673040621607   dEtot: -5.958e-05   |Residual|: 2.900e-03   |deigs|: 1.232e-02  t[s]:      7.98
SCF: Cycle:  5   Etot: +2.992673805893074   dEtot: +7.653e-07   |Residual|: 2.463e-03   |deigs|: 2.555e-03  t[s]:      8.68
SCF: Cycle:  6   Etot: +2.992669787423745   dEtot: -4.018e-06   |Residual|: 5.919e-03   |deigs|: 5.033e-03  t[s]:      9.36
SCF: Cycle:  7   Etot: +2.992662632808877   dEtot: -7.155e-06   |Residual|: 5.619e-04   |deigs|: 3.031e-03  t[s]:     10.03
SCF: Cycle:  8   Etot: +2.992661612069269   dEtot: -1.021e-06   |Residual|: 2.583e-04   |deigs|: 9.419e-05  t[s]:     10.57
SCF: Cycle:  9   Etot: +2.992661527326435   dEtot: -8.474e-08   |Residual|: 6.159e-05   |deigs|: 3.186e-04  t[s]:     11.12
SCF: Cycle: 10   Etot: +2.992661533087936   dEtot: +5.762e-09   |Residual|: 2.467e-04   |deigs|: 3.075e-04  t[s]:     11.78
SCF: Cycle: 11   Etot: +2.992661502215527   dEtot: -3.087e-08   |Residual|: 1.269e-05   |deigs|: 2.406e-04  t[s]:     12.46
SCF: Cycle: 12   Etot: +2.992661501711754   dEtot: -5.038e-10   |Residual|: 2.025e-05   |deigs|: 2.497e-05  t[s]:     13.04
SCF: Cycle: 13   Etot: +2.992661501667123   dEtot: -4.463e-11   |Residual|: 1.542e-05   |deigs|: 8.760e-06  t[s]:     13.58
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.100000000000000   0.700000000000000 1
ion H   0.000000000000000   5.799999999999999  -0.600000000000000 1

# Forces in Cartesian coordinates:
force H   0.000000000000000  16.235121214909555  -0.502078530328887 1
force H   0.000000000000000 -16.235121214909555   0.502078530328887 1

# Energy components:
   Eewald =        4.9052719092942318
       EH =        1.3330816824272471
     Eloc =       -3.4393520294085032
      Enl =       -0.0591448788735895
      Exc =       -0.6998246819972688
       KE =        0.9526295002250056
-------------------------------------
     Etot =        2.9926615016671230

IonicMinimize: Iter:   0  Etot: +2.992661501667123  |grad|_K:  9.378e+00  t[s]:     13.74

#--- Lowdin population analysis ---
# oxidation-state H +0.044 +0.044

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: +1.057960672927052   dEtot: -2.987e-03   |Residual|: 1.994e-02   |deigs|: 1.533e-03  t[s]:     14.54
SCF: Cycle:  1   Etot: +1.057917887636574   dEtot: -4.279e-05   |Residual|: 1.043e-02   |deigs|: 3.378e-03  t[s]:     15.18
SCF: Cycle:  2   Etot: +1.057908810728909   dEtot: -9.077e-06   |Residual|: 1.457e-03   |deigs|: 3.288e-03  t[s]:     15.83
SCF: Cycle:  3   Etot: +1.057908490715371   dEtot: -3.200e-07   |Residual|: 4.847e-04   |deigs|: 9.759e-05  t[s]:     16.33
SCF: Cycle:  4   Etot: +1.057908195206598   dEtot: -2.955e-07   |Residual|: 1.897e-04   |deigs|: 8.395e-05  t[s]:     17.01
SCF: Cycle:  5   Etot: +1.057907906705714   dEtot: -2.885e-07   |Residual|: 6.872e-05   |deigs|: 2.326e-04  t[s]:     17.66
SCF: Cycle:  6   Etot: +1.057907841886946   dEtot: -6.482e-08   |Residual|: 4.184e-05   |deigs|: 5.429e-05  t[s]:     18.18
SCF: Cycle:  7   Etot: +1.057907821007718   dEtot: -2.088e-08   |Residual|: 1.894e-05   |deigs|: 2.169e-05  t[s]:     18.71
SCF: Cycle:  8   Etot: +1.057907812013788   dEtot: -8.994e-09   |Residual|: 7.219e-06   |deigs|: 1.680e-05  t[s]:     19.36
SCF: Cycle:  9   Etot: +1.057907811133393   dEtot: -8.804e-10   |Residual|: 3.257e-06   |deigs|: 3.593e-08  t[s]:     19.89
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.199952215080698   0.696908932148609 1
ion H   0.000000000000000   5.700047784919301  -0.596908932148610 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   5.708908697970661  -0.465013947610292 1
force H   0.000000000000000  -5.708908697970661   0.465013947610292 1

# Energy components:
   Eewald =        2.9437201483758253
       EH =        1.3138759142886942
     Eloc =       -3.3842664318667470
      Enl =       -0.0583875355543376
      Exc =       -0.6908537875028491
       KE =        0.9338195033928072
-------------------------------------
     Etot =        1.0579078111333928

IonicMinimize: Iter:   1  Etot: +1.057907811133393  |grad|_K:  3.307e+00  alpha:  6.157e-03  linmin: -9.987e-01  t[s]:     20.05

#--- Lowdin population analysis ---
# oxidation-state H +0.042 +0.042

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: +0.548187275097707   dEtot: -9.427e-04   |Residual|: 1.149e-02   |deigs|: 4.881e-04  t[s]:     20.80
SCF: Cycle:  1   Etot: +0.548172432191989   dEtot: -1.484e-05   |Residual|: 5.974e-03   |deigs|: 2.012e-03  t[s]:     21.43
SCF: Cycle:  2   Etot: +0.548169009941225   dEtot: -3.422e-06   |Residual|: 8.387e-04   |deigs|: 2.001e-03  t[s]:     22.08
SCF: Cycle:  3   Etot: +0.548168939243130   dEtot: -7.070e-08   |Residual|: 2.763e-04   |deigs|: 6.559e-05  t[s]:     22.59
SCF: Cycle:  4   Etot: +0.548168837573050   dEtot: -1.017e-07   |Residual|: 1.115e-04   |deigs|: 6.030e-05  t[s]:     23.28
SCF: Cycle:  5   Etot: +0.548168723774968   dEtot: -1.138e-07   |Residual|: 4.255e-05   |deigs|: 1.356e-04  t[s]:     23.98
SCF: Cycle:  6   Etot: +0.548168689877788   dEtot: -3.390e-08   |Residual|: 2.673e-05   |deigs|: 3.946e-05  t[s]:     24.49
SCF: Cycle:  7   Etot: +0.548168677693155   dEtot: -1.218e-08   |Residual|: 1.156e-05   |deigs|: 9.514e-06  t[s]:     25.02
SCF: Cycle:  8   Etot: +0.548168673698600   dEtot: -3.995e-09   |Residual|: 4.183e-06   |deigs|: 7.137e-06  t[s]:     25.72
SCF: Cycle:  9   Etot: +0.548168673444504   dEtot: -2.541e-10   |Residual|: 1.873e-06   |deigs|: 1.541e-06  t[s]:     26.28
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.254277211404604   0.691000233976190 1
ion H   0.000000000000000   5.645722788595394  -0.591000233976190 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   3.764111657820549  -0.441334055635372 1
force H   0.000000000000000  -3.764111657820549   0.441334055635372 1

# Energy components:
   Eewald =        2.4181534264328088
       EH =        1.3024948943928354
     Eloc =       -3.3518258716629559
      Enl =       -0.0578331823405330
      Exc =       -0.6855581019624856
       KE =        0.9227375085848341
-------------------------------------
     Etot =        0.5481686734445037

IonicMinimize: Iter:   2  Etot: +0.548168673444504  |grad|_K:  2.188e+00  alpha:  1.000e+00  linmin: -1.000e+00  t[s]:     26.43

#--- Lowdin population analysis ---
# oxidation-state H +0.040 +0.040

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.011639147990254   dEtot: -3.221e-03   |Residual|: 2.094e-02   |deigs|: 1.665e-03  t[s]:     27.20
SCF: Cycle:  1   Etot: -0.011689036275879   dEtot: -4.989e-05   |Residual|: 1.095e-02   |deigs|: 3.589e-03  t[s]:     27.79
SCF: Cycle:  2   Etot: -0.011699688276583   dEtot: -1.065e-05   |Residual|: 1.524e-03   |deigs|: 3.532e-03  t[s]:     28.44
SCF: Cycle:  3   Etot: -0.011700053026990   dEtot: -3.648e-07   |Residual|: 5.299e-04   |deigs|: 1.301e-04  t[s]:     28.94
SCF: Cycle:  4   Etot: -0.011700470951148   dEtot: -4.179e-07   |Residual|: 2.118e-04   |deigs|: 1.056e-04  t[s]:     29.61
SCF: Cycle:  5   Etot: -0.011700933534141   dEtot: -4.626e-07   |Residual|: 8.477e-05   |deigs|: 2.554e-04  t[s]:     30.31
SCF: Cycle:  6   Etot: -0.011701058212031   dEtot: -1.247e-07   |Residual|: 5.960e-05   |deigs|: 9.502e-05  t[s]:     30.85
SCF: Cycle:  7   Etot: -0.011701090457999   dEtot: -3.225e-08   |Residual|: 2.288e-05   |deigs|: 1.619e-06  t[s]:     31.39
SCF: Cycle:  8   Etot: -0.011701101487275   dEtot: -1.103e-08   |Residual|: 1.084e-05   |deigs|: 8.957e-06  t[s]:     32.06
SCF: Cycle:  9   Etot: -0.011701102125165   dEtot: -6.379e-10   |Residual|: 3.972e-06   |deigs|: 4.720e-06  t[s]:     32.59
SCF: Cycle: 10   Etot: -0.011701102492455   dEtot: -3.673e-10   |Residual|: 2.863e-06   |deigs|: 1.718e-06  t[s]:     33.32
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.352080375181200   0.670154583223852 1
ion H   0.000000000000000   5.547919624818800  -0.570154583223852 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   2.019633488601614  -0.394805962872354 1
force H   0.000000000000000  -2.019633488601614   0.394805962872354 1

# Energy components:
   Eewald =        1.8299538334719376
       EH =        1.2820650335898736
     Eloc =       -3.2939703556221800
      Enl =       -0.0566654214571784
      Exc =       -0.6760922701903728
       KE =        0.9030080777154647
-------------------------------------
     Etot =       -0.0117011024924553

IonicMinimize: Iter:   3  Etot: -0.011701102492455  |grad|_K:  1.188e+00  alpha:  9.208e-01  linmin: -9.999e-01  t[s]:     33.47

#--- Lowdin population analysis ---
# oxidation-state H +0.038 +0.038

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.331610470056635   dEtot: -3.235e-03   |Residual|: 2.024e-02   |deigs|: 1.655e-03  t[s]:     34.22
SCF: Cycle:  1   Etot: -0.331652068138319   dEtot: -4.160e-05   |Residual|: 1.075e-02   |deigs|: 3.211e-03  t[s]:     34.82
SCF: Cycle:  2   Etot: -0.331660806088798   dEtot: -8.738e-06   |Residual|: 1.445e-03   |deigs|: 3.250e-03  t[s]:     35.46
SCF: Cycle:  3   Etot: -0.331661149867123   dEtot: -3.438e-07   |Residual|: 5.339e-04   |deigs|: 8.252e-05  t[s]:     35.95
SCF: Cycle:  4   Etot: -0.331661613856490   dEtot: -4.640e-07   |Residual|: 2.184e-04   |deigs|: 9.511e-05  t[s]:     36.63
SCF: Cycle:  5   Etot: -0.331662194318907   dEtot: -5.805e-07   |Residual|: 9.081e-05   |deigs|: 2.484e-04  t[s]:     37.33
SCF: Cycle:  6   Etot: -0.331662363512865   dEtot: -1.692e-07   |Residual|: 5.640e-05   |deigs|: 9.580e-05  t[s]:     37.86
SCF: Cycle:  7   Etot: -0.331662393849458   dEtot: -3.034e-08   |Residual|: 2.371e-05   |deigs|: 7.690e-06  t[s]:     38.41
SCF: Cycle:  8   Etot: -0.331662404606376   dEtot: -1.076e-08   |Residual|: 8.222e-06   |deigs|: 9.657e-06  t[s]:     39.11
SCF: Cycle:  9   Etot: -0.331662405595736   dEtot: -9.894e-10   |Residual|: 3.974e-06   |deigs|: 1.983e-07  t[s]:     39.61
SCF: Cycle: 10   Etot: -0.331662405853678   dEtot: -2.579e-10   |Residual|: 1.616e-06   |deigs|: 2.506e-06  t[s]:     40.30
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.445607393654418   0.634761125982290 1
ion H   0.000000000000000   5.454392606345580  -0.534761125982290 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   1.204065668996853  -0.346547177751972 1
force H   0.000000000000000  -1.204065668996853   0.346547177751972 1

# Energy components:
   Eewald =        1.4846191552225683
       EH =        1.2636956528320351
     Eloc =       -3.2423882037480567
      Enl =       -0.0554533535441530
      Exc =       -0.6676283646159155
       KE =        0.8854927079998440
-------------------------------------
     Etot =       -0.3316624058536777

IonicMinimize: Iter:   4  Etot: -0.331662405853678  |grad|_K:  7.234e-01  alpha:  8.010e-01  linmin: -9.967e-01  t[s]:     40.46

#--- Lowdin population analysis ---
# oxidation-state H +0.036 +0.036

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.530608990061598   dEtot: -3.174e-03   |Residual|: 1.894e-02   |deigs|: 1.596e-03  t[s]:     41.13
SCF: Cycle:  1   Etot: -0.530637285769816   dEtot: -2.830e-05   |Residual|: 1.028e-02   |deigs|: 2.546e-03  t[s]:     41.70
SCF: Cycle:  2   Etot: -0.530642808294719   dEtot: -5.523e-06   |Residual|: 1.289e-03   |deigs|: 2.678e-03  t[s]:     42.28
SCF: Cycle:  3   Etot: -0.530643106183772   dEtot: -2.979e-07   |Residual|: 5.277e-04   |deigs|: 1.240e-05  t[s]:     42.78
SCF: Cycle:  4   Etot: -0.530643600405834   dEtot: -4.942e-07   |Residual|: 2.186e-04   |deigs|: 6.687e-05  t[s]:     43.49
SCF: Cycle:  5   Etot: -0.530644247931101   dEtot: -6.475e-07   |Residual|: 9.332e-05   |deigs|: 2.251e-04  t[s]:     44.13
SCF: Cycle:  6   Etot: -0.530644446615584   dEtot: -1.987e-07   |Residual|: 3.731e-05   |deigs|: 6.257e-05  t[s]:     44.80
SCF: Cycle:  7   Etot: -0.530644460176434   dEtot: -1.356e-08   |Residual|: 2.078e-05   |deigs|: 9.895e-06  t[s]:     45.31
SCF: Cycle:  8   Etot: -0.530644473881554   dEtot: -1.371e-08   |Residual|: 7.420e-06   |deigs|: 3.827e-06  t[s]:     45.98
SCF: Cycle:  9   Etot: -0.530644474543292   dEtot: -6.617e-10   |Residual|: 5.411e-06   |deigs|: 2.422e-08  t[s]:     46.47
SCF: Cycle: 10   Etot: -0.530644474625181   dEtot: -8.189e-11   |Residual|: 1.042e-06   |deigs|: 4.555e-07  t[s]:     47.00
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.531457516672810   0.583479975666340 1
ion H   0.000000000000000   5.368542483327190  -0.483479975666340 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.763007229582742  -0.298313954914643 1
force H   0.000000000000000  -0.763007229582742   0.298313954914643 1

# Energy components:
   Eewald =        1.2654200801944155
       EH =        1.2490099486727879
     Eloc =       -3.2014604334311789
      Enl =       -0.0543935190205748
      Exc =       -0.6608962035631615
       KE =        0.8716756525225301
-------------------------------------
     Etot =       -0.5306444746251814

IonicMinimize: Iter:   5  Etot: -0.530644474625181  |grad|_K:  4.730e-01  alpha:  5.805e-01  linmin: -9.863e-01  t[s]:     47.15

#--- Lowdin population analysis ---
# oxidation-state H +0.034 +0.034

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.660918558513302   dEtot: -3.126e-03   |Residual|: 1.795e-02   |deigs|: 1.551e-03  t[s]:     47.90
SCF: Cycle:  1   Etot: -0.660937877895230   dEtot: -1.932e-05   |Residual|: 9.906e-03   |deigs|: 1.947e-03  t[s]:     48.54
SCF: Cycle:  2   Etot: -0.660941062625289   dEtot: -3.185e-06   |Residual|: 1.143e-03   |deigs|: 2.097e-03  t[s]:     49.22
SCF: Cycle:  3   Etot: -0.660941312004484   dEtot: -2.494e-07   |Residual|: 5.257e-04   |deigs|: 2.258e-05  t[s]:     49.72
SCF: Cycle:  4   Etot: -0.660941811639573   dEtot: -4.996e-07   |Residual|: 2.141e-04   |deigs|: 3.811e-05  t[s]:     50.41
SCF: Cycle:  5   Etot: -0.660942441232812   dEtot: -6.296e-07   |Residual|: 9.404e-05   |deigs|: 1.950e-04  t[s]:     51.11
SCF: Cycle:  6   Etot: -0.660942604964606   dEtot: -1.637e-07   |Residual|: 3.460e-05   |deigs|: 2.314e-05  t[s]:     51.67
SCF: Cycle:  7   Etot: -0.660942642228513   dEtot: -3.726e-08   |Residual|: 2.211e-05   |deigs|: 2.308e-05  t[s]:     52.23
SCF: Cycle:  8   Etot: -0.660942652712786   dEtot: -1.048e-08   |Residual|: 9.808e-06   |deigs|: 1.067e-05  t[s]:     52.78
SCF: Cycle:  9   Etot: -0.660942655015017   dEtot: -2.302e-09   |Residual|: 3.858e-06   |deigs|: 3.468e-06  t[s]:     53.33
SCF: Cycle: 10   Etot: -0.660942655200511   dEtot: -1.855e-10   |Residual|: 1.251e-06   |deigs|: 2.084e-06  t[s]:     53.88
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.607179560085548   0.518164034722345 1
ion H   0.000000000000000   5.292820439914451  -0.418164034722345 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.500607694713009  -0.250215023753498 1
force H   0.000000000000000  -0.500607694713009   0.250215023753498 1

# Energy components:
   Eewald =        1.1196146316804008
       EH =        1.2377124627129756
     Eloc =       -3.1701724986586353
      Enl =       -0.0535330272944162
      Exc =       -0.6557395322785071
       KE =        0.8611753086376718
-------------------------------------
     Etot =       -0.6609426552005105

IonicMinimize: Iter:   6  Etot: -0.660942655200511  |grad|_K:  3.231e-01  alpha:  4.523e-01  linmin: -9.693e-01  t[s]:     54.04

#--- Lowdin population analysis ---
# oxidation-state H +0.032 +0.032

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.749018819377426   dEtot: -3.122e-03   |Residual|: 1.756e-02   |deigs|: 1.540e-03  t[s]:     54.79
SCF: Cycle:  1   Etot: -0.749034830866269   dEtot: -1.601e-05   |Residual|: 9.764e-03   |deigs|: 1.629e-03  t[s]:     55.45
SCF: Cycle:  2   Etot: -0.749037100521864   dEtot: -2.270e-06   |Residual|: 1.066e-03   |deigs|: 1.751e-03  t[s]:     56.09
SCF: Cycle:  3   Etot: -0.749037306118473   dEtot: -2.056e-07   |Residual|: 5.263e-04   |deigs|: 2.360e-05  t[s]:     56.60
SCF: Cycle:  4   Etot: -0.749037767766463   dEtot: -4.616e-07   |Residual|: 2.067e-04   |deigs|: 2.533e-05  t[s]:     57.25
SCF: Cycle:  5   Etot: -0.749038319562094   dEtot: -5.518e-07   |Residual|: 9.136e-05   |deigs|: 1.715e-04  t[s]:     57.98
SCF: Cycle:  6   Etot: -0.749038451082488   dEtot: -1.315e-07   |Residual|: 3.497e-05   |deigs|: 1.194e-05  t[s]:     58.49
SCF: Cycle:  7   Etot: -0.749038498654403   dEtot: -4.757e-08   |Residual|: 2.171e-05   |deigs|: 2.477e-05  t[s]:     59.02
SCF: Cycle:  8   Etot: -0.749038510628632   dEtot: -1.197e-08   |Residual|: 8.373e-06   |deigs|: 9.113e-06  t[s]:     59.54
SCF: Cycle:  9   Etot: -0.749038511886501   dEtot: -1.258e-09   |Residual|: 4.134e-06   |deigs|: 2.041e-06  t[s]:     60.08
SCF: Cycle: 10   Etot: -0.749038511989444   dEtot: -1.029e-10   |Residual|: 1.108e-06   |deigs|: 3.069e-06  t[s]:     60.66
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.672356153683362   0.442322017034355 1
ion H   0.000000000000000   5.227643846316637  -0.342322017034355 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.333450082166321  -0.201832249917208 1
force H   0.000000000000000  -0.333450082166321   0.201832249917208 1

# Energy components:
   Eewald =        1.0185940967557534
       EH =        1.2282708946122756
     Eloc =       -3.1441633486007210
      Enl =       -0.0527889291591152
      Exc =       -0.6514455829684884
       KE =        0.8524943573708518
-------------------------------------
     Etot =       -0.7490385119894437

IonicMinimize: Iter:   7  Etot: -0.749038511989444  |grad|_K:  2.250e-01  alpha:  3.825e-01  linmin: -9.503e-01  t[s]:     60.82

#--- Lowdin population analysis ---
# oxidation-state H +0.031 +0.031

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.809233906886505   dEtot: -3.163e-03   |Residual|: 1.768e-02   |deigs|: 1.561e-03  t[s]:     61.63
SCF: Cycle:  1   Etot: -0.809250706610626   dEtot: -1.680e-05   |Residual|: 9.834e-03   |deigs|: 1.649e-03  t[s]:     62.27
SCF: Cycle:  2   Etot: -0.809253180365753   dEtot: -2.474e-06   |Residual|: 1.076e-03   |deigs|: 1.768e-03  t[s]:     62.95
SCF: Cycle:  3   Etot: -0.809253379612826   dEtot: -1.992e-07   |Residual|: 5.333e-04   |deigs|: 1.318e-05  t[s]:     63.42
SCF: Cycle:  4   Etot: -0.809253808759724   dEtot: -4.291e-07   |Residual|: 2.031e-04   |deigs|: 3.038e-05  t[s]:     64.13
SCF: Cycle:  5   Etot: -0.809254288826702   dEtot: -4.801e-07   |Residual|: 8.739e-05   |deigs|: 1.681e-04  t[s]:     64.82
SCF: Cycle:  6   Etot: -0.809254412207528   dEtot: -1.234e-07   |Residual|: 3.418e-05   |deigs|: 1.642e-05  t[s]:     65.34
SCF: Cycle:  7   Etot: -0.809254453089293   dEtot: -4.088e-08   |Residual|: 2.215e-05   |deigs|: 2.244e-05  t[s]:     65.87
SCF: Cycle:  8   Etot: -0.809254465424490   dEtot: -1.234e-08   |Residual|: 8.048e-06   |deigs|: 9.687e-06  t[s]:     66.40
SCF: Cycle:  9   Etot: -0.809254466487041   dEtot: -1.063e-09   |Residual|: 4.146e-06   |deigs|: 1.976e-06  t[s]:     66.92
SCF: Cycle: 10   Etot: -0.809254466579549   dEtot: -9.251e-11   |Residual|: 1.241e-06   |deigs|: 2.667e-06  t[s]:     67.44
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.728331142603535   0.359455856176099 1
ion H   0.000000000000000   5.171668857396464  -0.259455856176099 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.222184286241280  -0.153124433667235 1
force H   0.000000000000000  -0.222184286241280   0.153124433667235 1

# Energy components:
   Eewald =        0.9453400431435268
       EH =        1.2187245811734291
     Eloc =       -3.1179940822066841
      Enl =       -0.0520171626872606
      Exc =       -0.6471188773365688
       KE =        0.8438110313340079
-------------------------------------
     Etot =       -0.8092544665795494

IonicMinimize: Iter:   8  Etot: -0.809254466579549  |grad|_K:  1.558e-01  alpha:  3.608e-01  linmin: -9.311e-01  t[s]:     67.61

#--- Lowdin population analysis ---
# oxidation-state H +0.030 +0.030

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.849684756259612   dEtot: -3.265e-03   |Residual|: 1.841e-02   |deigs|: 1.625e-03  t[s]:     68.36
SCF: Cycle:  1   Etot: -0.849707514142348   dEtot: -2.276e-05   |Residual|: 1.016e-02   |deigs|: 2.038e-03  t[s]:     69.00
SCF: Cycle:  2   Etot: -0.849711484941996   dEtot: -3.971e-06   |Residual|: 1.182e-03   |deigs|: 2.183e-03  t[s]:     69.63
SCF: Cycle:  3   Etot: -0.849711722819906   dEtot: -2.379e-07   |Residual|: 5.512e-04   |deigs|: 1.497e-05  t[s]:     70.14
SCF: Cycle:  4   Etot: -0.849712157218256   dEtot: -4.344e-07   |Residual|: 2.084e-04   |deigs|: 5.383e-05  t[s]:     70.81
SCF: Cycle:  5   Etot: -0.849712609818195   dEtot: -4.526e-07   |Residual|: 8.539e-05   |deigs|: 1.885e-04  t[s]:     71.51
SCF: Cycle:  6   Etot: -0.849712749786935   dEtot: -1.400e-07   |Residual|: 4.339e-05   |deigs|: 4.587e-05  t[s]:     72.02
SCF: Cycle:  7   Etot: -0.849712775292667   dEtot: -2.551e-08   |Residual|: 2.243e-05   |deigs|: 1.026e-05  t[s]:     72.59
SCF: Cycle:  8   Etot: -0.849712787628158   dEtot: -1.234e-08   |Residual|: 7.351e-06   |deigs|: 1.187e-05  t[s]:     73.29
SCF: Cycle:  9   Etot: -0.849712788630920   dEtot: -1.003e-09   |Residual|: 4.643e-06   |deigs|: 1.779e-06  t[s]:     73.86
SCF: Cycle: 10   Etot: -0.849712788812276   dEtot: -1.814e-10   |Residual|: 1.178e-06   |deigs|: 1.515e-06  t[s]:     74.41
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.777826062816679   0.272563665082223 1
ion H   0.000000000000000   5.122173937183319  -0.172563665082223 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.146676800618126  -0.104907402790955 1
force H   0.000000000000000  -0.146676800618126   0.104907402790955 1

# Energy components:
   Eewald =        0.8888189495596611
       EH =        1.2069387036339960
     Eloc =       -3.0858595311807173
      Enl =       -0.0510423670844790
      Exc =       -0.6417982121255825
       KE =        0.8332296683848461
-------------------------------------
     Etot =       -0.8497127888122755

IonicMinimize: Iter:   9  Etot: -0.849712788812276  |grad|_K:  1.041e-01  alpha:  3.876e-01  linmin: -9.081e-01  t[s]:     74.57

#--- Lowdin population analysis ---
# oxidation-state H +0.029 +0.029

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle:  0   Etot: -0.875394136860285   dEtot: -3.498e-03   |Residual|: 2.029e-02   |deigs|: 1.781e-03  t[s]:     75.28
SCF: Cycle:  1   Etot: -0.875435242495731   dEtot: -4.111e-05   |Residual|: 1.098e-02   |deigs|: 2.894e-03  t[s]:     75.89
SCF: Cycle:  2   Etot: -0.875443543004357   dEtot: -8.301e-06   |Residual|: 1.412e-03   |deigs|: 3.026e-03  t[s]:     76.46
SCF: Cycle:  3   Etot: -0.875443883011175   dEtot: -3.400e-07   |Residual|: 5.992e-04   |deigs|: 1.105e-04  t[s]:     76.91
SCF: Cycle:  4   Etot: -0.875444393978924   dEtot: -5.110e-07   |Residual|: 2.276e-04   |deigs|: 1.077e-04  t[s]:     77.50
SCF: Cycle:  5   Etot: -0.875444893418604   dEtot: -4.994e-07   |Residual|: 9.479e-05   |deigs|: 2.319e-04  t[s]:     78.08
SCF: Cycle:  6   Etot: -0.875445052349456   dEtot: -1.589e-07   |Residual|: 7.767e-05   |deigs|: 1.075e-04  t[s]:     78.55
SCF: Cycle:  7   Etot: -0.875445090197089   dEtot: -3.785e-08   |Residual|: 2.272e-05   |deigs|: 1.113e-05  t[s]:     79.05
SCF: Cycle:  8   Etot: -0.875445103509272   dEtot: -1.331e-08   |Residual|: 1.457e-05   |deigs|: 5.976e-06  t[s]:     79.65
SCF: Cycle:  9   Etot: -0.875445104249372   dEtot: -7.401e-10   |Residual|: 7.175e-06   |deigs|: 5.979e-06  t[s]:     80.14
SCF: Cycle: 10   Etot: -0.875445104524071   dEtot: -2.747e-10   |Residual|: 3.093e-06   |deigs|: 3.925e-06  t[s]:     80.79
SCF: Converged (|Delta E|<1.000000e-08 for 2 iters).

Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion H   0.000000000000000   6.825478695498274   0.184647644935080 1
ion H   0.000000000000000   5.074521304501725  -0.084647644935080 1

# Forces in Cartesian coordinates:
force H   0.000000000000000   0.095156536760024  -0.059194411997371 1
force H   0.000000000000000  -0.095156536760024   0.059194411997371 1

# Energy components:
   Eewald =        0.8404400953295992
       EH =        1.1902766590656884
     Eloc =       -3.0407638392298879
      Enl =       -0.0496341595679475
      Exc =       -0.6343177559391191
       KE =        0.8185538958175960
-------------------------------------
     Etot =       -0.8754451045240710

IonicMinimize: Iter:  10  Etot: -0.875445104524071  |grad|_K:  6.470e-02  alpha:  4.786e-01  linmin: -8.690e-01  t[s]:     80.96
IonicMinimize: None of the convergence criteria satisfied after 10 iterations.

#--- Lowdin population analysis ---
# oxidation-state H +0.027 +0.027

Dumping 'H2.ionpos' ... done
End date and time: Wed May  8 21:07:35 2024  (Duration: 0-0:01:20.98)
Done!
cbu@polaris-login-01:~/jdftx/build/test/vibrations>

ColinBundschu commented 4 months ago

cpu_test.zip test results (CPU only) [Note: this is only the first 5 tests]

shankar1729 commented 4 months ago

Try the attached CMakeLists.txt: an initial draft that uses CMake's newer support for CUDA as a first-class language. CMakeLists.txt

You can set the environment variable CUDAARCHS to "80" for the A100s.
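As a minimal sketch of that setting, assuming a POSIX-like shell and CMake 3.20 or newer (where the `CUDAARCHS` environment variable initializes `CMAKE_CUDA_ARCHITECTURES` on the first configure): the helper name `configure_gpu` is hypothetical, `EnableCUDA` is the usual JDFTx GPU switch and may be handled differently by the draft CMakeLists.txt, and any other `-D` options from earlier attempts would simply be passed through.

```shell
# Target compute capability 8.0 (A100). CMake reads CUDAARCHS only on the
# first configure of a build directory; later runs reuse the cached value.
export CUDAARCHS=80

# Hypothetical helper: forwards any extra -D options to cmake unchanged.
# The last argument should be the JDFTx source directory.
configure_gpu() {
	cmake -D EnableCUDA=yes "$@"
}

# Usage, from an empty build directory (source path as used in this thread):
#   configure_gpu ../jdftx-git/jdftx
```

Because the value is cached after the first configure, changing `CUDAARCHS` later probably needs a fresh build directory to take effect.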

ColinBundschu commented 4 months ago

Should I leave all the other compiler flags the same?

shankar1729 commented 4 months ago

Yes, try the same settings as before to use the cray wrappers with nvhpc. You may have to tweak some things iteratively if you get errors of course; I can't predict how this is going to behave on that system as it seems to be configured very differently to all the others I've had access to.

ColinBundschu commented 4 months ago

Can you change the CMakeLists.txt to make it so that wannier_gpu and phonon_gpu explicitly link against libcudart? Something like:

target_link_libraries(wannier_gpu ${CUDA_LIBRARIES} ${CUDA_cudart_LIBRARY})
target_link_libraries(phonon_gpu ${CUDA_LIBRARIES} ${CUDA_cudart_LIBRARY})

shankar1729 commented 4 months ago

Sure, you can do that if needed, but that has not been necessary before. Are you encountering a specific error that's leading you to need this?

Also, did the rest of the build work with the new CMakeLists.txt, leading to a successful test? The CPU failures from yesterday were pointing to a more serious issue with nvhpc compiling this code, unrelated to CUDA.

ColinBundschu commented 4 months ago

cmake_results.txt Here are the compilation results. It looks like it worked until the linker step, where it failed due to these libraries not linking properly. I haven't tested it with the changed CMakeLists.txt yet.

shankar1729 commented 4 months ago

Please try the new one (using enable_language(CUDA)) then, because it looks like CMake automatically handles the libraries that need to be linked in for that case. The problem may be that the old FindCUDA logic is plain incompatible with NVHPC and does not report the appropriate list of libraries to link in, hence missing things like cudart. We've not had to manually link cudart in other cases.

ColinBundschu commented 4 months ago

Sorry, let me clarify: This is with the new CMakeLists.txt you sent. It is not including the linker changes I proposed.

shankar1729 commented 4 months ago

It looks like jdftx_gpu still compiled fine, which makes it even weirder that wannier_gpu and phonon_gpu did not. While we look at those separately, can you test if this jdftx_gpu works correctly? (All the tests are only with jdftx, not with wannier or phonon.)

ColinBundschu commented 4 months ago

This is the CPU test result of O2 from the first test. Infinite energy:

cbu@x3006c0s19b0n0:~/jdftx/build/test/openShell> cat O2.out

*************** JDFTx 1.7.0 (git hash 19235885) ***************

Start date and time: Thu May  9 18:46:58 2024
Executable /home/cbu/jdftx/build/jdftx with command-line: -i /home/cbu/jdftx/jdftx-git/jdftx/test/openShell/O2.in -d -o O2.out
Running on hosts (process indices):  x3006c0s19b0n0 (0)
Divided in process groups (process indices):  0 (0)
Resource initialization completed at t[s]:      0.00
Run totals: 1 processes, 32 threads, 0 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Cartesian
core-overlap-check vector
coulomb-interaction Isolated
coulomb-truncation-embed 0 0 0
davidson-band-ratio 1.1
dump End None
dump-name $INPUT.$VAR
elec-cutoff 20 100
elec-eigen-algo Davidson
elec-ex-corr gga-PBE
elec-initial-magnetization 2.000000 yes
electronic-minimize  \
        dirUpdateScheme      FletcherReeves \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-08 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
exchange-regularization None
fluid None
fluid-ex-corr lda-TF lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize  \
        dirUpdateScheme      PolakRibiere \
        linminMethod         DirUpdateRecommended \
        nIterations          100 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  0 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
fluid-solvent H2O 55.338 ScalarEOS \
        epsBulk 78.4 \
        pMol 0.92466 \
        epsInf 1.77 \
        Pvap 1.06736e-10 \
        sigmaBulk 4.62e-05 \
        Rvdw 2.61727 \
        Res 1.42 \
        tauNuc 343133 \
        poleEl 15 7 1
forces-output-coords Positions
ion O   0.000000000000000   0.000000000000000   1.140000000000000 1
ion O   0.000000000000000   0.000000000000000  -1.140000000000000 1
ion-species GBRV/$ID_pbe.uspp
ion-width 0
ionic-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0.0001 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
kpoint   0.000000000000   0.000000000000   0.000000000000  1.00000000000000
kpoint-folding 1 1 1
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Cubic 13
lattice-minimize  \
        dirUpdateScheme      L-BFGS \
        linminMethod         DirUpdateRecommended \
        nIterations          0 \
        history              15 \
        knormThreshold       0 \
        maxThreshold         no \
        energyDiffThreshold  1e-06 \
        nEnergyDiff          2 \
        alphaTstart          1 \
        alphaTmin            1e-10 \
        updateTestStepSize   yes \
        alphaTreduceFactor   0.1 \
        alphaTincreaseFactor 3 \
        nAlphaAdjustMax      3 \
        wolfeEnergy          0.0001 \
        wolfeGradient        0.9 \
        fdTest               no
lcao-params -1 1e-06 0.001
pcm-variant GLSSA13
perturb-minimize  \
        nIterations            0 \
        algorithm              MINRES \
        residualTol            0.0001 \
        residualDiffThreshold  0.0001 \
        CGBypass               no \
        recomputeResidual      no
spintype z-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001

---------- Setting up symmetries ----------

Found 48 point-group symmetries of the bravais lattice
Found 16 space-group symmetries with basis
Applied RMS atom displacement 0 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[           13            0            0  ]
[            0           13            0  ]
[            0            0           13  ]
unit cell volume = 2197
G =
[   0.483322          0          0  ]
[          0   0.483322          0  ]
[          0          0   0.483322  ]
Minimum fftbox size, Smin = [  60  60  60  ]
Chosen fftbox size, S = [  60  60  60  ]

---------- Initializing tighter grid for wavefunction operations ----------
R =
[           13            0            0  ]
[            0           13            0  ]
[            0            0           13  ]
unit cell volume = 2197
G =
[   0.483322          0          0  ]
[          0   0.483322          0  ]
[          0          0   0.483322  ]
Minimum fftbox size, Smin = [  56  56  56  ]
Chosen fftbox size, S = [  56  56  56  ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/home/cbu/jdftx/build/pseudopotentials/GBRV/o_pbe.uspp':
  Title: O.  Created by USPP 7.3.6 on 3-2-2014
  Reference state energy: -15.894388.  6 valence electrons in orbitals:
    |200>  occupation: 2  eigenvalue: -0.878823
    |210>  occupation: 4  eigenvalue: -0.332131
  lMax: 2  lLocal: 2  QijEcut: 6
  5 projectors sampled on a log grid with 511 points:
    l: 0  eig: -0.878823  rCut: 1.25
    l: 0  eig: 0.000000  rCut: 1.25
    l: 1  eig: -0.332132  rCut: 1.25
    l: 1  eig: 0.000000  rCut: 1.25
    l: 2  eig: 1.000000  rCut: 1.25
  Partial core density with radius 0.7
  Transforming core density to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming local potential to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 432 points.
  Transforming density augmentations to a uniform radial grid of dG=0.02 with 1261 points.
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 432 points.
  Core radius for overlap checks: 1.25 bohrs.

Initialized 1 species with 2 total atoms.

Folded 1 k-points by 1x1x1 to 1 k-points.

---------- Setting up k-points, bands, fillings ----------
No reducable k-points.
Computing the number of bands and number of electrons
Calculating initial fillings.
Turning on subspace rotations due to non-scalar fillings.
nElectrons:  12.000000   nBands: 7   nStates: 2

----- Setting up reduced wavefunction bases (one per k-point) -----
average nbasis = 9435.000 , ideal nbasis = 9385.751

---------- Setting up coulomb interaction ----------
Setting up double-sized grid for truncated Coulomb potentials:
R =
[           26            0            0  ]
[            0           26            0  ]
[            0            0           26  ]
unit cell volume = 17576
G =
[   0.241661          0          0  ]
[          0   0.241661          0  ]
[          0          0   0.241661  ]
Chosen fftbox size, S = [  120  120  120  ]
Integer grid location selected as the embedding center:
   Grid: [  0  0  0  ]
   Lattice: [  0  0  0  ]
   Cartesian: [  0  0  0  ]
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Range-separation parameter for embedded mesh potentials due to point charges: 0.587227 bohrs.
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Gaussian width for range separation: 1.3698 bohrs.
FFT grid for long-range part: [ 120 120 120 ].
Planning fourier transform ... Done.
Computing truncated long-range part in real space ... Done.
Adding short-range part in reciprocal space ... Done.

---------- Allocating electronic variables ----------
Initializing wave functions:  linear combination of atomic orbitals
O pseudo-atom occupations:   s ( 2 )  p ( 4 )
        FillingsUpdate:  mu: -0.115710922  nElectrons: 12.000000  magneticMoment: [ Abs: 2.00193  Tot: +2.00000 ]
LCAOMinimize: Iter:   0  Etot: +inf  |grad|_K:  2.808e-03  alpha:  1.000e+00
LCAOMinimize: E=inf. Stopping ...

---- Citations for features of the code used in this run ----

   Software package:
      R. Sundararaman, K. Letchworth-Weaver, K.A. Schwarz, D. Gunceler, Y. Ozhabes and T.A. Arias, 'JDFTx: software for joint density-functional theory', SoftwareX 6, 278 (2017)

   gga-PBE exchange-correlation functional:
      J.P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)

   Pseudopotentials:
      KF Garrity, JW Bennett, KM Rabe and D Vanderbilt, Comput. Mater. Sci. 81, 446 (2014)

   Truncated Coulomb potentials:
      R. Sundararaman and T.A. Arias, Phys. Rev. B 87, 165122 (2013)

   Total energy minimization:
      T.A. Arias, M.C. Payne and J.D. Joannopoulos, Phys. Rev. Lett. 69, 1077 (1992)

This list may not be complete. Please suggest additional citations or
report any other bugs at https://github.com/shankar1729/jdftx/issues

Initialization completed successfully at t[s]:      7.33

-------- Electronic minimization -----------
ElecMinimize: Iter:   0  Etot: +inf  |grad|_K:  1.377e-03  alpha:  1.000e+00
ElecMinimize: E=inf. Stopping ...
Setting wave functions to eigenvectors of Hamiltonian

# Ionic positions in cartesian coordinates:
ion O   0.000000000000000   0.000000000000000   1.140000000000000 1
ion O   0.000000000000000   0.000000000000000  -1.140000000000000 1

# Forces in Cartesian coordinates:
force O                -nan                -nan                -nan 1
force O                -nan                -nan                -nan 1

# Energy components:
   Eewald =                       inf
       EH =       41.7080609073179573
     Eloc =     -101.1445987842446641
      Enl =        4.4746413624828447
      Exc =       -6.9862374141204366
 Exc_core =        0.1300954080658279
       KE =       14.1646798227908874
-------------------------------------
     Etot =                       inf

IonicMinimize: Iter:   0  Etot: +inf  |grad|_K:       -nan  t[s]:      9.46
IonicMinimize: |grad|_K=-nan. Stopping ...

#--- Lowdin population analysis ---
# oxidation-state O +0.000 +0.000
# magnetic-moments O +1.000 +1.000

End date and time: Thu May  9 18:47:07 2024  (Duration: 0-0:00:09.80)
Done!
cbu@x3006c0s19b0n0:~/jdftx/build/test/openShell>

ColinBundschu commented 4 months ago

The output is the same with the tests run using GPU

gpu_test.txt

ColinBundschu commented 4 months ago

full_gpu_test.zip Full gpu test results

shankar1729 commented 4 months ago

Interesting, the Ewald sum is wrong! Alright, this gives me something to go on. I'll compare these outputs to first ionic iterations of successful tests.
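One way to spot which energy components went non-finite across many test outputs is sketched below. It assumes the "Energy components" layout shown in the O2.out log above (`name = value`), and the function name `nonfinite_components` is made up for illustration.

```shell
# Print the name of every energy component whose value is inf or nan.
# Matches lines of the form "   Eewald =   inf" in a JDFTx output file.
nonfinite_components() {
	awk '$2 == "=" && $3 ~ /^[-+]?(inf|nan)$/ { print $1 }' "$1"
}

# Usage, from the build directory: list offending components per test output.
#   for f in test/*/*.out; do
#       echo "$f: $(nonfinite_components "$f" | tr '\n' ' ')"
#   done
```

For the O2.out log above this would list Eewald and Etot, since every other component is finite.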

ColinBundschu commented 4 months ago

Sounds good, thanks for the help. Let me know if you need anything from me.

shankar1729 commented 4 months ago

I've narrowed this down to weirdness by the nvidia compiler on float to bool conversions (zero testing) on the cpu code, which affects the gpu build as well because certain small pieces like the Ewald sum are calculated on the cpu.
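A small probe along these lines might help compare compilers directly. This is only a sketch under assumptions: the exact expression that misbehaves in JDFTx is not identified in this thread, the helper names `write_probe` and `run_probe` and the file name `zero_test.cpp` are hypothetical, and `-O2` merely stands in for whatever optimization flags the real build uses.

```shell
# Hypothetical probe: write a tiny C++ program that compares an implicit
# float-to-bool conversion with an explicit zero test.
# $1: directory to write zero_test.cpp into
write_probe() {
	cat > "$1/zero_test.cpp" <<'EOF'
#include <cstdio>
int main()
{	// volatile keeps the compiler from folding the tests at compile time
	volatile double vals[] = { 0.0, -0.0, 1e-300, 1.0 };
	int mismatches = 0;
	for(double x: vals)
	{	bool implicitTest = x;           // float-to-bool conversion
		bool explicitTest = (x != 0.0);  // explicit zero test
		if(implicitTest != explicitTest) mismatches++;
		std::printf("%g: implicit=%d explicit=%d\n", x, implicitTest, explicitTest);
	}
	return mismatches; // non-zero exit status flags a disagreement
}
EOF
}

# Build and run the probe.
# $1: compiler command (e.g. g++ or nvc++), $2: directory holding the probe
run_probe() {
	"$1" -O2 -o "$2/zero_test" "$2/zero_test.cpp" && "$2/zero_test"
}

# Usage:
#   d=$(mktemp -d); write_probe "$d"
#   run_probe g++ "$d"; run_probe nvc++ "$d"
```

A standards-conforming build should report no mismatches and exit with status 0. A non-zero status from one compiler only would point at the compiler rather than the libraries.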

I don't have access to nvhpc at the moment: could you prototype a separate cpu-only build using the gnu toolchain on your machine, but linking to the same libraries? I want to make sure that this is a cpp-compiler-specific issue (and not a library one) before proceeding.

ColinBundschu commented 4 months ago

You might have to walk me through this one a bit. So basically you want me to load the gnu environment? This would switch out CC and CXX to be the gnu compilers. https://docs.alcf.anl.gov/polaris/compiling-and-linking/gnu-compilers-polaris/

shankar1729 commented 4 months ago

Yes, with those modules swapped out, remove all the CUDA-related flags on the cmake command line and build. It shouldn't make a difference whether you use the old or new CMakeLists.txt file.

ColinBundschu commented 4 months ago

I do not think this is possible. It does not seem able to load the fortran libraries and the supposed gcc-mixed module does not exist:


cbu@x3205c0s19b0n0:~/jdftx/build> cmake -D GSL_PATH=/home/cbu/gsl \
>       -D FFTW3_PATH=/opt/cray/pe/fftw/3.3.10.6/x86_milan \
>       -D CMAKE_PREFIX_PATH="/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64" \
>       -D CBLAS_PATH=/home/cbu/local/lib \
>       -D LAPACK_LIBRARIES="/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib/liblapack.so;/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib/libblas.so" \
>       -D CMAKE_EXE_LINKER_FLAGS="-L/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib -lnvfortran -lpgf90 -lpgf90rtl -lpgftnrtl" \
>       ../jdftx-git/jdftx
-- The C compiler identification is GNU 12.3.0
-- The CXX compiler identification is GNU 12.3.0
-- Cray Programming Environment 2.7.30 C
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: /opt/cray/pe/craype/2.7.30/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.7.30/bin/cc - broken
CMake Error at /home/cbu/software/cmake/share/cmake-3.29/Modules/CMakeTestCCompiler.cmake:67 (message):
  The C compiler

    "/opt/cray/pe/craype/2.7.30/bin/cc"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: '/home/cbu/jdftx/build/CMakeFiles/CMakeScratch/TryCompile-AyZLS5'

    Run Build Command(s): /home/cbu/software/cmake/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile cmTC_f716e/fast
    /usr/bin/gmake  -f CMakeFiles/cmTC_f716e.dir/build.make CMakeFiles/cmTC_f716e.dir/build
    gmake[1]: Entering directory '/home/cbu/jdftx/build/CMakeFiles/CMakeScratch/TryCompile-AyZLS5'
    Building C object CMakeFiles/cmTC_f716e.dir/testCCompiler.c.o
    /opt/cray/pe/craype/2.7.30/bin/cc    -o CMakeFiles/cmTC_f716e.dir/testCCompiler.c.o -c /home/cbu/jdftx/build/CMakeFiles/CMakeScratch/TryCompile-AyZLS5/testCCompiler.c
    Linking C executable cmTC_f716e
    /home/cbu/software/cmake/bin/cmake -E cmake_link_script CMakeFiles/cmTC_f716e.dir/link.txt --verbose=1
    /opt/cray/pe/craype/2.7.30/bin/cc -L/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/compilers/lib -lnvfortran -lpgf90 -lpgf90rtl -lpgftnrtl  CMakeFiles/cmTC_f716e.dir/testCCompiler.c.o -o cmTC_f716e
    /usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: cannot find -lnvfortran: No such file or directory
    /usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: cannot find -lpgf90: No such file or directory
    /usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: cannot find -lpgf90rtl: No such file or directory
    /usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: cannot find -lpgftnrtl: No such file or directory
    collect2: error: ld returned 1 exit status
    gmake[1]: *** [CMakeFiles/cmTC_f716e.dir/build.make:99: cmTC_f716e] Error 1
    gmake[1]: Leaving directory '/home/cbu/jdftx/build/CMakeFiles/CMakeScratch/TryCompile-AyZLS5'
    gmake: *** [Makefile:127: cmTC_f716e/fast] Error 2

  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:3 (project)

-- Configuring incomplete, errors occurred!
cbu@x3205c0s19b0n0:~/jdftx/build>

shankar1729 commented 4 months ago

It looks like the link flags you are specifying are creating the problem. If you're still using the cray compiler wrappers with the gnu compiler, you can try setting -D EnableLibSci=yes to take care of blas, lapack etc. and remove most of your custom flags.
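A trimmed configure step along those lines might look like the sketch below. It assumes the GNU programming environment is already loaded so that the Cray wrappers invoke gcc. The helper name `configure_gnu_cpu` is hypothetical, the paths in the usage comment are the ones used earlier in this thread, and whether FFTW or other paths still need to be given explicitly on Polaris is not known here.

```shell
# Hypothetical helper: CPU-only configure that relies on Cray LibSci for
# BLAS/LAPACK instead of hand-written NVHPC linker flags.
# $1: GSL install prefix, $2: JDFTx source directory
configure_gnu_cpu() {
	cmake -D EnableLibSci=yes -D GSL_PATH="$1" "$2"
}

# CMake caches the compiler and linker flags from earlier attempts, so run
# this from an empty build directory:
#   configure_gnu_cpu /home/cbu/gsl ../jdftx-git/jdftx
```

Dropping `CMAKE_EXE_LINKER_FLAGS` matters here because CMake also applies those flags to its own compiler check, which is the step that failed with `cannot find -lnvfortran` in the log above.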

ColinBundschu commented 4 months ago

This flag does not solve the issue, and I am uncertain how to proceed. I am getting a lot of issues with finding the cblas and lapack libraries, and I have tried multiple ways of specifying their locations in the CMakeLists.txt

[ 94%] Building CXX object CMakeFiles/wannier.dir/wannier/WannierMinimizer_defect.cpp.o
[ 95%] Building CXX object CMakeFiles/wannier.dir/wannier/WannierMinimizer_init.cpp.o
[ 95%] Building CXX object CMakeFiles/phonon.dir/phonon/main.cpp.o
[ 95%] Building CXX object CMakeFiles/wannier.dir/wannier/WannierMinimizer_phonon.cpp.o
[ 96%] Building CXX object CMakeFiles/wannier.dir/wannier/WannierMinimizer_save.cpp.o
[ 96%] Building CXX object CMakeFiles/wannier.dir/wannier/commands.cpp.o
[ 97%] Building CXX object CMakeFiles/wannier.dir/wannier/main.cpp.o
[ 98%] Linking CXX executable phonon
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgetri_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgesdd_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zdotc_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zhemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgetrf_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgeev_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cdotc_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sger'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zherk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_caxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgerc'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zpotrf_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_saxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cherk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_isamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssymv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_ddot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dnrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgesvd_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgeru'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_chemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `ztrtri_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_icamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scnrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotmg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zpotrs_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dzasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_idamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dznrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zhemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_chemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sdsdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cdotu_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgerc'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zcopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zaxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtbsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ccopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_daxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dger'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotmg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zheevr_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_izamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsymv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgeru'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zdscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zdotu_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dcopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_snrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrmm'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/phonon.dir/build.make:165: phonon] Error 1
make[1]: *** [CMakeFiles/Makefile2:234: CMakeFiles/phonon.dir/all] Error 2
[100%] Linking CXX executable wannier
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgetri_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgesdd_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zdotc_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zhemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgetrf_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgeev_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cdotc_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sger'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zherk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_caxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgerc'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zpotrf_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_saxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cherk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_isamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssymv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_ddot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dnrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zgesvd_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cgeru'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_chemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `ztrtri_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_icamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scnrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ctrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotmg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sgemm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zpotrs_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dzasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_idamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dznrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_csyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zhemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_chemv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drotm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsyr2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_dscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sdsdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cdotu_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgerc'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dswap'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zcopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsymm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_scopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_drot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyrk'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zsyr2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cher2k'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sasum'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zaxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zher'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtbsv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ccopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_daxpy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ztrmm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dger'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_cscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_srotmg'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_ssyr'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrmv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `zheevr_'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_strsm'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_izamax'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dsymv'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zgeru'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: libjdftx.so: undefined reference to `cblas_zdscal'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_zdotu_sub'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dcopy'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_sdot'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_snrm2'
/usr/lib64/gcc/x86_64-suse-linux/12/../../../../x86_64-suse-linux/bin/ld: /home/cbu/gsl/lib/libgsl.so: undefined reference to `cblas_dtrmm'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/wannier.dir/build.make:261: wannier] Error 1
make[1]: *** [CMakeFiles/Makefile2:208: CMakeFiles/wannier.dir/all] Error 2
make: *** [Makefile:146: all] Error
ColinBundschu commented 4 months ago

Is it strictly necessary to try this with the gnu compilers on Polaris?

shankar1729 commented 4 months ago

I pushed a few commits to jdftx github to work around the nvhpc behavior (compiler bugs?). I was able to reproduce the Ewald issue using an nvhpc build on NERSC Perlmutter, and then fix it with these changes.

Please pull the latest git version and build again in a clean directory. The updated CMakeLists.txt is now on git. I've posted my cmake settings on Perlmutter below for reference, but you'll of course need to modify the lapack/blas/gsl settings as you did previously. Let me know if that works.

CC=cc CXX=CC CUDAARCHS=80 cmake \
        -D EnableProfiling=yes \
        -D EnableHDF5=yes \
        -D EnableLibXC=yes \
        -D EnableLibSci=yes \
        -D EnableScaLAPACK=yes \
        -D FFTW3_PATH=${FFTW_ROOT} \
        -D GSL_PATH=${GSL_ROOT} \
        -D EnableCUDA=yes \
        -D EnableCuSolver=yes \
        -D CudaAwareMPI=yes \
        -D PinnedHostMemory=yes \
        -D CMAKE_LIBRARY_PATH="${LD_LIBRARY_PATH//:/;}" \
        -D CMAKE_CXX_FLAGS="-Wl,--no-warn-execstack --diag_suppress=unsigned_compare_with_zero" \
        ../jdftx-git/jdftx
ColinBundschu commented 4 months ago

test.zip CPU test with the new updates:

cbu@x3001c0s7b0n0:~/jdftx/build> make test
Running tests...
Test project /home/cbu/jdftx/build
      Start  1: openShell
 1/10 Test  #1: openShell ........................   Passed   68.36 sec
      Start  2: vibrations
 2/10 Test  #2: vibrations .......................   Passed   55.05 sec
      Start  3: moleculeSolvation
 3/10 Test  #3: moleculeSolvation ................***Failed   39.58 sec
      Start  4: ionSolvation
 4/10 Test  #4: ionSolvation .....................***Failed    5.63 sec
      Start  5: latticeOpt
 5/10 Test  #5: latticeOpt .......................   Passed  519.95 sec
      Start  6: metalBulk
 6/10 Test  #6: metalBulk ........................   Passed  454.41 sec
      Start  7: plusU
 7/10 Test  #7: plusU ............................***Failed  128.51 sec
      Start  8: spinOrbit
 8/10 Test  #8: spinOrbit ........................   Passed  315.14 sec
      Start  9: graphene
 9/10 Test  #9: graphene .........................***Failed   51.48 sec
      Start 10: metalSurface
10/10 Test #10: metalSurface .....................***Failed  105.51 sec

50% tests passed, 5 tests failed out of 10

Total Test time (real) = 1743.68 sec

The following tests FAILED:
          3 - moleculeSolvation (Failed)
          4 - ionSolvation (Failed)
          7 - plusU (Failed)
          9 - graphene (Failed)
         10 - metalSurface (Failed)
Errors while running CTest
Output from these tests are in: /home/cbu/jdftx/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [Makefile:71: test] Error 8
cbu@x3001c0s7b0n0:~/jdftx/build>
ColinBundschu commented 4 months ago

gpu_test.zip GPU test results. Notice that test 7 passed here, whereas it did not with the CPU test.

cbu@x3001c0s7b0n0:~/jdftx/build> make testclean
Built target testclean
cbu@x3001c0s7b0n0:~/jdftx/build> export JDFTX_LAUNCH=""
cbu@x3001c0s7b0n0:~/jdftx/build> export JDFTX_SUFFIX="_gpu"
cbu@x3001c0s7b0n0:~/jdftx/build> make test
Running tests...
Test project /home/cbu/jdftx/build
      Start  1: openShell
 1/10 Test  #1: openShell ........................   Passed   10.15 sec
      Start  2: vibrations
 2/10 Test  #2: vibrations .......................   Passed    8.45 sec
      Start  3: moleculeSolvation
 3/10 Test  #3: moleculeSolvation ................***Failed    7.98 sec
      Start  4: ionSolvation
 4/10 Test  #4: ionSolvation .....................***Failed    2.81 sec
      Start  5: latticeOpt
 5/10 Test  #5: latticeOpt .......................   Passed   23.54 sec
      Start  6: metalBulk
 6/10 Test  #6: metalBulk ........................   Passed   25.27 sec
      Start  7: plusU
 7/10 Test  #7: plusU ............................   Passed   11.70 sec
      Start  8: spinOrbit
 8/10 Test  #8: spinOrbit ........................   Passed   20.48 sec
      Start  9: graphene
 9/10 Test  #9: graphene .........................***Failed    7.97 sec
      Start 10: metalSurface
10/10 Test #10: metalSurface .....................***Failed    5.14 sec

60% tests passed, 4 tests failed out of 10

Total Test time (real) = 123.51 sec

The following tests FAILED:
          3 - moleculeSolvation (Failed)
          4 - ionSolvation (Failed)
          9 - graphene (Failed)
         10 - metalSurface (Failed)
Errors while running CTest
Output from these tests are in: /home/cbu/jdftx/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
make: *** [Makefile:71: test] Error 8
cbu@x3001c0s7b0n0:~/jdftx/build> =>> PBS: job killed: walltime 3628 exceeded limit 3600
logout

qsub: job 1934455.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov completed
cbu@polaris-login-04:~>
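
To pull just the failing test names out of a saved copy of output like the above, a small helper along these lines may be handy. This is only a sketch: the function name `failed_tests` and the file name `test.log` are made up for illustration, and the pattern assumes the CTest summary format shown in this thread.

```shell
# Print the names of failed tests from saved `make test` / ctest output on stdin.
# Matches summary lines such as:
#   " 3/10 Test  #3: moleculeSolvation ................***Failed   39.58 sec"
failed_tests() {
    awk '/\*\*\*Failed/ {
        sub(/^.*#[0-9]+: */, "")   # drop everything up to and including "#N: "
        print $1                   # first remaining field is the test name
    }'
}
```

For example, `make test 2>&1 | tee test.log` followed by `failed_tests < test.log`. The failing cases can then be re-run verbosely with `ctest --rerun-failed --output-on-failure` from the build directory, as the CTest output itself suggests.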
ColinBundschu commented 4 months ago

These are the flags I used to run it, as well as what I needed to do to set up the environment for compiling and linking.

# Set up cblas
mkdir -p ~/local/lib
ln -s /usr/lib64/libcblas.so.3 ~/local/lib/libcblas.so

# Set up GSL
wget https://ftp.gnu.org/gnu/gsl/gsl-latest.tar.gz
tar -xzf gsl-latest.tar.gz
cd gsl-*/
./configure --prefix=$HOME/gsl
make
make install

#cmake
module use /soft/modulefiles
module load spack-pe-base cmake

#compilers
CC=cc
CXX=CC

# run from jdftx/build after "rm -rf *"
cmake\
 -D CBLAS_PATH=/home/cbu/local/lib\
 -D GSL_PATH=/home/cbu/gsl\
 -D FFTW3_PATH=/opt/cray/pe/fftw/3.3.10.6/x86_milan\
 -D CMAKE_PREFIX_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/23.9/math_libs/12.2/lib64\
 -D CMAKE_CUDA_ARCHITECTURES="80"\
 -D EnableProfiling=yes\
 -D EnableCUDA=yes\
 -D EnableCuSolver=yes\
 -D CudaAwareMPI=yes\
 -D PinnedHostMemory=yes\
 -D CMAKE_LIBRARY_PATH="${LD_LIBRARY_PATH//:/;}"\
 -D CMAKE_CXX_FLAGS="-Wl,--no-warn-execstack --diag_suppress=unsigned_compare_with_zero"\
 ../jdftx-git/jdftx
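
For anyone hitting the same `undefined reference to cblas_...` link errors, one way to sanity-check the symlinked CBLAS library is to list its exported symbols with `nm` and look for the ones the linker complained about. The sketch below is an assumption-laden helper, not part of JDFTx: `missing_symbols` is a made-up name, and it assumes plain (unversioned) symbol names in the last column of GNU `nm` output.

```shell
# Report which of the given symbols are absent from `nm` output read on stdin.
# Usage: nm -D --defined-only LIB | missing_symbols SYM1 SYM2 ...
missing_symbols() {
    _table=$(cat)                       # read the nm listing once
    for _sym in "$@"; do
        # A symbol counts as present if it is the last field of some line.
        printf '%s\n' "$_table" |
            awk -v s="$_sym" '$NF == s { found = 1 } END { exit !found }' ||
            printf '%s\n' "$_sym"       # print only the missing ones
    done
}
```

With the paths used above, `nm -D --defined-only ~/local/lib/libcblas.so | missing_symbols cblas_daxpy cblas_zdscal cblas_dcopy` should print nothing if the library provides all three. Note that `zheevr_` is a LAPACK routine, so it would need to come from a LAPACK library rather than from CBLAS.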
shankar1729 commented 4 months ago

There might be other similar nvhpc compiler bugs causing the remaining failures; I can look into this on Monday.

For reference, the bug I caught before is that implicit copy constructors generated by nvhpc are not copying fixed size multidimensional array members, while the ones generated by g++ are. Not sure if this is a compiler bug, or a GNU extension beyond the C++ standard that JDFTx was inadvertently relying on.

shankar1729 commented 4 months ago

In the meantime, try large vacuum calculations on the GPU, including with MPI, and make sure those are working. This will likely require slurm (or equivalent scheduler) launch flags specific to your cluster, but should otherwise work. Such calculations are working fine and with full performance on Perlmutter with nvhpc.

shankar1729 commented 4 months ago

I pushed a few more commits, which made all the tests pass in my perlmutter nvhpc build. Please update and confirm that things work as expected now.

Turns out optimization levels other than the default are leading to nvhpc producing garbage code in some seemingly innocuous cases.

ColinBundschu commented 4 months ago

test.zip Still failing some tests (e.g. 3 and 4). Here are CPU results. I did a totally clean install of everything (including a fresh git clone of jdftx) in my home dir just to be safe.

shankar1729 commented 4 months ago

What about the GPU tests? Could be that there's something wrong in the cpu blas/lapack that nvhpc is linking to. For getting good CPU performance, you'd anyway not want to use nvhpc and switch to the GNU toolchain. Also, when you have A100s, no point in even looking at the CPU.

ColinBundschu commented 4 months ago

I'll check that now. I just assumed it made more sense to test CPU first.

shankar1729 commented 4 months ago

One more quick note: set the JDFTX_MEMPOOL_SIZE variable during GPU runs in order to get the maximum performance. I measured all tests completing in ~ 100s on a single A100.

Once these work, run a real-sized system on the GPUs with at least 50 - 100 atoms to really nail down the performance. If all is good, 30 s - 2 min for the first ionic step and about half that for subsequent steps should be reasonable for that system size.

ColinBundschu commented 4 months ago
cbu@x3006c0s25b0n0:~/jdftx/build> export JDFTX_LAUNCH=""
cbu@x3006c0s25b0n0:~/jdftx/build> export JDFTX_SUFFIX="_gpu"
cbu@x3006c0s25b0n0:~/jdftx/build> make test
Running tests...
Test project /home/cbu/jdftx/build
      Start  1: openShell
 1/10 Test  #1: openShell ........................   Passed    9.50 sec
      Start  2: vibrations
 2/10 Test  #2: vibrations .......................   Passed    9.13 sec
      Start  3: moleculeSolvation
 3/10 Test  #3: moleculeSolvation ................   Passed   18.61 sec
      Start  4: ionSolvation
 4/10 Test  #4: ionSolvation .....................   Passed   13.04 sec
      Start  5: latticeOpt
 5/10 Test  #5: latticeOpt .......................   Passed   27.89 sec
      Start  6: metalBulk
 6/10 Test  #6: metalBulk ........................   Passed   28.41 sec
      Start  7: plusU
 7/10 Test  #7: plusU ............................   Passed   13.74 sec
      Start  8: spinOrbit
 8/10 Test  #8: spinOrbit ........................   Passed   23.70 sec
      Start  9: graphene
 9/10 Test  #9: graphene .........................   Passed    8.79 sec
      Start 10: metalSurface
10/10 Test #10: metalSurface .....................   Passed   30.32 sec

100% tests passed, 0 tests failed out of 10

Total Test time (real) = 183.16 sec
cbu@x3006c0s25b0n0:~/jdftx/build>

Looks like the GPU tests worked! Now just for my own sanity, given that the CPU version seems to be running but producing wrong answers, is there a way for me to be absolutely sure from the output file that jdftx actually used the GPU? I would hope that the failure would be catastrophic (infinite Ewald energy), but I'd rather not risk a subtle issue.

ColinBundschu commented 4 months ago

What should I set JDFTX_MEMPOOL_SIZE to? I'll run an ionic minimization on some solvated spinel oxide slabs I have as well as MNC graphene.

shankar1729 commented 4 months ago

The CPU tests that failed in your attachment didn't run to completion: the program crashed after a few electronic steps.

The first few lines of the log report the resources used, and should have a non-zero number of GPUs. Also, if the executable name reported in the log is jdftx_gpu, it will not work unless it's running on GPUs.
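
A quick way to do that check from the shell is to search the top of the log for any mention of GPUs. This is just a sketch: `gpu_lines` and `out.log` are made-up names, and since the exact wording of the resource lines isn't shown in this thread, it simply does a case-insensitive search for "gpu" (which also catches the `jdftx_gpu` executable name).

```shell
# Show lines mentioning GPUs (or the jdftx_gpu executable) near the top of a log.
# Usage: gpu_lines LOGFILE [NLINES]   (NLINES defaults to 50)
# Returns non-zero if nothing in that range mentions "gpu".
gpu_lines() {
    head -n "${2:-50}" "$1" | grep -i 'gpu'
}
```

For example, `gpu_lines out.log || echo "no GPU mention found near the top of the log"`.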

Set JDFTX_MEMPOOL_SIZE to the anticipated memory usage in MB. So for the small test calculations, you can set something like 512. That would bring down the time you saw above. For real production runs that are expected to use the full GPU memory, you can set it to >~ 90% of the GPU memory, e.g. 38000 or so for the 40G A100s. However, this adds a small startup cost for allocating one giant memory block, so for smaller runs I would recommend a smaller value, closer to the anticipated memory requirement. You can figure out how much memory you need by running a prototype calculation with just a few electronic steps and checking the profiling output at the end, which reports total memory usage (per process). Make sure you have compiled with EnableProfiling=yes to get this output.
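
As a rough sketch of the arithmetic for the production case, the pool size can be computed as a percentage of the GPU memory. The helper name `mempool_mb` and its default of 90% are assumptions for illustration, not anything JDFTx provides.

```shell
# Compute a memory pool size in MB as a percentage of total GPU memory.
# Usage: mempool_mb TOTAL_MB [PERCENT]   (PERCENT defaults to 90)
mempool_mb() {
    echo $(( $1 * ${2:-90} / 100 ))   # integer arithmetic, rounds down
}

# Small test calculations: a fixed small pool, as suggested above.
#   export JDFTX_MEMPOOL_SIZE=512
# Production runs on a 40G A100 (95% of 40000 MB gives 38000):
#   export JDFTX_MEMPOOL_SIZE=$(mempool_mb 40000 95)
```

If `nvidia-smi` is available on the compute node, the total can be queried instead of hard-coded, e.g. `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1`, which reports MiB and so is only approximately the MB figure used here.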