grimme-lab / xtb

Semiempirical Extended Tight-Binding Program Package
https://xtb-docs.readthedocs.io/
GNU Lesser General Public License v3.0

XTB@GPU #723

Open chburger opened 1 year ago

chburger commented 1 year ago

Building xtb with GPU support fails right at the start, during the meson configuration step.

Is there a hint? On a related note, would it also be possible to provide a statically linked, GPU-enabled binary of xtb, as is done for the CPU version?

Main binary: /usr/bin/python3
Build Options: -Dla_backend=netlib -Dgpu=true -Dcusolver=true -Dgpu_arch=80 -Dprefix=/home/xxx/.local
Python system: Linux
The Meson build system
Version: 0.63.3
Source dir: /home/burger/x651/xtb-6.5.1
Build dir: /home/burger/x651/xtb-6.5.1/build_gpu
Build type: native build
Project name: xtb
Project version: 6.5.1
Fortran compiler for the host machine: nvfortran (nvidia_hpc 22.9-0)
Fortran linker for the host machine: nvfortran pgi 22.9-0
-----
Detecting compiler via: nvc --version
compiler returned <Popen: returncode: 0 args: ['nvc', '--version']>
compiler stdout:

nvc 22.9-0 64-bit target on x86-64 Linux -tp zen3
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

compiler stderr:

meson.build:19:0: ERROR: Value "c11" (of type "string") for combo option "C language standard to use" is not one of the choices. Possible choices are (as string): "none".
awvwgk commented 1 year ago

Support for nvfortran has always been rather fragile. It seems that recent additions to the build system are no longer compatible with this compiler.

nielskm commented 1 year ago

I have struggled with the same issues and managed to work around the meson problems. But the source code doesn't even compile with nvfortran at the moment. It would be very nice to get it fixed if anyone has time to look at it. Unfortunately, I don't have the Fortran expertise to solve it myself.

nielskm commented 1 year ago

@awvwgk Would it be a problem to set up automatic testing of the GPU build?

chburger commented 1 year ago

Could you tell me how you solved the meson problem?

nielskm commented 1 year ago

Unfortunately I don't recall. I followed some suggestions that I found in other issues related to the GPU build, but I don't have the exact settings that made it work for me.

awvwgk commented 1 year ago

> Would it be a problem to set up automatic testing of the GPU build?

Without a GPU machine, yes.

nielskm commented 1 year ago

> > Would it be a problem to set up automatic testing of the GPU build?
>
> Without a GPU machine, yes.

But you can check that it compiles?

awvwgk commented 1 year ago

Could probably be set up. It depends on whether this is a priority for the new project lead, but contributions are of course always welcome.
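A compile-only check of the kind discussed above needs no GPU, because nothing is executed after the build. The following is a minimal sketch, not a tested CI job: it assumes nvfortran and nvc are on the PATH, reuses the meson options reported in this thread (without `-Dfortran_args=-fopenacc`), and the names `gpu_build_options`, `compile_check`, `BUILD_DIR`, and `RUN_CHECK` are hypothetical.

```shell
#!/bin/sh
# Compile-only smoke test for the GPU build (sketch).
# Assumption: the option values are taken from this thread and are
# not verified to produce a working build.

BUILD_DIR="${BUILD_DIR:-build_gpu}"   # hypothetical default build directory

# Print the meson options as one space-separated line.
gpu_build_options() {
    echo "-Dopenmp=false -Dla_backend=openblas -Dgpu=true -Dcusolver=true" \
         "-Dgpu_arch=75 -Dc_std=none -Ddefault_library=static"
}

# Configure and compile; a non-zero exit status means the check failed.
compile_check() {
    # Word splitting of the option string is intentional here.
    # shellcheck disable=SC2046
    FC=nvfortran CC=nvc meson setup "$BUILD_DIR" $(gpu_build_options) &&
        meson compile -C "$BUILD_DIR"
}

# Only build when explicitly requested, so the file can also be sourced:
#   RUN_CHECK=1 sh gpu_compile_check.sh
if [ "${RUN_CHECK:-0}" = "1" ]; then
    compile_check
fi
```

Given the reports in this thread that the sources do not currently compile with nvfortran, such a check would be expected to fail until that is fixed, which is exactly what it is meant to catch.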

nielskm commented 1 year ago

> Could you tell me how you solved the meson problem?

@chburger, I believe that I fixed the meson problems (with a clean xtb v6.5.1) using the args: -Dopenmp=false -Dla_backend=openblas -Dgpu=true -Dcusolver=true -Dgpu_arch=75 -Dc_std=none -Dfortran_args=-fopenacc -Ddefault_library=static

pultar commented 1 year ago

I have the same problem as @chburger describes.

> > Could you tell me how you solved the meson problem?
>
> @chburger, I believe that I fixed the meson problems (with a clean xtb v6.5.1) using the args: -Dopenmp=false -Dla_backend=openblas -Dgpu=true -Dcusolver=true -Dgpu_arch=75 -Dc_std=none -Dfortran_args=-fopenacc -Ddefault_library=static

On my system (Ubuntu, nvfortran 22.11-0 64-bit target on x86-64 Linux -tp haswell), that command fails as well:

meson setup build_gpu -Dopenmp=false -Dla_backend=openblas -Dgpu=true -Dcusolver=true -Dgpu_arch=75 -Dc_std=none -Dfortran_args=-fopenacc -Ddefault_library=static

Gives me:

nvfortran-Error-Unknown switch: -fopenacc

meson.build:19:0: ERROR: Compiler nvfortran can not compile programs.

Taking out that flag gives me:

FAILED: subprojects/json-fortran-8.2.5/libjsonfortran.a.p/src_json_string_utilities.F90.o subprojects/json-fortran-8.2.5/libjsonfortran.a.p/json_string_utilities.mod
nvfortran -Isubprojects/json-fortran-8.2.5/libjsonfortran.a.p -Isubprojects/json-fortran-8.2.5 -I../subprojects/json-fortran-8.2.5 -I../subprojects/json-fortran-8.2.5/src -Minform=inform -O2 -g -module subprojects/json-fortran-8.2.5/libjsonfortran.a.p -o subprojects/json-fortran-8.2.5/libjsonfortran.a.p/src_json_string_utilities.F90.o -c ../subprojects/json-fortran-8.2.5/src/json_string_utilities.F90
NVFORTRAN-I-0035-Predefined intrinsic digits loses intrinsic property (../subprojects/json-fortran-8.2.5/src/json_string_utilities.F90: 124)
NVFORTRAN-S-0146-Expression must be character type (../subprojects/json-fortran-8.2.5/src/json_string_utilities.F90: 362)
NVFORTRAN-S-0457-Illegal expression in initialization (../subprojects/json-fortran-8.2.5/src/json_string_utilities.F90: 362)
nvfortran-Fatal-/opt/nvidia/hpc_sdk/Linux_x86_64/22.11/compilers/bin/tools/fort1 TERMINATED by signal 11

Any help would be greatly appreciated!
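A note on the `Unknown switch: -fopenacc` error above: `-fopenacc` is the GCC spelling of the OpenACC switch, while the NVIDIA HPC compilers use `-acc`. The sketch below only maps a compiler name to its switch; the helper name `acc_flag` is hypothetical, and whether xtb's meson build needs the switch passed through `-Dfortran_args` at all is an assumption this thread does not confirm. It does not address the json-fortran compile failure shown above.

```shell
#!/bin/sh
# Map a Fortran compiler name to its OpenACC switch (sketch).
# nvfortran and the older pgfortran use -acc; gfortran uses -fopenacc.
acc_flag() {
    case "$1" in
        # Must come first: "pgfortran" would also match *gfortran* below.
        *nvfortran*|*pgfortran*) echo "-acc" ;;
        *gfortran*)              echo "-fopenacc" ;;
        *)                       echo "" ;;   # unknown compiler: no flag
    esac
}

FC="${FC:-nvfortran}"
echo "OpenACC flag for $FC: $(acc_flag "$FC")"

# Hypothetical use, with the remaining options elided:
#   meson setup build_gpu ... -Dfortran_args="$(acc_flag "$FC")"
```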

philipturner commented 5 months ago

I assume the bottleneck is solely the cubic-scaling matrix diagonalization in GFN2-xTB? Then all that's really needed for a GPU port is a GPU-accelerated eigensolver.

Also, the GFN-xTB algorithm has been used on a GPU-based supercomputer to simulate 100 million atoms, just not with the standard xTB software package: https://www.sciencedirect.com/science/article/pii/S0167819122000242
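To put the cubic scaling mentioned above in numbers, here is a rough illustration. It assumes the diagonalization time for N basis functions behaves as t(N) ≈ c·N³ with a solver-dependent constant c, and it ignores every other step of the calculation.

```latex
\[
t(N) \approx c\,N^{3}
\quad\Longrightarrow\quad
\frac{t(2N)}{t(N)} = 2^{3} = 8,
\qquad
\frac{t(10N)}{t(N)} = 10^{3} = 1000
\]
```

Under that assumption, doubling the system size makes the diagonalization about 8 times more expensive, and a tenfold larger system about 1000 times more expensive.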