pyscf / gpu4pyscf

A plugin to use NVIDIA GPUs in the PySCF package
GNU General Public License v3.0

GPU plugin for PySCF


Installation

[!NOTE] The compiled binary packages support compute capability 6.0 and later (Pascal and later, such as Tesla P100, RTX 10 series and later).
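As a rough way to check the note above on a given machine, the sketch below compares a device's compute capability against the 6.0 minimum. It assumes CuPy is installed and relies on CuPy reporting the capability as a string such as '80' for 8.0. The helper names `meets_min_capability` and `device_capability` are made up for illustration and are not part of GPU4PySCF.

```python
def meets_min_capability(cc: str, minimum: tuple = (6, 0)) -> bool:
    """Return True if a CuPy-style capability string (e.g. '80' for 8.0) is >= minimum."""
    # The last character is the minor version, everything before it is the major version.
    major, minor = int(cc[:-1]), int(cc[-1])
    return (major, minor) >= minimum


def device_capability(device_id: int = 0) -> str:
    """Query the capability string from CuPy (assumes CuPy and a GPU are available)."""
    import cupy  # imported lazily so meets_min_capability works without a GPU

    return cupy.cuda.Device(device_id).compute_capability
```

On a machine with a GPU, `meets_min_capability(device_capability())` indicates whether the prebuilt binaries are expected to run.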

Run nvcc --version in your terminal to check the installed CUDA toolkit version. Then, choose the proper package based on your CUDA toolkit version.

Platform | Command | cutensor (highly recommended)
CUDA 11.x | pip3 install gpu4pyscf-cuda11x | pip3 install cutensor-cu11
CUDA 12.x | pip3 install gpu4pyscf-cuda12x | pip3 install cutensor-cu12
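The selection step can be scripted. The following is a minimal sketch, assuming `nvcc --version` prints a line containing "release X.Y" (as current CUDA toolkits do). The function names are hypothetical, and the package names are taken from the table above.

```python
import re
import subprocess


def cuda_major_from_nvcc(output: str) -> int:
    """Extract the CUDA major version from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("could not find a 'release X.Y' string in nvcc output")
    return int(match.group(1))


def pip_commands(cuda_major: int) -> list:
    """Map a CUDA major version to the install commands listed in the table."""
    if cuda_major not in (11, 12):
        raise ValueError(f"no prebuilt package listed for CUDA {cuda_major}.x")
    return [
        f"pip3 install gpu4pyscf-cuda{cuda_major}x",
        f"pip3 install cutensor-cu{cuda_major}",
    ]


def suggest_from_system() -> list:
    """Run nvcc (must be on PATH) and return the matching install commands."""
    result = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    )
    return pip_commands(cuda_major_from_nvcc(result.stdout))
```

Calling `suggest_from_system()` on a machine with the CUDA toolkit installed returns the two commands to run. The parsing is kept in a separate function so it can be checked without a GPU.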

Compilation

To compile the package from source:

git clone https://github.com/pyscf/gpu4pyscf.git
cd gpu4pyscf
cmake -S gpu4pyscf/lib -B build/temp.gpu4pyscf
cmake --build build/temp.gpu4pyscf -j 4
CURRENT_PATH=`pwd`
export PYTHONPATH="${PYTHONPATH}:${CURRENT_PATH}"

Then install cutensor for acceleration (for CUDA 12.x, install cutensor-cu12 instead of the package shown below)

pip3 install cutensor-cu11
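After a source build, a quick sanity check is to confirm that Python can locate the package through the exported PYTHONPATH. The sketch below uses only the standard library. The helper name `is_importable` is hypothetical, and checking `cupy` assumes it is the GPU array library the build depends on.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the top-level module on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the expected top-level packages are visible to this interpreter.
for name in ("gpu4pyscf", "cupy"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If `gpu4pyscf` is reported as not found, the PYTHONPATH export was probably made in a different shell session than the one running Python.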

The package also provides multiple Dockerfiles in the dockerfiles directory. One can use them as references to create the compilation environment.

Features

Limitations

Examples

import pyscf
from gpu4pyscf.dft import rks

atom = '''
O       0.0000000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

mol = pyscf.M(atom=atom, basis='def2-tzvpp')
mf = rks.RKS(mol, xc='LDA').density_fit()

e_dft = mf.kernel()  # compute total energy
print(f"total energy = {e_dft}")

g = mf.nuc_grad_method()
g_dft = g.kernel()   # compute analytical gradient

h = mf.Hessian()
h_dft = h.kernel()   # compute analytical Hessian

The to_gpu method is supported since PySCF 2.5.0. It converts a PySCF object into the corresponding GPU4PySCF object:

import pyscf
from pyscf.dft import rks

atom = '''
O       0.0000000000    -0.0000000000     0.1174000000
H      -0.7570000000    -0.0000000000    -0.4696000000
H       0.7570000000     0.0000000000    -0.4696000000
'''

mol = pyscf.M(atom=atom, basis='def2-tzvpp')
mf = rks.RKS(mol, xc='LDA').density_fit().to_gpu()  # move PySCF object to GPU4PySCF object
e_dft = mf.kernel()  # compute total energy
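Because to_gpu depends on the installed PySCF version, a script may want to guard the call. The following is a small sketch of such a guard. The helpers `version_tuple` and `supports_to_gpu` are illustrative names, not PySCF functions, and the parser only looks at the leading numeric part of each version component.

```python
import re


def version_tuple(version: str) -> tuple:
    """Parse the leading numeric components of a version string, e.g. '2.5.0' -> (2, 5, 0)."""
    parts = []
    for token in version.split(".")[:3]:
        match = re.match(r"\d+", token)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad short versions such as '2.5'
        parts.append(0)
    return tuple(parts)


def supports_to_gpu(version: str) -> bool:
    """Return True if the given PySCF version is at least 2.5.0."""
    return version_tuple(version) >= (2, 5, 0)
```

In a script this could be used as `supports_to_gpu(pyscf.__version__)` before calling `.to_gpu()`, falling back to the CPU object otherwise.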

Find more examples in gpu4pyscf/examples

Benchmarks

Speedup with GPU4PySCF v0.6.0 on A100-80G over Q-Chem 6.1 on a 32-core CPU (Density fitting, SCF, def2-tzvpp, def2-universal-jkfit, B3LYP, (99,590))

mol | natm | LDA | PBE | B3LYP | M06 | wB97m-v
020_Vitamin_C | 20 | 2.86 | 6.09 | 13.11 | 11.58 | 17.46
031_Inosine | 31 | 13.14 | 15.87 | 16.57 | 25.89 | 26.14
033_Bisphenol_A | 33 | 12.31 | 16.88 | 16.54 | 28.45 | 28.82
037_Mg_Porphin | 37 | 13.85 | 19.03 | 20.53 | 28.31 | 30.27
042_Penicillin_V | 42 | 10.34 | 13.35 | 15.34 | 22.01 | 24.2
045_Ochratoxin_A | 45 | 13.34 | 15.3 | 19.66 | 27.08 | 25.41
052_Cetirizine | 52 | 17.79 | 17.44 | 19 | 24.41 | 25.87
057_Tamoxifen | 57 | 14.7 | 16.57 | 18.4 | 24.86 | 25.47
066_Raffinose | 66 | 13.77 | 14.2 | 20.47 | 22.94 | 25.35
084_Sphingomyelin | 84 | 14.24 | 12.82 | 15.96 | 22.11 | 24.46
095_Azadirachtin | 95 | 5.58 | 7.72 | 24.18 | 26.84 | 25.21
113_Taxol | 113 | 5.44 | 6.81 | 24.58 | 29.14 | nan
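To summarize the table, one option is to average the speedups per functional while skipping the missing entry. The sketch below does this with the standard library only. The numbers are copied from the table above, and the arithmetic mean is just one illustrative summary, not a statistic reported by the project.

```python
import math
from statistics import fmean

FUNCTIONALS = ["LDA", "PBE", "B3LYP", "M06", "wB97m-v"]

# Rows copied from the benchmark table: molecule, atom count, then one speedup per functional.
TABLE = """\
020_Vitamin_C 20 2.86 6.09 13.11 11.58 17.46
031_Inosine 31 13.14 15.87 16.57 25.89 26.14
033_Bisphenol_A 33 12.31 16.88 16.54 28.45 28.82
037_Mg_Porphin 37 13.85 19.03 20.53 28.31 30.27
042_Penicillin_V 42 10.34 13.35 15.34 22.01 24.2
045_Ochratoxin_A 45 13.34 15.3 19.66 27.08 25.41
052_Cetirizine 52 17.79 17.44 19 24.41 25.87
057_Tamoxifen 57 14.7 16.57 18.4 24.86 25.47
066_Raffinose 66 13.77 14.2 20.47 22.94 25.35
084_Sphingomyelin 84 14.24 12.82 15.96 22.11 24.46
095_Azadirachtin 95 5.58 7.72 24.18 26.84 25.21
113_Taxol 113 5.44 6.81 24.58 29.14 nan
"""


def column_means(table: str) -> dict:
    """Return the mean speedup per functional, ignoring NaN entries."""
    columns = {name: [] for name in FUNCTIONALS}
    for line in table.strip().splitlines():
        fields = line.split()
        for name, raw in zip(FUNCTIONALS, fields[2:]):
            value = float(raw)  # float("nan") parses to NaN
            if not math.isnan(value):
                columns[name].append(value)
    return {name: fmean(values) for name, values in columns.items()}


means = column_means(TABLE)
for name in FUNCTIONALS:
    print(f"{name:8s} {means[name]:.2f}")
```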

Find more benchmarks in gpu4pyscf/benchmarks