Allow for different DFT grid sizes in qcxms.in [improvement]

tobigithub commented 3 years ago

Hi, for DFT calculations with QCxMS it would be nice to allow for different grid settings in the qcxms.in settings file.

Reason: the DFT grid size controls accuracy, but also speed. For speedier calculations one might first try a coarse DFT grid, which is fast but leads to larger errors. For more accurate calculations one could select a finer DFT grid, which is more accurate but also significantly slower.

Relevant: https://cen.acs.org/physical-chemistry/computational-chemistry/Density-functional-theory-error-discovered/97/web/2019/07

Ref: Popular Integration Grids Can Result in Large Errors in DFT-Computed Free Energies https://doi.org/10.26434/chemrxiv.8864204.v5

For TurboMole, the grid is set in the control file via gridsize 1..6 or gridsize m3..m5. In QCxMS this is written in QCxMS/src/tm.f90: https://github.com/qcxms/QCxMS/search?q=gridsize+m4

$dft
 functional pbe0
 gridsize m4
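
A minimal sketch of how a configurable grid could be turned into the `$dft` block shown above. This is not the QCxMS implementation (which is Fortran, in src/tm.f90); the function name `dft_block` and the set of accepted values are assumptions taken from the grid names mentioned in this issue.

```python
# Hypothetical helper: build a TurboMole $dft block for a chosen grid.
# The allowed values mirror the ones named in this issue (1..6, m3..m5);
# check the TurboMole manual for the grids your version accepts.
ALLOWED_GRIDS = {"1", "2", "3", "4", "5", "6", "m3", "m4", "m5"}


def dft_block(functional: str, gridsize: str = "m4") -> str:
    """Return the text of a $dft block with the requested grid."""
    grid = str(gridsize).strip().lower()
    if grid not in ALLOWED_GRIDS:
        raise ValueError(f"unsupported gridsize: {gridsize!r}")
    return f"$dft\n functional {functional}\n gridsize {grid}\n"


print(dft_block("pbe0", "m4"))
```

Defaulting to m4 keeps the current behaviour described in the thread, so a user-supplied grid would only be an opt-in change.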

In qcxms.in I used the following settings:

tmol
pbe0
def2-SVP
cid
elab 40
maxcoll 3
noesi

For ORCA this would be

! defgrid1
! defgrid2 (default).
! defgrid3
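
As a rough illustration, the ORCA keyword line could be assembled from a grid level in the same way. The helper name `orca_keyword_line` is hypothetical, and only the three defgrid levels listed above are accepted.

```python
# Hypothetical helper: build an ORCA simple-input line with a defgrid level.
# Only defgrid1..defgrid3 (as listed above) are accepted; defgrid2 is the default.
def orca_keyword_line(functional: str, basis: str, defgrid: int = 2) -> str:
    """Return a '! functional basis defgridN' keyword line."""
    if defgrid not in (1, 2, 3):
        raise ValueError(f"defgrid must be 1, 2 or 3, got {defgrid!r}")
    return f"! {functional} {basis} defgrid{defgrid}"


print(orca_keyword_line("PBE0", "def2-SVP", defgrid=3))
```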

Tobias

tobigithub commented 3 years ago

I did some quick tests, just for ground state optimization. For small grid sizes ("1") the SCF fails, and of course for larger grid sizes ("6") it takes longer. The TurboMole handbook states:

Possible grids are 1–5 and m3–m5 where grid 1 is coarse (least accurate) and 5 most dense.
We recommend however the use of so-called multiple grids m3–m5: SCF iterations with
grid 1–3, final energy and gradient with grid 3–5. Usually m3 is fine: for large or delicate
systems, try m4. For a reference calculation with a very fine grid and very tight thresholds
use ’reference’ as grid specification instead of ’gridsize xy’.

Quick grid bench

Gridsize 1  

total energy      =   -153.98754302730      
------------------------------------------      
kinetic energy    =    152.75483287462      
potential energy  =   -306.74237590192      
OPTIMIZATION DID NOT CONVERGE WITHIN 51 CYCLES      

Gridsize 6      

total energy      =   -153.98724065135      
------------------------------------------      
kinetic energy    =    152.73745869349      
potential energy  =   -306.72469934484      

         total  cpu-time :  11.17 seconds       
         total wall-time :   0.72 seconds

For the multiple grids m3–m5, the difference in execution time is around 50%:

MINIX basis set 21 functions
functional wb97x

gridsize m3

total energy      =   -153.98748253046
------------------------------------------
kinetic energy    =    152.74197369073
potential energy  =   -306.72945622119

         total  cpu-time :   3.76 seconds
         total wall-time :   0.28 seconds

gridsize m4

total energy      =   -153.98721666957
------------------------------------------
kinetic energy    =    152.73531636975
potential energy  =   -306.72253303932

         total  cpu-time :   4.65 seconds
         total wall-time :   0.31 seconds

gridsize m5

total energy      =   -153.98723754061
------------------------------------------
kinetic energy    =    152.73873215133
potential energy  =   -306.72596969194

         total  cpu-time :   6.60 seconds
         total wall-time :   0.44 seconds
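
To put the m3–m5 numbers side by side, here is a small sketch that takes m5 as the reference and reports the energy difference and the timing ratios. The energies and timings are copied from the output above; the choice of m5 as reference and the names `runs` and `compare` are mine. These are single quick runs, so the ratios are only indicative.

```python
# Energies (Hartree) and timings (seconds) copied from the benchmark above.
HARTREE_TO_KJ_PER_MOL = 2625.4996  # standard conversion factor

runs = {
    "m3": {"energy": -153.98748253046, "cpu_s": 3.76, "wall_s": 0.28},
    "m4": {"energy": -153.98721666957, "cpu_s": 4.65, "wall_s": 0.31},
    "m5": {"energy": -153.98723754061, "cpu_s": 6.60, "wall_s": 0.44},
}


def compare(runs, reference="m5"):
    """Energy difference (kJ/mol) and time ratios relative to the reference grid."""
    ref = runs[reference]
    rows = {}
    for name, run in runs.items():
        rows[name] = {
            "dE_kj_mol": (run["energy"] - ref["energy"]) * HARTREE_TO_KJ_PER_MOL,
            "cpu_ratio": run["cpu_s"] / ref["cpu_s"],
            "wall_ratio": run["wall_s"] / ref["wall_s"],
        }
    return rows


rows = compare(runs)
for name, row in rows.items():
    print(f"{name}: dE = {row['dE_kj_mol']:+.3f} kJ/mol, "
          f"cpu x{row['cpu_ratio']:.2f}, wall x{row['wall_ratio']:.2f}")
```

With these numbers, m3 needs a bit under 60% of the m5 CPU time and differs from m5 by well under 1 kJ/mol in total energy.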

Since DFT is expensive, I was not able to run more quick tests. The source code (src/tm.f90) states:

! now the setup: hybrid func. b3lyp, grid m4 (savings with m3 are marginal
! and m3 grid with fermi smearing produces a noisy  gradient!). Real savings
! come with rij, extol 2.500, and scfonv 6. CAB 6.10.15
! gridsize m3 for GGAs - tests June 2016 CAB

I think it's still worthwhile to do, especially with all the recent improvements in TurboMole and ORCA!

JayTheDog commented 3 years ago

For ORCA, the new grid options can be implemented together with a version switch: orca4 for ORCA 4.0 and higher, or orca5 for the new ORCA 5.0 [default: orca5]. This works, but is not yet uploaded.

For TURBOMOLE, the tm.f90 file reads: ! now the setup: hybrid func. b3lyp, grid m4 (savings with m3 are marginal ! and m3 grid with fermi smearing produces a noisy gradient!). Real savings ! come with rij, extol 2.500, and scfonv 6. CAB 6.10.15 ! gridsize m3 for GGAs - tests June 2016 CAB

That is, the grid sizes were checked and it was found that the speed-up does not come from the grid size. So I'm not sure how useful this implementation would be.

JayTheDog commented 3 years ago

Fixed in version 5.1.3 for ORCA. For Turbomole this is not needed.

qcxms / QCxMS

Allow for different DFT grid sizes in qcxms.in [improvement] #18