MRChemSoft / mrchem

MultiResolution Chemistry
GNU Lesser General Public License v3.0

Node based DFT #475

Closed gitpeterwind closed 6 months ago

gitpeterwind commented 9 months ago

The loop over nodes is now the outer loop in MRDFT. For large systems run with MPI, this removes the memory-intensive intermediate functions (mostly derivatives) and, as a by-product, is also much faster. The code is also much simpler: only one extra method in Functional, instead of the 4 case-specific subclasses.
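To illustrate the restructuring, here is a minimal toy sketch, not the MRDFT implementation. `Node`, `Function`, `preprocess`, `evaluate`, and both `potential_*` functions are hypothetical stand-ins, and the toy "derivative" is purely node-local, which real derivatives are not. It contrasts a stage-outer loop, which keeps an intermediate for every node alive at once, with a node-outer loop, which only needs scratch space for one node at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: a "node" is a small block of grid values,
// a "function" is a collection of such nodes.
using Node = std::vector<double>;
using Function = std::vector<Node>;

// Toy preprocessing step: a forward difference inside one node.
Node preprocess(const Node &rho) {
    Node out(rho.size());
    for (std::size_t i = 0; i < rho.size(); ++i) {
        double next = (i + 1 < rho.size()) ? rho[i + 1] : rho[i];
        out[i] = next - rho[i];
    }
    return out;
}

// Toy functional evaluation: combines the density and its "derivative".
Node evaluate(const Node &rho, const Node &grad) {
    assert(rho.size() == grad.size());
    Node out(rho.size());
    for (std::size_t i = 0; i < rho.size(); ++i) {
        out[i] = rho[i] * rho[i] + grad[i] * grad[i];
    }
    return out;
}

// Stage-outer: each stage runs over all nodes, so the full
// intermediate function exists in memory between stages.
Function potential_stage_outer(const Function &rho, std::size_t &peak_nodes) {
    Function grad;
    for (const Node &node : rho) grad.push_back(preprocess(node));
    Function pot;
    for (std::size_t n = 0; n < rho.size(); ++n) {
        pot.push_back(evaluate(rho[n], grad[n]));
    }
    peak_nodes = grad.size(); // intermediate nodes alive at the same time
    return pot;
}

// Node-outer: all stages run for one node before moving to the next,
// so the intermediate lives only for the duration of one iteration.
Function potential_node_outer(const Function &rho, std::size_t &peak_nodes) {
    Function pot;
    pot.reserve(rho.size());
    for (const Node &node : rho) {
        Node grad = preprocess(node); // scratch for this node only
        pot.push_back(evaluate(node, grad));
    }
    peak_nodes = rho.empty() ? 0 : 1;
    return pot;
}
```

Both functions return the same potential; in this toy model the only difference is how many intermediate nodes exist at once.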

Also, the rotation of the SAD initial guess was very slow and had become a bottleneck. A new rotation is implemented, and the time went down from 200 s to 3 s!

With all the "node_xc" changes in mrcpp and mrchem, the code is much more user-friendly. It runs smoothly with 1000 orbitals on Betzy, with no need for special settings at the start to "save" memory. For even larger systems, the O(N^3) terms (diagonalization of the Fock matrix, orthonormalization, localization) become a bottleneck and should be addressed (using ELPA, for example).

Test: valinomycin (300 orbitals) on Betzy, 4 nodes (can now also run on 1 node).

old:

                           Building XC operator
---------------------------------------------------------------------------
 Precision                                  (rel)              1.00000e-05
---------------------------------------------------------------------------
 Compute rho                     46152 nds         1.41 GB        5.37 sec
 Preprocess input               184672 nds         5.64 GB        0.79 sec
 Evaluate functional            230840 nds         7.04 GB       16.43 sec
 Postprocess potential           92336 nds         2.82 GB        1.72 sec
---------------------------------------------------------------------------
                         Wall time: 2.46008e+01 sec

Memory statistics, in GiB: 190.0

new:

                           Building XC operator
---------------------------------------------------------------------------
 Precision                                  (rel)              1.00000e-05
---------------------------------------------------------------------------
 Compute rho                     46152 nds         1.41 GB        5.39 sec
 Make potential                  46168 nds         1.41 GB        3.03 sec
---------------------------------------------------------------------------
                         Wall time: 8.79213e+00 sec

Memory statistics, in GiB: 64.9  (128.6 after only the XC upgrade, 81.8 after aggressive GenNodes cleaning; the rest is due to the Bank upgrade and large memory chunks)
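
For reference, the figures above correspond to a wall-time speedup of roughly 2.8x for building the XC operator and a memory reduction of about 66%. The small helper functions below are hypothetical and not part of mrchem; they only make the arithmetic explicit.

```cpp
#include <cassert>

// Ratio of old to new wall time (values > 1 mean the new code is faster).
double speedup(double old_sec, double new_sec) {
    assert(new_sec > 0.0);
    return old_sec / new_sec;
}

// Percentage of the old value that was saved by the new code.
double percent_saved(double old_val, double new_val) {
    assert(old_val > 0.0);
    return 100.0 * (old_val - new_val) / old_val;
}
```

With the numbers reported above, `speedup(24.6008, 8.79213)` gives the wall-time ratio and `percent_saved(190.0, 64.9)` gives the memory reduction.
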
codecov[bot] commented 6 months ago

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (698e74f) 70.54% compared to head (59a0eb2) 68.66%.

| Files | Patch % | Lines |
|---|---|---|
| src/mrdft/MRDFT.cpp | 58.33% | 10 Missing :warning: |
| src/mrdft/Functional.cpp | 98.96% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #475      +/-   ##
==========================================
- Coverage   70.54%   68.66%   -1.89%
==========================================
  Files         195      194       -1
  Lines       15446    15285     -161
==========================================
- Hits        10896    10495     -401
- Misses       4550     4790     +240
```
