shankar1729 / jdftx

JDFTx: software for joint density functional theory
http://jdftx.org

hybrid functionals #158

Closed sergbuto closed 3 years ago

sergbuto commented 3 years ago

I have tried to use the hybrid functional PBE0 in JDFTx for TiO2 calculations via the 'elec-ex-corr' command. However, it takes a very long time (more than 1000 s per SCF iteration). Is it expected to take that long? Are there any ways to reduce the calculation time?

My input:

include common.in
kpoint-folding 4 4 4
kpoint 0.5 0.5 0.5 1

elec-ex-corr hyb-PBE0

dump End ElecDensity
dump-name totalE.$VAR

lcao-params 100 1e-06 0.001

electronic-SCF energyDiffThreshold 1e-10 \
    residualThreshold 1e-09 eigDiffThreshold 1e-10 nIterations 300

common.in:

lattice Orthorhombic 8.673 8.673 5.593
ion-species SG15/$ID_ONCV_PBE-1.0.upf
elec-cutoff 30

ion O 0.3053 0.3053 0.00 0
ion O -0.3053 -0.3053 0.00 0
ion O 0.8053 0.1947 0.50 0
ion O 0.1947 0.8053 0.50 0
ion Ti 0.00 0.00 0.00 0
ion Ti 0.50 0.50 .50 0

shankar1729 commented 3 years ago

Please also attach a log file. How many cores, and what MPI/thread configuration, are you running it with?

Regardless, the short answer is yes, it can take that long. You may find electronic-minimize to be a little faster than SCF for hybrid functionals.

Best, Shankar

sergbuto commented 3 years ago

Yes, I meant to add this information but forgot. I have 8 cores and run using the 'mpirun -n 8' command. The log file is enclosed. The calculations were stopped at some point by Ctrl-C.

JDFTx 1.6.0

Start date and time: Sun Apr 11 14:07:07 2021
Executable /media/sergei/SSD-2TB/build/jdftx with command-line: -i totalE.in -o totalE.out
Running on hosts (process indices): phy-serg (0-7)
Divided in process groups (process indices): 0 (0) 1 (1) 2 (2) 3 (3) 4 (4) 5 (5) 6 (6) 7 (7)
Resource initialization completed at t[s]: 0.00
Run totals: 8 processes, 8 threads, 0 GPUs

Input parsed successfully to the following command list (including defaults):

basis kpoint-dependent
coords-type Lattice
core-overlap-check vector
coulomb-interaction Periodic
davidson-band-ratio 1.1
dump End ElecDensity
dump-name totalE.$VAR
elec-cutoff 30
elec-eigen-algo Davidson
elec-ex-corr hyb-PBE0
electronic-minimize \
    dirUpdateScheme FletcherReeves \
    linminMethod DirUpdateRecommended \
    nIterations 100 \
    history 15 \
    knormThreshold 0 \
    energyDiffThreshold 1e-08 \
    nEnergyDiff 2 \
    alphaTstart 1 \
    alphaTmin 1e-10 \
    updateTestStepSize yes \
    alphaTreduceFactor 0.1 \
    alphaTincreaseFactor 3 \
    nAlphaAdjustMax 3 \
    wolfeEnergy 0.0001 \
    wolfeGradient 0.9 \
    fdTest no
electronic-scf \
    nIterations 300 \
    energyDiffThreshold 1e-10 \
    residualThreshold 1e-09 \
    mixFraction 0.5 \
    qMetric 0.8 \
    history 10 \
    nEigSteps 2 \
    eigDiffThreshold 1e-10 \
    mixedVariable Density \
    qKerker 0.8 \
    qKappa -1 \
    verbose no \
    mixFractionMag 1.5
exchange-regularization WignerSeitzTruncated
fluid None
fluid-ex-corr (null) lda-PZ
fluid-gummel-loop 10 1.000000e-05
fluid-minimize \
    dirUpdateScheme PolakRibiere \
    linminMethod DirUpdateRecommended \
    nIterations 100 \
    history 15 \
    knormThreshold 0 \
    energyDiffThreshold 0 \
    nEnergyDiff 2 \
    alphaTstart 1 \
    alphaTmin 1e-10 \
    updateTestStepSize yes \
    alphaTreduceFactor 0.1 \
    alphaTincreaseFactor 3 \
    nAlphaAdjustMax 3 \
    wolfeEnergy 0.0001 \
    wolfeGradient 0.9 \
    fdTest no
fluid-solvent H2O 55.338 ScalarEOS \
    epsBulk 78.4 \
    pMol 0.92466 \
    epsInf 1.77 \
    Pvap 1.06736e-10 \
    sigmaBulk 4.62e-05 \
    Rvdw 2.61727 \
    Res 1.42 \
    tauNuc 343133 \
    poleEl 15 7 1
forces-output-coords Positions
ion O 0.305300000000000 0.305300000000000 0.000000000000000 0
ion O -0.305300000000000 -0.305300000000000 0.000000000000000 0
ion O 0.805300000000000 0.194700000000000 0.500000000000000 0
ion O 0.194700000000000 0.805300000000000 0.500000000000000 0
ion Ti 0.000000000000000 0.000000000000000 0.000000000000000 0
ion Ti 0.500000000000000 0.500000000000000 0.500000000000000 0
ion-species SG15/$ID_ONCV_PBE-1.0.upf
ion-width 0
ionic-minimize \
    dirUpdateScheme L-BFGS \
    linminMethod DirUpdateRecommended \
    nIterations 0 \
    history 15 \
    knormThreshold 0.0001 \
    energyDiffThreshold 1e-06 \
    nEnergyDiff 2 \
    alphaTstart 1 \
    alphaTmin 1e-10 \
    updateTestStepSize yes \
    alphaTreduceFactor 0.1 \
    alphaTincreaseFactor 3 \
    nAlphaAdjustMax 3 \
    wolfeEnergy 0.0001 \
    wolfeGradient 0.9 \
    fdTest no
kpoint 0.500000000000 0.500000000000 0.500000000000 1.00000000000000
kpoint-folding 4 4 4
latt-move-scale 1 1 1
latt-scale 1 1 1
lattice Orthorhombic 8.673 8.673 5.593
lattice-minimize \
    dirUpdateScheme L-BFGS \
    linminMethod DirUpdateRecommended \
    nIterations 0 \
    history 15 \
    knormThreshold 0 \
    energyDiffThreshold 1e-06 \
    nEnergyDiff 2 \
    alphaTstart 1 \
    alphaTmin 1e-10 \
    updateTestStepSize yes \
    alphaTreduceFactor 0.1 \
    alphaTincreaseFactor 3 \
    nAlphaAdjustMax 3 \
    wolfeEnergy 0.0001 \
    wolfeGradient 0.9 \
    fdTest no
lcao-params 100 1e-06 0.001
pcm-variant GLSSA13
spintype no-spin
subspace-rotation-factor 1 yes
symmetries automatic
symmetry-threshold 0.0001

---------- Setting up symmetries ----------

Found 16 point-group symmetries of the bravais lattice
Found 16 space-group symmetries with basis
Applied RMS atom displacement 1.38982e-16 bohrs to make symmetries exact.

---------- Initializing the Grid ----------
R =
[ 8.673 0 0 ]
[ 0 8.673 0 ]
[ 0 0 5.593 ]
unit cell volume = 420.711
G =
[ 0.724454 0 0 ]
[ 0 0.724454 0 ]
[ 0 0 1.1234 ]
Minimum fftbox size, Smin = [ 44 44 28 ]
Chosen fftbox size, S = [ 48 48 28 ]

---------- Exchange Correlation functional ----------
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.
Will include 0.25 x exact exchange.

---------- Setting up pseudopotentials ----------
Width of ionic core gaussian charges (only for fluid interactions / plotting) set to 0

Reading pseudopotential file '/media/sergei/SSD-2TB/build/pseudopotentials/SG15/O_ONCV_PBE-1.0.upf':
  'O' pseudopotential, 'PBE' functional
  Generated using ONCVPSP code by D. R. Hamann
  Author: Martin Schlipf and Francois Gygi Date: 150915.
  6 valence electrons, 2 orbitals, 4 projectors, 936 radial grid points, with lMax = 1
  Transforming local potential to a uniform radial grid of dG=0.02 with 1465 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 528 points.
    2S l: 0 occupation: 2.0 eigenvalue: -0.880572
    2P l: 1 occupation: 4.0 eigenvalue: -0.331869
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 528 points.
  Core radius for overlap checks: 1.29 bohrs.
  Reading pulay file /media/sergei/SSD-2TB/build/pseudopotentials/SG15/O_ONCV_PBE-1.0.pulay ... using dE_dnG = -8.209514e-03 computed for Ecut = 30.

Reading pseudopotential file '/media/sergei/SSD-2TB/build/pseudopotentials/SG15/Ti_ONCV_PBE-1.0.upf':
  'Ti' pseudopotential, 'PBE' functional
  Generated using ONCVPSP code by D. R. Hamann
  Author: Martin Schlipf and Francois Gygi Date: 150915.
  12 valence electrons, 4 orbitals, 6 projectors, 1178 radial grid points, with lMax = 2
  Transforming local potential to a uniform radial grid of dG=0.02 with 1465 points.
  Transforming nonlocal projectors to a uniform radial grid of dG=0.02 with 528 points.
    3S l: 0 occupation: 2.0 eigenvalue: -2.301740
    3P l: 1 occupation: 6.0 eigenvalue: -1.428110
    4S l: 0 occupation: 2.0 eigenvalue: -0.164159
    3D l: 2 occupation: 2.0 eigenvalue: -0.156511
  Transforming atomic orbitals to a uniform radial grid of dG=0.02 with 528 points.
  WARNING: large normalization error in atomic orbital 1s (integral: 0.000000).
  Core radius for overlap checks: 2.07 bohrs.
  Reading pulay file /media/sergei/SSD-2TB/build/pseudopotentials/SG15/Ti_ONCV_PBE-1.0.pulay ... using dE_dnG = -1.641951e-05 computed for Ecut = 30.

Initialized 2 species with 6 total atoms.

Folded 1 k-points by 4x4x4 to 64 k-points.

---------- Setting up k-points, bands, fillings ----------
Reduced to 6 k-points under symmetry.
Computing the number of bands and number of electrons
Calculating initial fillings.
nElectrons: 48.000000 nBands: 24 nStates: 6

----- Setting up reduced wavefunction bases (one per k-point) -----
average nbasis = 3300.750 , ideal nbasis = 3301.866

----- Initializing Supercell corresponding to k-point mesh -----
Lattice vector linear combinations in supercell:
[ 4 0 0 ]
[ 0 4 0 ]
[ 0 0 4 ]
Supercell lattice vectors:
[ 34.692 0 0 ]
[ 0 34.692 0 ]
[ 0 0 22.372 ]

-------- Setting up exchange kernel --------
Creating Wigner-Seitz truncated kernel on k-point supercell with sample count [ 200 200 120 ]
Constructing Wigner-Seitz cell: 6 faces (6 quadrilaterals, 0 hexagons)
Gaussian width for range separation: 1.17866 bohrs.
FFT grid for long-range part: [ 200 200 120 ].
Planning fourier transform ... Done.
Computing truncated long-range part in real space ... Done.
Adding short-range part in reciprocal space ... Done.
Splitting supercell kernel to unit-cell with k-points ... Done.

---------- Setting up exact exchange ----------
Optimizing transforms to minimize k-point pairs ... done (320 steps).
Reduced 4096 k-pairs to 964 under symmetries.
Per-iteration cost relative to semi-local calculation ~ 1000
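(Annotation, not part of the log.) The k-pair counts reported in this section follow from the k-point mesh. A minimal Python sketch of that arithmetic is below; the function name `kpair_counts` is made up for this note, and how JDFTx arrives at the ~1000 cost estimate is not shown in the log and is not reproduced here.

```python
def kpair_counts(folding, reduced_pairs):
    """Return (k-points in the full mesh, k-point pairs, symmetry reduction factor)."""
    n_kpoints = folding[0] * folding[1] * folding[2]
    total_pairs = n_kpoints ** 2  # every k-point is paired with every k-point
    return n_kpoints, total_pairs, total_pairs / reduced_pairs

# Values taken from the log: kpoint-folding 4 4 4, 964 pairs left after symmetry reduction
n_kpoints, total_pairs, reduction = kpair_counts((4, 4, 4), 964)
print(n_kpoints, total_pairs, round(reduction, 2))
```

So symmetry reduces the pair count by a factor of a little over four, but several hundred k-point pairs still have to be evaluated.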

---------- Setting up ewald sum ----------
Optimum gaussian width for ewald sums = 2.217580 bohr.
Real space sum over 891 unit cells with max indices [ 4 4 5 ]
Reciprocal space sum over 2475 terms with max indices [ 7 7 5 ]

---------- Allocating electronic variables ----------
Initializing wave functions: linear combination of atomic orbitals
Initializing semi-local functional for LCAO:
Initalized PBE GGA exchange.
Initalized PBE GGA correlation.
O pseudo-atom occupations: s ( 2 ) p ( 4 )
Ti pseudo-atom occupations: s ( 2 2 ) p ( 6 ) d ( 2 )
FillingsUpdate: mu: +0.897973172 nElectrons: 48.000000
LCAOMinimize: Iter: 0 Etot: -160.2241745621481641 |grad|_K: 6.845e-02 alpha: 1.000e+00
FillingsUpdate: mu: +0.678899919 nElectrons: 48.000000
LCAOMinimize: Iter: 1 Etot: -166.4405485141112138 |grad|_K: 2.349e-02 alpha: 5.664e-01 linmin: 1.101e-01 cgtest: -2.554e+00 t[s]: 1.39
LCAOMinimize: Bad step direction: g.d > 0.
LCAOMinimize: Undoing step.
LCAOMinimize: Step failed: resetting search direction.
FillingsUpdate: mu: +0.678899919 nElectrons: 48.000000
LCAOMinimize: Iter: 2 Etot: -166.4405485141112138 |grad|_K: 2.349e-02 alpha: 0.000e+00
FillingsUpdate: mu: +0.643743374 nElectrons: 48.000000
LCAOMinimize: Iter: 3 Etot: -167.5471500868069938 |grad|_K: 3.788e-02 alpha: 1.374e-01 linmin: -9.440e-02 cgtest: 5.039e-01 t[s]: 1.66
FillingsUpdate: mu: +0.466038339 nElectrons: 48.000000
LCAOMinimize: Iter: 4 Etot: -170.9546032833266054 |grad|_K: 2.759e-02 alpha: 8.831e-02 linmin: -2.710e-01 cgtest: 5.581e-01 t[s]: 1.82
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 2.649318e-01.
FillingsUpdate: mu: +0.409452933 nElectrons: 48.000000
LCAOMinimize: Iter: 5 Etot: -171.5529167971359641 |grad|_K: 2.087e-02 alpha: 4.932e-02 linmin: -2.607e-01 cgtest: 8.264e-01 t[s]: 2.03
LCAOMinimize: Encountered beta<0, resetting CG.
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 1.479663e-01.
LCAOMinimize: Step increased Etot by 1.333021e-01, reducing alpha to 2.962671e-02.
FillingsUpdate: mu: +0.395639775 nElectrons: 48.000000
LCAOMinimize: Iter: 6 Etot: -171.6557587653718713 |grad|_K: 1.877e-02 alpha: 2.963e-02 linmin: -2.624e-01 cgtest: 9.606e-01 t[s]: 2.34
LCAOMinimize: Encountered beta<0, resetting CG.
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 8.888013e-02.
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 2.666404e-01.
FillingsUpdate: mu: +0.350672906 nElectrons: 48.000000
LCAOMinimize: Iter: 7 Etot: -171.9578375073095629 |grad|_K: 4.495e-03 alpha: 1.452e-01 linmin: -3.485e-01 cgtest: 8.222e-01 t[s]: 2.62
LCAOMinimize: Encountered beta<0, resetting CG.
FillingsUpdate: mu: +0.364851357 nElectrons: 48.000000
LCAOMinimize: Iter: 8 Etot: -171.9705105397992213 |grad|_K: 3.873e-04 alpha: 1.537e-01 linmin: 2.999e-01 cgtest: -6.978e-01 t[s]: 2.78
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 4.611959e-01.
FillingsUpdate: mu: +0.365285736 nElectrons: 48.000000
LCAOMinimize: Iter: 9 Etot: -171.9706442853459123 |grad|_K: 3.861e-04 alpha: 5.735e-01 linmin: 8.253e-04 cgtest: 8.149e-01 t[s]: 2.99
FillingsUpdate: mu: +0.363343450 nElectrons: 48.000000
LCAOMinimize: Iter: 10 Etot: -171.9707307469007276 |grad|_K: 5.427e-05 alpha: 1.460e-01 linmin: 2.669e-02 cgtest: -2.190e-01 t[s]: 3.15
LCAOMinimize: Predicted alpha/alphaT>3.000000, increasing alphaT to 4.380909e-01.
FillingsUpdate: mu: +0.363248276 nElectrons: 48.000000
LCAOMinimize: Iter: 11 Etot: -171.9707396381396904 |grad|_K: 1.725e-05 alpha: 7.857e-01 linmin: -1.404e-03 cgtest: 2.557e-01 t[s]: 3.37
FillingsUpdate: mu: +0.363348348 nElectrons: 48.000000
LCAOMinimize: Iter: 12 Etot: -171.9707398922980701 |grad|_K: 9.916e-06 alpha: 2.074e-01 linmin: -4.245e-04 cgtest: -1.265e-05 t[s]: 3.53
FillingsUpdate: mu: +0.363334006 nElectrons: 48.000000
LCAOMinimize: Iter: 13 Etot: -171.9707400577940177 |grad|_K: 1.727e-06 alpha: 4.094e-01 linmin: 1.947e-04 cgtest: -7.711e-03 t[s]: 3.69
LCAOMinimize: Converged (|Delta Etot|<1.000000e-06 for 2 iters).

---- Citations for features of the code used in this run ----

Software package: R. Sundararaman, K. Letchworth-Weaver, K.A. Schwarz, D. Gunceler, Y. Ozhabes and T.A. Arias, 'JDFTx: software for joint density-functional theory', SoftwareX 6, 278 (2017)

hyb-PBE0 exchange-correlation functional: M. Ernzerhof and G. E. Scuseria, J. Chem. Phys. 110, 5029 (1999)

Pseudopotentials: M Schlipf and F Gygi, Comput. Phys. Commun. 196, 36 (2015)

Wigner-Seitz truncated method for exact exchange: R. Sundararaman and T.A. Arias, Phys. Rev. B 87, 165122 (2013)

gga-PBE exchange-correlation functional: J.P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996)

This list may not be complete. Please suggest additional citations or report any other bugs at https://github.com/shankar1729/jdftx/issues

Initialization completed successfully at t[s]: 3.70

-------- Electronic minimization -----------
Will mix electronic density at each iteration.
SCF: Cycle: 0 Etot: -178.947364496766085 dEtot: -7.572e+00 |Residual|: 7.338e-01 |deigs|: 3.101e-01 t[s]: 1488.31
SCF: Cycle: 1 Etot: -179.489348320415218 dEtot: -5.420e-01 |Residual|: 3.761e-01 |deigs|: 6.769e-02 t[s]: 2486.33
SCF: Cycle: 2 Etot: -179.536736222504430 dEtot: -4.739e-02 |Residual|: 1.717e-01 |deigs|: 1.413e-02 t[s]: 3487.68
SCF: Cycle: 3 Etot: -179.555459498034054 dEtot: -1.872e-02 |Residual|: 5.688e-02 |deigs|: 2.448e-02 t[s]: 4495.66
SCF: Cycle: 4 Etot: -179.557511455980148 dEtot: -2.052e-03 |Residual|: 5.557e-02 |deigs|: 1.423e-03 t[s]: 5507.34
SCF: Cycle: 5 Etot: -179.536449225465788 dEtot: +2.106e-02 |Residual|: 1.762e-01 |deigs|: 2.455e-02 t[s]: 6514.30
SCF: Cycle: 6 Etot: -179.551403267170770 dEtot: -1.495e-02 |Residual|: 7.717e-02 |deigs|: 1.658e-02 t[s]: 7518.24
SCF: Cycle: 7 Etot: -179.555772854817434 dEtot: -4.370e-03 |Residual|: 5.818e-02 |deigs|: 1.478e-03 t[s]: 8524.13
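(Annotation, not part of the log.) The time per SCF cycle can be read off the `t[s]` stamps in the SCF lines above. Below is a minimal Python sketch that extracts them and takes differences; the helper name `seconds_per_cycle` is hypothetical, and the embedded log lines are trimmed to the fields the sketch uses.

```python
import re

# SCF lines from the log above, trimmed to the Etot and t[s] fields
LOG = """\
SCF: Cycle: 0 Etot: -178.947364496766085 t[s]: 1488.31
SCF: Cycle: 1 Etot: -179.489348320415218 t[s]: 2486.33
SCF: Cycle: 2 Etot: -179.536736222504430 t[s]: 3487.68
SCF: Cycle: 3 Etot: -179.555459498034054 t[s]: 4495.66
SCF: Cycle: 4 Etot: -179.557511455980148 t[s]: 5507.34
SCF: Cycle: 5 Etot: -179.536449225465788 t[s]: 6514.30
SCF: Cycle: 6 Etot: -179.551403267170770 t[s]: 7518.24
SCF: Cycle: 7 Etot: -179.555772854817434 t[s]: 8524.13
"""

def seconds_per_cycle(log_text):
    """Differences between the t[s] stamps of consecutive SCF lines."""
    stamps = [
        float(t)
        for t in re.findall(r"^SCF: Cycle:.*?t\[s\]:\s*([0-9.]+)", log_text, flags=re.MULTILINE)
    ]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]

deltas = seconds_per_cycle(LOG)
mean_delta = sum(deltas) / len(deltas)
print(f"{len(deltas)} intervals, mean {mean_delta:.1f} s per SCF cycle")
```

Cycle 0 is left out of the average because its stamp also covers everything between the end of initialization (t[s]: 3.70) and the first SCF report.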

shankar1729 commented 3 years ago

We actually implemented significant optimizations to EXX last year after the 1.6.0 release, including the ACE acceleration technique. Please use the latest version from git instead.

I did a quick benchmark with your input file, and it finishes one ionic step within half an hour on 12 cores. Note that with the ACE method, you will have an inner SCF loop, followed by an outer Vxx convergence loop, so the comparison cannot be per SCF step.
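To illustrate only the loop structure described here, below is a toy Python sketch of a nested iteration: an inner loop converges with the expensive term frozen, and an outer loop refreshes that term until it stops changing. This is not ACE and not JDFTx code. The scalar equation and every name in it (`expensive_term`, `inner_loop`, `nested_solve`) are assumptions chosen so the example runs on its own.

```python
import math

def expensive_term(x):
    """Stand-in for the costly piece (the role exact exchange plays in the thread)."""
    return 0.1 * math.sin(x)

def inner_loop(v_frozen, x, tol=1e-12, max_iter=200):
    """Converge x = 0.5*cos(x) + v_frozen with the expensive term held fixed."""
    for n in range(1, max_iter + 1):
        x_new = 0.5 * math.cos(x) + v_frozen
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("inner loop did not converge")

def nested_solve(x=0.0, outer_tol=1e-10, max_outer=50):
    """Outer loop: refresh the expensive term only after the inner loop converges."""
    v = expensive_term(x)
    inner_steps = 0
    for outer in range(1, max_outer + 1):
        x, n = inner_loop(v, x)
        inner_steps += n
        v_new = expensive_term(x)
        if abs(v_new - v) < outer_tol:
            return x, outer, inner_steps
        v = v_new
    raise RuntimeError("outer loop did not converge")

x, outer_steps, inner_steps = nested_solve()
print(f"x = {x:.8f}: {outer_steps} outer steps, {inner_steps} inner steps")
```

In this toy, the expensive term is evaluated once per outer step rather than once per inner step, which is why counting inner steps alone says little about total cost.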

Best, Shankar