QMCPACK / qmcpack

Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
http://www.qmcpack.org

Ion-ion energy convergence in quasi-2D systems #4546

Open kayahans opened 1 year ago

kayahans commented 1 year ago

Describe the bug
The reference ion-ion energy of a quasi-2D system (bilayer graphene, 3x3x1 tiling at gamma) converges to an incorrect value at a bilayer separation of 7 Angstroms.

Because the reference and QMCPACK ion-ion energies do not match, QMCPACK terminates after printing "Checking ion-ion Ewald energy against reference". I used the ewald lr_handler with kc cutoffs increasing up to 100, but the mismatch persists. The ion-ion energies below were obtained with ccECP potentials and the LDA functional:

kc_cutoff   QMCPACK           Reference         QE
50          1131.3290657604   1131.2315948714   1131.32906586
75          1131.3290657129   1131.2315948714   1131.32906586
100         1131.3290656824   1131.2315948714   1131.32906586

QE prints the energy in Ry for the primitive cell, so the QE values are multiplied by 9/2 for the 3x3x1 tiled supercell: a factor of 9 for the nine primitive cells and a factor of 1/2 to convert Ry to Ha.
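The 9/2 adjustment can be written out as a short Python sketch. The function name and constants here are illustrative and are not part of QMCPACK or QE. The primitive-cell value is back-computed from the supercell number in the table, not read from QE output.

```python
RY_TO_HA = 0.5  # 1 Rydberg = 0.5 Hartree


def qe_primitive_ry_to_supercell_ha(e_ry_primitive, tiling=(3, 3, 1)):
    """Scale a per-primitive-cell energy in Ry to a tiled supercell energy in Ha."""
    n_cells = tiling[0] * tiling[1] * tiling[2]  # 9 primitive cells for 3x3x1
    return e_ry_primitive * n_cells * RY_TO_HA


# Supercell value (Ha) from the QE column of the table.
qe_supercell_ha = 1131.32906586
# Back-computed primitive-cell value in Ry (derived here, not taken from QE output).
qe_primitive_ry = qe_supercell_ha / (9 * RY_TO_HA)

# QMCPACK (kc_cutoff = 50) and reference values from the table.
qmcpack_kc50 = 1131.3290657604
reference = 1131.2315948714

print(f"conversion factor: {qe_primitive_ry_to_supercell_ha(1.0)}")
print(f"QMCPACK   - QE: {qmcpack_kc50 - qe_supercell_ha:.3e} Ha")
print(f"Reference - QE: {reference - qe_supercell_ha:.3e} Ha")
```

With the table values, the QMCPACK and QE numbers differ by less than 1e-6 Ha, while the reference differs from both by roughly 0.097 Ha.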

In the legacy code, the error was printed at every run I tested:

ERROR in ion-ion Ewald energy exceeds 0.0003 Ha/atom tolerance.

  Reference ion-ion energy: 1131.2315948714
  QMCPACK   ion-ion energy: 1131.3290657604
            ion-ion diff  : 0.097470889017131
            diff/atom     : 0.0027075246949203
            tolerance     : 0.0003
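The numbers in this message can be reproduced with a minimal Python sketch of a per-atom tolerance check. This is not QMCPACK's actual implementation, and the function name is hypothetical. The atom count of 36 is an assumption: it is inferred from the ratio of the printed diff to diff/atom, and is consistent with 4 atoms per bilayer graphene primitive cell times 9 cells.

```python
def check_ion_ion_energy(e_reference, e_code, n_atoms, tol_per_atom=3e-4):
    """Return (diff, diff_per_atom, passed) for a per-atom tolerance check in Ha."""
    diff = abs(e_code - e_reference)
    diff_per_atom = diff / n_atoms
    return diff, diff_per_atom, diff_per_atom <= tol_per_atom


# Values copied from the error message; n_atoms = 36 is an inferred assumption.
diff, diff_per_atom, passed = check_ion_ion_energy(
    e_reference=1131.2315948714,
    e_code=1131.3290657604,
    n_atoms=36,
)

print(f"ion-ion diff : {diff:.10f}")
print(f"diff/atom    : {diff_per_atom:.10f}")
print(f"within tol   : {passed}")
```

The per-atom difference is about nine times the 0.0003 Ha/atom tolerance, so a larger kc cutoff alone, which changes the QMCPACK value only in the eighth decimal place, cannot close the gap.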

However, there were instances where the error was not printed in the batched code (see the dmc folder in the attached files).

To Reproduce
Steps to reproduce the behavior:

  1. Use QMCPACK 3.15.9 and QE 7.0.
  2. Use the complex legacy CPU and batched CPU variants.
  3. Full program/test invocation command: srun -N 4 -c 32 --cpu-bind=cores -n 4 qmcpack_complex vmc.in.xml
  4. Additional steps: None

Expected behavior
The reference values in the table should match the QE and QMCPACK ion-ion values within tolerance as the kc cutoff increases. The batched code should also print the error on every run in which it is encountered.

System:

Additional context
Input/output files: ewald_sum.zip

ye-luo commented 1 year ago

The printing issue has nothing to do with the drivers. The error is printed while parsing the Coulomb input. Because only rank 0 prints, another rank may hit the error first and terminate all ranks before rank 0 prints. Changing to UniformCommunicateError addresses the issue. Please test out the fix.

jtkrogel commented 1 year ago

The printing part is secondary to the main issue here. The main problem is that the reference energy is incorrect (likely due to premature termination of the sum). This has not been fixed, right?