qsimulate-open / bagel

Brilliantly Advanced General Electronic-structure Library
GNU General Public License v3.0

XMS-CASPT2 gradient parallelization bug - not reproduced #191

Closed nomotoatsu closed 2 years ago

nomotoatsu commented 4 years ago

Hi,

I calculated the analytical gradient with XMS-CASPT2 (specifically: 4 electrons, 4 orbitals, 4 states, MS-MR scheme, 0.1 imaginary shift, 23 atoms) and got slightly different results from run to run (error: 1-2 %, ~0.001 Hartree/Bohr) with the same input (input and output files attached below). Is there a way to avoid this?

To be more precise, I believe the CASSCF and CASPT2 results are the same in every run, because "Permanent dipole moment: CASPT2 unrelaxed" was the same. In contrast, "Permanent dipole moment: CASPT2 relaxed" varied by ~1 %, so some part of the PT2 Lagrangian calculation must have gone wrong.

Thanks in advance for your help.

INPUT:

    { "bagel" : [

    { "title" : "molecule",
      "basis" : "cc-pvdz",
      "df_basis" : "cc-pvdz-jkfit",
      "angstrom" : false,
      "geometry" : [
        { "atom" : "C", "xyz" : [-0.02145406, 0.00746253, -0.22019465] },
        { "atom" : "H", "xyz" : [-0.04143035, 0.00933147, 1.82790549] },
        { "atom" : "C", "xyz" : [2.43529366, 0.00713938, -1.40829562] },
        { "atom" : "H", "xyz" : [2.68003207, 0.01016862, -3.44084332] },
        { "atom" : "C", "xyz" : [4.69968617, 0.00183492, 0.15446998] },
        { "atom" : "O", "xyz" : [4.7131259, -0.00180847, 2.45040769] },
        { "atom" : "O", "xyz" : [6.86113775, 0.00204468, -1.26339521] },
        { "atom" : "C", "xyz" : [9.17011835, -0.00256436, 0.16777932] },
        { "atom" : "H", "xyz" : [9.29425067, -1.68270462, 1.36773076] },
        { "atom" : "H", "xyz" : [10.68927852, -0.00196154, -1.22690271] },
        { "atom" : "H", "xyz" : [9.29758981, 1.67357724, 1.37296908] },
        { "atom" : "C", "xyz" : [-2.29776885, 0.00474132, -1.47163167] },
        { "atom" : "C", "xyz" : [-2.61282586, 0.00091841, -4.19834187] },
        { "atom" : "C", "xyz" : [-4.67698489, 0.00496431, 0.01886324] },
        { "atom" : "C", "xyz" : [-5.00898518, -0.00338072, -5.23892326] },
        { "atom" : "H", "xyz" : [-0.96385285, 0.00184626, -5.41566367] },
        { "atom" : "C", "xyz" : [-7.01696456, 0.00100155, -1.02418046] },
        { "atom" : "C", "xyz" : [-7.28960477, -0.00470353, -3.67132753] },
        { "atom" : "H", "xyz" : [-5.20663729, -0.00537816, -7.2856628] },
        { "atom" : "H", "xyz" : [-8.67812441, 0.0011395, 0.19808864] },
        { "atom" : "H", "xyz" : [-9.15395174, -0.00900265, -4.53503435] },
        { "atom" : "O", "xyz" : [-4.39007536, 0.01012326, 2.61803961] },
        { "atom" : "H", "xyz" : [-6.03024791, 0.01026121, 3.42450474] }
      ]
    },

    { "title" : "load_ref",
      "file" : "cas",
      "continue_geom" : false
    },

    { "title" : "casscf",
      "fci_algorithm" : "knowles",
      "nstate" : 4,
      "nact" : 4,
      "nclosed" : 45,
      "active" : [46,47,48,49],
      "chrge" : 0,
      "nspin" : 0,
      "thresh" : 1.0e-8,
      "thresh_micro" : 5.0e-6,
      "maxiter" : 200,
      "maxiter_micro" : 200
    },

    { "title" : "force",
      "target" : 1,
      "maxziter" : 300,
      "maxiter" : 500,
      "method" : [
        { "title" : "caspt2",
          "smith" : {
            "method" : "caspt2",
            "maxiter" : 500,
            "ms" : "true",
            "xms" : "true",
            "sssr" : "false",
            "shift" : 0.1,
            "imag_shift" : true,
            "frozen" : true
          },
          "nstate" : 4,
          "nact" : 4,
          "nclosed" : 45
        }
      ]
    }

    ]}

OUTPUT: bgl_1.log bgl_2.log bgl_3.log

shiozaki commented 4 years ago

You could set a tighter threshold for CASPT2.

Toru
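One way to apply this suggestion without hand-editing the input is sketched below. It is only a sketch: it assumes the CASPT2 convergence threshold is the "thresh" key inside the "smith" block (the key name is mentioned later in this thread, but its placement here is an assumption), and the helper name `tighten_smith_thresh` is hypothetical.

```python
import copy
import json


def tighten_smith_thresh(bagel_input, thresh):
    """Return a copy of a BAGEL input dict with "thresh" set in every "smith" block.

    Assumption: the CASPT2 threshold is read from "smith" -> "thresh".
    """
    updated = copy.deepcopy(bagel_input)
    for block in updated["bagel"]:
        # "force" blocks carry the CASPT2 settings in a nested "method" list.
        for method in block.get("method", []):
            if "smith" in method:
                method["smith"]["thresh"] = thresh
    return updated


# Trimmed-down stand-in for the input posted above (geometry and CASSCF omitted).
example = {"bagel": [
    {"title": "force", "target": 1, "method": [
        {"title": "caspt2", "smith": {"method": "caspt2", "shift": 0.1}}
    ]}
]}

tightened = tighten_smith_thresh(example, 1.0e-10)
print(json.dumps(tightened["bagel"][0]["method"][0]["smith"], indent=2))
```

The original dictionary is left untouched, so the loose and tight variants can be written to separate input files and run side by side.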

nomotoatsu commented 4 years ago

Setting a tighter "thresh" (down to 1.0e-14) didn't help. Tightening "thresh_overlap" didn't help either.

To make matters worse, the reported "Permanent dipole moment: CASPT2 relaxed" and the nuclear energy gradient are sometimes far too large, as shown below (with "thresh" : 1.0e-14):

    43      0.0000000003     0.0000000073      0.56
   - Z-CASSCF solution                        22.13
* Permanent dipole moment: CASPT2 relaxed
       (  -89.372458,    -0.040033,    24.849114) a.u.

nomotoatsu commented 4 years ago

In addition, setting "nstate" to 2 (originally 4) seemed to solve the problem. I ran the same input six times and got nuclear gradients with no significant differences between runs.

Do you have any idea what might cause the error in the nuclear gradients when "nstate" is set to more than 2?
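A repeatability check of this kind can be made quantitative with a few lines of Python. The sketch below assumes the gradient components of each run have already been extracted from the logs into flat lists (parsing the BAGEL output is left out on purpose); the function name `max_spread` and the numbers are purely illustrative and are not taken from the attached logs.

```python
def max_spread(runs):
    """Largest (max - min) across runs, taken component by component."""
    n = len(runs[0])
    if any(len(r) != n for r in runs):
        raise ValueError("all runs must have the same number of components")
    # zip(*runs) groups the same gradient component from every run.
    return max(max(col) - min(col) for col in zip(*runs))


# Illustrative numbers only (Hartree/Bohr), three runs of three components each.
runs = [
    [0.0123, -0.0450, 0.0781],
    [0.0123, -0.0461, 0.0781],
    [0.0124, -0.0450, 0.0790],
]

spread = max_spread(runs)
print(f"max spread between runs: {spread:.4f} Hartree/Bohr")
```

A spread near the convergence threshold would point to ordinary numerical noise, while a spread around 0.001 Hartree/Bohr, as reported here, would not.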

shiozaki commented 4 years ago

Thanks, that information is helpful. One possibility is that the 4th state is nearly degenerate with the 5th state. Can you run a 5-state calculation and report what you see? Now that you mention it, the slow convergence of CASSCF concerns me - the solver is second-order, and it should not linger that way; there must be something going on.

On a side note, I have asked @jwpk1201 (Prof. Jae Woo Park), who wrote the imaginary shift code, to chime in.

Jae Woo: please let us know what you find.

nomotoatsu commented 4 years ago

SA-5-CAS(8e,8o)-PT2 (SSSR) still produces gradients with an error of ~0.001 Hartree/Bohr. (To run a 5-state calculation, I expanded the CAS space. I also tested SA-3/4/5-CAS(8e,8o) and SA-3/4-CAS(6e,6o), in vain.)

I also realized that some other calculation conditions (real and imaginary shift, MS-SR and SS-SR) resulted in the same kind of problem.

shiozaki commented 4 years ago

Any idea @jwpk1201 ?

jwpk1201 commented 4 years ago

Hi @nomotoatsu and Toru,

I tested the input on my cluster without parallelism, and I did not observe this problem. The output shows that 32 processes were running the calculation in parallel, which is beyond the level that we were testing (I tested the code on zinc using ~16 processes with ~5 states long ago, and did not see such an error). I suspect that there might be an MPI synchronization error when evaluating the CASPT2 CI derivative with nstate >= 3. Unfortunately, I could not test SMITH parallelization on my cluster (the SMITH parallelization does not work with my binary, and I was not able to figure out the reason).

@nomotoatsu : could you run your job with one MPI process (with threading) and check whether this happens?

Sincerely, Jae Woo
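For readers who want to try the same test, the sketch below builds a launch command for one MPI process with several threads. Everything specific in it is an assumption rather than something stated in this thread: the binary name `BAGEL`, the launcher `mpirun`, and the use of `OMP_NUM_THREADS` to set the thread count (which variable applies depends on how the binary was built). The helper name `single_process_command` is hypothetical.

```python
import os


def single_process_command(input_file, threads, binary="BAGEL"):
    """Build the command and environment for one MPI rank with `threads` threads.

    Assumptions: `mpirun` is the launcher, `binary` is the BAGEL executable,
    and OMP_NUM_THREADS controls threading for this build.
    """
    env = dict(os.environ)          # copy, so the caller's environment is untouched
    env["OMP_NUM_THREADS"] = str(threads)
    cmd = ["mpirun", "-n", "1", binary, input_file]
    return cmd, env


cmd, env = single_process_command("input.json", threads=16)
print(" ".join(cmd))
```

The returned pair can be passed to `subprocess.run(cmd, env=env)` to launch the job; changing `"1"` back to the original process count gives the parallel run for comparison.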

nomotoatsu commented 4 years ago

The problem of the gradient error was solved by running with one MPI process. I tested the input of SA-3-CAS(8e8o)-PT2 nine times. Thanks a lot for helping me out.

shiozaki commented 4 years ago

Hi Jae Woo @jwpk1201

Let's fix this. I am looking at the logic in

CASPT2::CASPT2::do_rdm_deriv(double factor)

in CASPT2_contract.cc and I really don't understand what's going on. Could you explain to me what these condition branches are?

This cannot be left unfixed.

Also, where else do you think looks suspicious?

shiozaki commented 4 years ago

I just ran the calculation from the beginning of the thread multiple times (with mpirun -n32) and cannot reproduce the error. I get exactly the same results as those obtained from a serial run. Without further information, it is hard to debug this... My environment is:

Ubuntu 18.04, Intel MPI/MKL 2019.4, GCC 7.4.0

shiozaki commented 2 years ago

Since there has been no follow-up, let me close this for now. If there is additional feedback, we can reopen.