Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance portable GPU support
Any chance we are lucky and it works with one MPI task and one thread? That is, is this a parallel logic error or an optimization plumbing error? If the latter, perhaps backflow only, with no Jastrow, works?
Describe the bug
I've been working on deterministic tests for Jastrow (backflow) optimization and found that the deterministic test fails for some numbers of coefficients, with results that differ between machines.
The tests were run on a workstation, Theta, and Summit (CPU) using the develop branch at its latest commit (f46901f01736950c75203f37ec701aa44aa78841) as of May 19, 2021. Only J1 is used in the test.
With 3 parameters, the optimized J1 is identical on all three machines (det_qmc_short_opt_3.in.xml):
Workstation : 1.021369511 0.916486567 0.8654997938
Theta : 1.021369511 0.916486567 0.8654997938
Summit : 1.021369511 0.916486567 0.8654997938
But if the number of coefficients is increased to 5, every machine gives a different result (det_qmc_short_opt_5.in.xml):
Workstation : 2.95646542 -2.478173825 -4.360350896 -3.72100857 -2.081630639
Theta : -1.215235277 -2.132996212 -1.642853768 -1.823485461 -0.9572376071
Summit : Fatal Error. Aborting at Invalid Matrix Diagonalization Function!
Depending on the choice of random seed, the run also fails with different errors that could be related to this issue.
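For anyone who wants to check the numbers quickly, here is a minimal Python sketch that compares the coefficients reported above across machines. It is not part of the QMCPACK test suite: the helper names `machines_agree` and `max_abs_difference` and the `rel_tol=1e-8` tolerance are assumptions made for illustration, and the real deterministic tests may use a different criterion.

```python
import math

# Optimized J1 coefficients copied from the report above.
RESULTS_3 = {
    "Workstation": [1.021369511, 0.916486567, 0.8654997938],
    "Theta": [1.021369511, 0.916486567, 0.8654997938],
    "Summit": [1.021369511, 0.916486567, 0.8654997938],
}
RESULTS_5 = {
    "Workstation": [2.95646542, -2.478173825, -4.360350896,
                    -3.72100857, -2.081630639],
    "Theta": [-1.215235277, -2.132996212, -1.642853768,
              -1.823485461, -0.9572376071],
    "Summit": None,  # run aborted, so no coefficients were produced
}


def machines_agree(results, rel_tol=1e-8):
    """Return True if all machines report the same coefficients.

    rel_tol is a hypothetical tolerance chosen for this sketch.
    A machine with no result (None) counts as a disagreement.
    """
    vectors = list(results.values())
    if any(v is None for v in vectors):
        return False
    reference = vectors[0]
    return all(
        len(v) == len(reference)
        and all(math.isclose(a, b, rel_tol=rel_tol)
                for a, b in zip(v, reference))
        for v in vectors[1:]
    )


def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    print("3 coefficients agree:", machines_agree(RESULTS_3))
    print("5 coefficients agree:", machines_agree(RESULTS_5))
    print("Workstation vs Theta, max |diff|:",
          max_abs_difference(RESULTS_5["Workstation"], RESULTS_5["Theta"]))
```

The 5-coefficient results differ in the leading digits and even in sign, so loosening the tolerance would not make them agree.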
To Reproduce
Run with a single MPI task and a single thread.
archive.zip
Expected behavior
The deterministic test passes.
System: Workstation (Intel Xeon), Theta, Summit