lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Feature/mrhs prep #1437

Closed maddyscientist closed 5 months ago

maddyscientist commented 5 months ago

This is primarily a cleanup PR, paving the way for multi-RHS solvers:

weinbe2 commented 5 months ago

My initial visual review is done and everything looks good! I left a few comments, and I'll take a second look soon just to be safe.

Correctness looks good for MILC spectrum workflows, so between that and ctest coverage I trust the CG solver updates.

I did see some SD (steepest descent) convergence issues when running the domain wall tests under ctest:

         41 - invert_test_mobius_eofa_sym_double (Failed)
         42 - invert_test_mobius_eofa_asym_double (Failed)

The culprit in both cases is SD:

[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_pc_dag_mat_pc_normop_pc_double_l2, where GetParam() = (4, 4, 3, 8, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_pc_dag_mat_pc_normop_pc_single_l2, where GetParam() = (4, 4, 3, 4, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_pc_dag_mat_pc_normop_pc_half_l2, where GetParam() = (4, 4, 3, 2, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_normop_pc_double_l2, where GetParam() = (4, 0, 3, 8, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_normop_pc_single_l2, where GetParam() = (4, 0, 3, 4, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_normop_pc_half_l2, where GetParam() = (4, 0, 3, 2, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
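To check at a glance that every failure involves the same solver, the failure lines above can be summarized with a small script. This is only a sketch: `summarize_failures` is a hypothetical helper, not part of QUDA, and it assumes that the first underscore-separated token of the test parameter name is the solver and that the precision appears as one of `double`, `single` or `half`.

```python
import re

# Matches the test id in a GoogleTest failure line such as
# "[  FAILED  ] Suite/Fixture.test/param_name, where GetParam() = ..."
FAILED_RE = re.compile(r"^\[\s*FAILED\s*\]\s+([^,\s]+)")
PRECISIONS = {"double", "single", "half"}


def summarize_failures(log):
    """Return a (solver, precision) pair for each FAILED line in a log."""
    results = []
    for line in log.splitlines():
        match = FAILED_RE.match(line.strip())
        if not match:
            continue
        # Keep only the parameter name after the last "/" of the test id.
        param_name = match.group(1).rsplit("/", 1)[-1]
        tokens = param_name.split("_")
        solver = tokens[0]  # assumption: the first token names the solver
        precision = next((t for t in tokens if t in PRECISIONS), "unknown")
        results.append((solver, precision))
    return results


# Two of the failure lines quoted above, copied verbatim.
log = """\
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_pc_dag_mat_pc_normop_pc_double_l2, where GetParam() = (4, 4, 3, 8, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
[  FAILED  ] NormalEvenOdd/InvertTest.verify/sd_mat_normop_pc_half_l2, where GetParam() = (4, 0, 3, 2, 1, 1, (-2147483648, -2147483648, -2147483648), 1)
"""
print(summarize_failures(log))  # [('sd', 'double'), ('sd', 'half')]
```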

SD looks like it's on its way to converging; it just doesn't get all the way to the target residual of ~1e-12 for double-precision solves. I imagine that's consistent with domain wall operators being ill-conditioned, which is in effect a statement of chirality.
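For intuition on why SD (steepest descent) can run out of iterations on an ill-conditioned operator, here is a toy sketch. It is not QUDA's SD implementation and not the Mobius EOFA operator: the operator is a hypothetical diagonal matrix with eigenvalues spread evenly over [1, kappa], and `max_iter` is an arbitrary cap chosen for illustration. Only the 1e-12 target comes from the tests above.

```python
import math


def steepest_descent(diag, b, tol=1e-12, max_iter=2000):
    """Steepest descent for A x = b with A = diag(diag), all entries > 0.

    Returns (x, iterations used, relative residual ||r|| / ||b||).
    """
    x = [0.0] * len(b)
    r = list(b)  # r = b - A x with the initial guess x = 0
    b_norm = math.sqrt(sum(v * v for v in b))
    rel = 1.0
    for it in range(1, max_iter + 1):
        Ar = [d * v for d, v in zip(diag, r)]
        # Step length that minimizes the energy functional along r.
        alpha = sum(v * v for v in r) / sum(v * w for v, w in zip(r, Ar))
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
        r = [ri - alpha * ari for ri, ari in zip(r, Ar)]
        rel = math.sqrt(sum(v * v for v in r)) / b_norm
        if rel <= tol:
            return x, it, rel
    return x, max_iter, rel


def solve_with_condition_number(kappa, n=50):
    """Toy problem: eigenvalues evenly spaced in [1, kappa], b = ones."""
    diag = [1.0 + (kappa - 1.0) * i / (n - 1) for i in range(n)]
    return steepest_descent(diag, [1.0] * n)


for kappa in (10.0, 1000.0):
    _, iterations, rel = solve_with_condition_number(kappa)
    print(f"kappa={kappa:g}: {iterations} iterations, relative residual {rel:.2e}")
```

The error bound for steepest descent contracts by roughly (kappa - 1) / (kappa + 1) per iteration, so the better-conditioned problem should reach the target within the cap, while the ill-conditioned one needs far more iterations and may still be short of 1e-12 when the cap is hit.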

maddyscientist commented 5 months ago

SD solver is fixed for the Mobius EOFA case (made the condition number smaller to allow for more rapid convergence), and I also fixed an additional issue I found with the clover field. PR should be good now 🤞