apparent divergences in QUDA-MG in the HMC

kostrzewa commented 2 years ago

It seems that when the MG setup is refreshed for a particular monomial, the subsequent solve for another monomial (with tm_rho=0) fails to converge.

I wonder if running the refresh

https://github.com/etmc/tmLQCD/blob/77f264dcb334d696c848f7b5002fadf32f477287/quda_interface.c#L1500

for the 'cloverdetratio2light' parameters (which include a Hasenbusch mass shift tm_rho of the preconditioned fine system) messes with the null vectors.

It should be noted that the outer solver parameters definitely contain tm_rho:

https://github.com/etmc/tmLQCD/blob/77f264dcb334d696c848f7b5002fadf32f477287/quda_interface.c#L1780

If this is the culprit, then one could modify the refresh procedure as follows:

1) categorically set tm_rho to zero in the outer solver parameters (quda_mg_param.invert_param->tm_rho = 0) before the refresh

2) adopt quda_mg_param.invert_param->mu = -quda_input.mg_setup_2kappamu/2.0/g_kappa (only relevant for non-clover twisted-mass) as in the initial setup generation

3) refresh setup

4) reset params and run TM_QUDA_MG_SETUP_UPDATE if necessary for the current solve

5) move on to solving
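
A minimal sketch of steps 1) to 4), to make the intended ordering concrete. This is not the code in quda_interface.c: `mg_outer_param_t` is a hypothetical stand-in for the `mu` and `tm_rho` fields of `quda_mg_param.invert_param`, and `mg_refresh_fn` is a hypothetical callback wrapping whatever actually performs the refresh. `record_params` is only a stub for trying the sketch out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two outer-solver fields that matter here
 * (quda_mg_param.invert_param->mu and ->tm_rho in the real code). */
typedef struct {
  double mu;
  double tm_rho;
} mg_outer_param_t;

/* Hypothetical callback that performs the actual MG setup refresh. */
typedef void (*mg_refresh_fn)(const mg_outer_param_t *p, void *ctx);

/* Refresh the setup with "neutral" outer parameters, then restore the
 * parameters of the current solve. */
void refresh_with_setup_params(mg_outer_param_t *p,
                               double mg_setup_2kappamu, double kappa,
                               int use_setup_mu,
                               mg_refresh_fn refresh, void *ctx)
{
  assert(p != NULL && refresh != NULL);

  const mg_outer_param_t saved = *p; /* parameters of the current solve */

  p->tm_rho = 0.0;                   /* 1) no Hasenbusch shift during refresh */
  if (use_setup_mu) {                /* 2) only for non-clover twisted mass */
    p->mu = -mg_setup_2kappamu / 2.0 / kappa;
  }

  refresh(p, ctx);                   /* 3) refresh setup */

  *p = saved;                        /* 4) reset params; the caller would then
                                      * run TM_QUDA_MG_SETUP_UPDATE if needed */
}

/* Stub refresh that only records which parameters it was called with. */
void record_params(const mg_outer_param_t *p, void *ctx)
{
  *(mg_outer_param_t *)ctx = *p;
}
```

Passing `record_params` with a pointer to a spare `mg_outer_param_t` as `ctx` lets one check that the refresh sees tm_rho = 0 while the solve parameters come back unchanged afterwards.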

This might alleviate these fluctuations. It should also be noted that (1) might play a role in the initial generation of the setup as well if one of the monomials with tm_rho != 0 is the one to initiate the MG setup.

kostrzewa commented 2 years ago

Input file for a run where this can be observed: hmc_cA211.075.24_out_of_max_twist_start.input.txt

kostrzewa commented 2 years ago

Excerpt from a log file where this can be observed (look for "2000 iterations"): cA211.075.24_init_therm.749.log.excerpt.txt

kostrzewa commented 2 years ago

First try at fixing this: #509 (doesn't seem to help)

apparent divergences in QUDA-MG in the HMC #508