lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

MultiShift reports wrong number of iterations #147

Closed mathiaswagner closed 10 years ago

mathiaswagner commented 10 years ago

We noticed that the number of iterations reported back to MILC is wrong:

MultiShift CG: Converged after 9123 iterations
...

WARNING: Check number of iterations to be reported to MILC 4682860 4682860
CONGRAD5: time = 4.926745e+01 (multicg_offset_QUDA D) masses = 19 iters = 4682860 mflops = 3.712593e+07

A few iterations are spent on refinement, but that still does not add up.

I added the line starting with WARNING to check whether the error comes from passing the number of iterations to MILC in the MILC interface. That line is printed from the MILC interface, which does:

  // return the number of iterations taken by the inverter
  *num_iters = invertParam.iter;
  warningQuda("Check number of iterations to be reported to MILC %i %i", *num_iters, invertParam.iter);

I have not yet checked the value of invertParam.iter at different points in interface_quda.cpp.

mathiaswagner commented 10 years ago

A small update from checking in interface_quda.cpp: right after MultiShift the count is still correct. Then

MultiShift CG: Converged after 9143 iterations
...
// Refinement here
...
CG: Convergence at 16 iterations, L2 relative residual: iterated = 5.121317e-16, true = 5.121317e-16
WARNING: Check number of iterations to be reported to Interface after Refinement 18302 

Do a little math: 9143 * 2 + 16 = 18302, i.e. the MultiShift iterations are counted twice, plus the 16 refinement iterations.

I have not yet found where that happens, but it seems to be in the refinement step.

mathiaswagner commented 10 years ago

Seems to be related to the combination of

 SolverParam solverParam(*param);
...
solverParam.updateInvertParam(*param,i);

solverParam.iter is initialized to param.iter, then increased by the number of refinement iterations, and then added back to param.iter, so the MultiShift iterations are counted twice.

Fixed in 168ee36306ca9fba6f5c9cfed89d4f522aeffc10

maddyscientist commented 10 years ago

Closing this issue.