lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Reconstruct cannot be used with non-zero Naik-epsilon correction #718

Closed maddyscientist closed 5 years ago

maddyscientist commented 6 years ago

At present, when running with MILC, QUDA's gauge reconstruct support is enabled uniformly for all solvers through the QUDA_MILC_HISQ_RECONSTRUCT environment variable. However, this is problematic, since reconstruction cannot be used with a non-zero Naik-epsilon correction. Thanks to @weinbe2 for realizing this.

We need to make sure that the MILC interface is intelligent enough to know whether it is safe to use reconstruct on a per-solve basis. E.g., in 2 + 1 + 1 RHMC, we can only use reconstruct for the light and strange quarks, but not for the charm.
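A minimal sketch of the per-solve check described above. It is not MILC or QUDA code: the function name, the integer encoding of the reconstruct options (18 meaning no compression), and the epsilon values are assumptions made for illustration only.

```python
# Hypothetical per-solve guard: fall back to the uncompressed long link
# whenever the solve uses a non-zero Naik epsilon.
NO_RECONSTRUCT = 18  # full 3x3 complex matrix = 18 real numbers


def long_link_reconstruct(requested, naik_epsilon):
    """Return the number of reals to store per long link for one solve."""
    if requested not in (18, 13, 9):
        raise ValueError("unsupported reconstruct setting: %r" % requested)
    if naik_epsilon != 0.0:
        # Compression assumed unsafe here: the link is no longer in U(3).
        return NO_RECONSTRUCT
    return requested


# Illustrative 2 + 1 + 1 setup; the charm epsilon is a made-up number.
solves = {"light": 0.0, "strange": 0.0, "charm": -0.03}
for name, eps in solves.items():
    print(name, long_link_reconstruct(13, eps))
```

With these inputs only the charm solve falls back to 18 numbers per link, while the light and strange solves keep the requested compression.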

Once I've scoped this, this would be a good fix to put in QUDA 0.9.1. I imagine this is an easy fix.

@stevengottlieb and @detar: I believe one of you observed something funny with reconstruct enablement with MILC. Could this have been what you observed?

detar commented 6 years ago

Hi Kate,

I am not familiar with the problem Eric refers to. With MILC + QUDA, I don't believe we have ever used QUDA's reconstruct. If we need the solution on the full lattice, we do the checkerboard preconditioning ourselves, call QUDA for a single-checkerboard solve, reconstruct, and then polish the reconstructed solution on the other checkerboard with a second QUDA single-checkerboard solve. The second solve usually takes only a few iterations.

Best,

Carleton
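The checkerboard procedure described above can be sketched on a toy problem. This is not MILC or QUDA code: it assumes a free staggered-like operator M = m + D on a periodic 1D lattice, a plain conjugate-gradient solver written here for illustration, and it omits the final polishing solve on the odd checkerboard.

```python
N = 8       # number of lattice sites, must be even
MASS = 0.5  # illustrative mass value


def hop(x):
    """Anti-Hermitian hopping term D; it connects sites of opposite parity."""
    return [0.5 * (x[(i + 1) % N] - x[(i - 1) % N]) for i in range(N)]


def project(x, parity):
    """Zero out every site that does not have the given parity."""
    return [v if i % 2 == parity else 0.0 for i, v in enumerate(x)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def schur(x_e):
    """Even-site operator m^2 - D_eo D_oe (symmetric positive definite)."""
    return [MASS * MASS * a - b for a, b in zip(x_e, hop(hop(x_e)))]


def cg(apply_a, b, tol=1e-12, max_iter=100):
    """Plain conjugate gradient for a symmetric positive definite operator."""
    x, r, p = [0.0] * len(b), list(b), list(b)
    rr, bb = dot(r, r), dot(b, b)
    for _ in range(max_iter):
        if rr <= tol * tol * bb:
            break
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def solve_full(b):
    """Single even-checkerboard solve, then reconstruct the odd sites."""
    b_e, b_o = project(b, 0), project(b, 1)
    rhs = [MASS * e - d for e, d in zip(b_e, hop(b_o))]   # m b_e - D_eo b_o
    x_e = cg(schur, rhs)
    x_o = [(o - d) / MASS for o, d in zip(b_o, hop(x_e))]  # (b_o - D_oe x_e) / m
    return [e + o for e, o in zip(x_e, x_o)]


b = [float(i + 1) for i in range(N)]
x = solve_full(b)
residual = [MASS * xi + di - bi for xi, di, bi in zip(x, hop(x), b)]
print(max(abs(v) for v in residual))
```

In a production setting the reconstructed odd-site solution inherits the error of the even-site solve, which is presumably why the second, short solve on the other checkerboard is used to polish it.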

maddyscientist commented 6 years ago

Hi @detar

I was referring to long-link gauge reconstruction, which can be set to 13 or 9 numbers to reduce memory traffic. This is what breaks down when using a non-zero Naik epsilon, since the link ceases to be an element of U(3).
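To illustrate why the rescaled link leaves U(3), here is a small check, assuming that a non-zero epsilon enters as an overall real factor (1 + epsilon) on an otherwise unitary matrix. The matrix construction and the epsilon value are illustrative assumptions, not taken from QUDA.

```python
import cmath
import math


def make_u3(a, b, c, t):
    """U(3) element: rotation by t in the (0,1) plane, one phase per row."""
    return [
        [cmath.exp(1j * a) * math.cos(t), cmath.exp(1j * a) * math.sin(t), 0j],
        [-cmath.exp(1j * b) * math.sin(t), cmath.exp(1j * b) * math.cos(t), 0j],
        [0j, 0j, cmath.exp(1j * c)],
    ]


def dagger_times_self(m):
    """Return M^dagger M for a 3x3 matrix stored as nested lists."""
    return [[sum(m[k][i].conjugate() * m[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


def unitarity_defect(m):
    """Largest deviation of M^dagger M from the identity."""
    mm = dagger_times_self(m)
    return max(abs(mm[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))


u = make_u3(0.3, -1.1, 0.7, 0.4)
epsilon = -0.05  # illustrative value only
scaled = [[(1.0 + epsilon) * x for x in row] for row in u]

print(unitarity_defect(u))       # close to zero
print(unitarity_defect(scaled))  # close to |(1 + epsilon)^2 - 1|
```

The scaled matrix satisfies M^dagger M = (1 + epsilon)^2 I rather than the identity, so any compression that rebuilds missing entries from the unitarity constraint will get them wrong unless the scale is accounted for.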

Regarding the checkerboard solve and reconstruction, this is something that should really all be done directly in QUDA, since the overhead of doing this on the CPU will have a non-trivial Amdahl effect.

weinbe2 commented 6 years ago

As an update, @maddyscientist and I figured out an easy solution to this problem: the factor of -1/24 on the Naik term similarly breaks the link being U(3), so fixing this perceived issue with recon-13 is as simple as replacing -1/24 with -(1+epsilon)/24 in the appropriate places. I've worked out where these factors need to go in my feature/hisq-unit branch, and Kate's taking it from there.
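A hedged sketch of the idea behind this fix, not QUDA's actual recon-13 kernel: assume the stored long link is a real scale times a U(3) matrix, keep two rows plus the determinant phase (12 + 1 real numbers), and rebuild the third row after dividing out the scale. All function names and numerical values below are illustrative assumptions.

```python
import cmath
import math


def make_u3(a, b, c, t):
    """U(3) element: rotation by t in the (0,1) plane, one phase per row."""
    return [
        [cmath.exp(1j * a) * math.cos(t), cmath.exp(1j * a) * math.sin(t), 0j],
        [-cmath.exp(1j * b) * math.sin(t), cmath.exp(1j * b) * math.cos(t), 0j],
        [0j, 0j, cmath.exp(1j * c)],
    ]


def cross_conj(a, b):
    """Complex conjugate of the cross product a x b."""
    return [
        (a[1] * b[2] - a[2] * b[1]).conjugate(),
        (a[2] * b[0] - a[0] * b[2]).conjugate(),
        (a[0] * b[1] - a[1] * b[0]).conjugate(),
    ]


def compress(link, scale):
    """Keep the first two rows and the determinant phase of link / scale."""
    r0, r1, r2 = ([x / scale for x in row] for row in link)
    w = cross_conj(r0, r1)
    det = sum(x * y.conjugate() for x, y in zip(r2, w))
    return link[0], link[1], cmath.phase(det)


def decompress(row0, row1, theta, scale):
    """Rebuild the third row, assuming link = scale * (U(3) matrix)."""
    v0 = [x / scale for x in row0]
    v1 = [x / scale for x in row1]
    v2 = [cmath.exp(1j * theta) * w for w in cross_conj(v0, v1)]
    return [list(row0), list(row1), [scale * x for x in v2]]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


epsilon = -0.05                      # illustrative value only
good_scale = -(1.0 + epsilon) / 24.0
old_scale = -1.0 / 24.0              # factor that ignores epsilon

link = [[good_scale * x for x in row] for row in make_u3(0.3, -1.1, 0.7, 0.4)]
row0, row1, theta = compress(link, good_scale)

print(max_diff(decompress(row0, row1, theta, good_scale), link))  # close to zero
print(max_diff(decompress(row0, row1, theta, old_scale), link))   # clearly non-zero
```

With the epsilon-aware scale the third row is recovered to rounding error; with the old factor it comes out too large by a factor of (1 + epsilon), which matches the failure described earlier in the thread.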

maddyscientist commented 5 years ago

Closing issue (fixed by #717, which added support for reconstruct with non-zero Naik epsilon).