lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Feature/milc gaugefix ovr definition #1298

Closed detar closed 2 years ago

detar commented 2 years ago

Two small changes to enable access to gauge_fixingOVR via the MILC interface and to fix interval reporting.

mathiaswagner commented 2 years ago

Jenkins: Can one of the admins verify this patch?

weinbe2 commented 2 years ago

Thank you for the contribution, @detar! I just pushed a quick clang-format pass, and I'll run my own cursory test before approving, but this should be good to go.

weinbe2 commented 2 years ago

@detar I just ran a quick ks_spectrum workflow, replacing no_gauge_fix with coulomb_gauge_fix, and got the following error:

Fixing to Coulomb gauge
Starting Coulomb gauge fixing...
        Overrelaxation boost parameter: 1.800000e+00
        Tolerance: 2.000000e-06
        Stop criterion method: Delta
        Maximum number of iterations: 500
        Reunitarize at every 20 steps
        Print convergence results at every 20 steps
ERROR: CPU not supported yet (rank 0, host luna-0199, tunable_reduction.h:130 in void quda::TunableReduction2D::launch(T&, const quda::TuneParam&, const quda::qudaStream_t&, Arg&) [with Functor = quda::FixQualityOVR; bool enable_host = false; T = quda::array<double, 2>; Arg = quda::GaugeFixQualityOVRArg<double, QUDA_RECONSTRUCT_NO, 3>]())
       last kernel called was (name=N4quda15GaugeFixQualityINS_21GaugeFixQualityOVRArgIdL21QudaReconstructType_s18ELi3EEEEE,volume=16x16x16x16,aux=CPU,nParity=2,vol=65536stride=32768precision=8geometry=4Nc=3)

Investigating now... I did notice that, unlike other MILC <-> QUDA bindings, your modifications to MILC went directly through the quda.h interface, so maybe I need to do a little work around the edges with that. I'll keep you posted.

weinbe2 commented 2 years ago

Ah, here's the issue: the function you added to quda.h is already declared in quda_milc_interface.h:

$ grep -r "qudaGaugeFixingOVR" *
include/quda_milc_interface.h:  void qudaGaugeFixingOVR( const int precision,
include/quda.h:  void qudaGaugeFixingOVR(int precision, unsigned int gauge_dir, int Nsteps, int verbose_interval, double relax_boost,
lib/milc_interface.cpp:void qudaGaugeFixingOVR(int precision, unsigned int gauge_dir, int Nsteps, int verbose_interval, double relax_boost,

So it didn't need to go into quda.h. It may be worth trying to include quda_milc_interface.h instead of quda.h in the MILC file generic/gaugefix2.c.
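For readers who want to repeat this kind of check in a script, here is a minimal sketch of the same search as the grep above: walk a source tree, record where a symbol appears, and list the headers that mention it. The helper names `find_declarations` and `headers_declaring` are hypothetical, and the match is a plain whole-word text search, not a C parser, so it cannot tell a declaration from a call or a comment.

```python
import os
import re


def find_declarations(root, symbol, extensions=(".h", ".hpp", ".c", ".cpp")):
    """Return sorted (relative_path, line_number) pairs where `symbol` occurs as a whole word.

    Hypothetical helper: a text search only, similar in spirit to `grep -r`.
    """
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((os.path.relpath(path, root), lineno))
    return sorted(hits)


def headers_declaring(hits):
    """Return the distinct header files among the hits.

    More than one header in the result hints that the symbol may be
    declared twice, which is the situation described above.
    """
    return sorted({path for path, _lineno in hits if path.endswith((".h", ".hpp"))})
```

Run against a QUDA checkout with the symbol `qudaGaugeFixingOVR`, a search like this would be expected to report both headers shown in the grep output, which is the signal that the new declaration in quda.h was redundant.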

I can work on addressing this over the course of the day, but if you also have some cycles please let me know.

weinbe2 commented 2 years ago

@detar I just made some changes to your branch, along with the corresponding changes to MILC in the develop branch of https://github.com/lattice/milc_qcd . For your convenience, I opened a PR here: https://github.com/milc-qcd/milc_qcd/pull/54

Can you test this combination and make sure it works?

weinbe2 commented 2 years ago

@maddyscientist this is ready for your review; it passed various tests on my end. Since I had a heavy hand in taking the PR an extra mile, I don't see it as mine to approve and merge.

maddyscientist commented 2 years ago

@weinbe2 is there a simple test on the MILC side that can be used to verify this PR (e.g., for myself to run as part of this review)?

weinbe2 commented 2 years ago

I can provide you with a spectrum workflow with Coulomb gauge fixing enabled. I noticed some deviation between the CPU and GPU results, but that could come from the low convergence tolerance in MILC (1e-6).
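One rough way to judge whether such a deviation is plausibly tolerance-driven is to compare the relative CPU/GPU difference against a small multiple of the gauge-fixing tolerance. The sketch below is a heuristic under stated assumptions, not a rigorous bound: the function names and the `slack` factor are hypothetical, and a gauge-fixing tolerance does not strictly limit how far a derived quantity can move.

```python
def relative_deviation(cpu_value, gpu_value):
    """Relative difference between two results, scaled by the larger magnitude."""
    scale = max(abs(cpu_value), abs(gpu_value))
    if scale == 0.0:
        return 0.0  # both values are exactly zero, so there is no deviation
    return abs(cpu_value - gpu_value) / scale


def consistent_with_tolerance(cpu_value, gpu_value, tolerance, slack=10.0):
    """Heuristic check: is the deviation within `slack` times the tolerance?

    `slack` is an assumed safety factor chosen for illustration only.
    """
    return relative_deviation(cpu_value, gpu_value) <= slack * tolerance
```

With a tolerance of 1e-6, a pair of values differing at the few-times-1e-6 level would pass this check, while a difference at the 1e-3 level would not and would point to something other than the stopping criterion.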

detar commented 1 year ago

Hi Evan,

My mistake.  I had forgotten about quda_milc_interface.h.  I'll try that.

Best,

Carleton

On 7/15/22 9:38 AM, Evan Weinberg wrote:

Ah, here's the issue: the function you added to quda.h is already declared in quda_milc_interface.h:

$ grep -r "qudaGaugeFixingOVR" *
include/quda_milc_interface.h:  void qudaGaugeFixingOVR( const int precision,
include/quda.h:  void qudaGaugeFixingOVR(int precision, unsigned int gauge_dir, int Nsteps, int verbose_interval, double relax_boost,
lib/milc_interface.cpp:void qudaGaugeFixingOVR(int precision, unsigned int gauge_dir, int Nsteps, int verbose_interval, double relax_boost,

So it didn't need to go into quda.h. It may be worth trying to include quda_milc_interface.h instead of quda.h in the MILC file generic/gaugefix2.c.

I can work on addressing this over the course of the day, but if you also have some cycles please let me know.
