lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda
Other
287 stars 94 forks source link

Match `x` and `p` in `computeCloverForce` and `computeCloverSigmaOprod`. #1424

Closed SaltyChiang closed 7 months ago

SaltyChiang commented 7 months ago

This small PR revert changes in https://github.com/lattice/quda/commit/fd30fd7790bf094242ccd7670e221a6878469bf6, and uses QudaInvertParam::dagger to decide the order of x and p in computeCloverForce. Now the order is consistent in computeCloverForce and computeCloverSigmaOprod in clover_force.cpp.

I'm trying to get a HMC integrator to return a reasonable delta_H (which should be small). I tried different parameters and noticed that I need to set QudaInvertParam::dagger = QUDA_DAG_YES for computeCloverForceQuda and QudaInvertParam::dagger = QUDA_DAG_NO for invertQuda in my HMC program, which makes sense.

But setting QudaInvertParam::dagger = QUDA_DAG_NO for computeCloverForceQuda and QudaInvertParam::dagger = QUDA_DAG_YES for invertQuda keeps returning a large delta_H. The order of x and p is static in computeCloverForce and dynamic in computeCloverSigmaOprod, which should be the reason why I cannot get a reasonable integrator.

This PR should affect the MILC convention of computeCloverForceQuda. I'm not familiar with MILC and need help from a MILC expert to confirm these changes work fine for MILC. (In fact, this PR makes computeCloverForce behave the same as before merging #1338).

I didn't change the CPU reference implementation, let me know if I should do that.

updateGaugeFieldQuda frees all the sloppy gauge fields in the end, and reallocates gaugePrecise only. This leaves all sloppy fields undefined and QUDA will dump in the next checkGauge. An extra loadGaugeQuda(nullptr, param) should regenerate all sloppy fields, but it cannot handle the nullptr correctly if param->use_resident_gauge == 0. Please let me know if you have any ideas about it.

A redundant declaration of freeSloppyCloverQuda is removed.

maddyscientist commented 7 months ago

Thanks for the PR @SaltyChiang. With this change, are you now seeing a small delta H?

@Jenkins ok to test

SaltyChiang commented 7 months ago

Yes, both use cases in the original post return a small delta H. Also I've tried

./tests/clover_force_test --dslash-type twisted-clover --compute-clover 1 --matpc odd-odd-asym --dim  4 4 4 4 --prec double --niter 1 --compute-clover-trlog 1 --verbosity verbose --kappa 1 --mu 1 --clover-csw 1 --solution-type mat-pc-dag-mat-pc --dagger

and the test passed.

PS: Shall we assert the matpc_type to be asymmetric as x and p don't share the same construction in the symmetric situation?

PPS: I want to edit my affiliation in README, and I noticed the last two contributors from Fermilab didn't follow the lexicographic order, maybe I can fix it.

PPPS: Do you think regenerating sloppy fields at the end of updateGaugeFieldQuda is necessary? As they are freed here: https://github.com/lattice/quda/blob/fd50676db06fc36efb3a791a3059c57cca70bb55/lib/interface_quda.cpp#L4677-L4681

maddyscientist commented 7 months ago

Yes, both use cases in the original post return a small delta H. Also I've tried

./tests/clover_force_test --dslash-type twisted-clover --compute-clover 1 --matpc odd-odd-asym --dim  4 4 4 4 --prec double --niter 1 --compute-clover-trlog 1 --verbosity verbose --kappa 1 --mu 1 --clover-csw 1 --solution-type mat-pc-dag-mat-pc --dagger

and the test passed.

Great, thanks for confirming this. No need to make any changes to the test code: it only tests tmLQCD conventions at the moment, and will enforce dagger / odd-odd-asym options.

PS: Shall we assert the matpc_type to be asymmetric as x and p don't share the same construction in the symmetric situation?

I believe the code should already do that (top of clover_force.cpp, upon entry into computeCloverForce).

PPS: I want to edit my affiliation in README, and I noticed the last two contributors from Fermilab didn't follow the lexicographic order, maybe I can fix it.

Sure thing, feel free to fix. Thanks.

PPPS: Do you think regenerating sloppy fields at the end of updateGaugeFieldQuda is necessary? As they are freed here:

https://github.com/lattice/quda/blob/fd50676db06fc36efb3a791a3059c57cca70bb55/lib/interface_quda.cpp#L4677-L4681

@weinbe2 this is your code I believe, so I defer this question to you.

SaltyChiang commented 7 months ago

I believe the code should already do that (top of clover_force.cpp, upon entry into computeCloverForce).

I missed that, thanks for pointing out this.

@weinbe2 FYI, computeGaugeForceQuda, computeGaugePathQuda, updateGaugeFieldQuda, projectSU3Quda, staggeredPhaseQuda, computeGaugeFixingOVRQuda and computeGaugeFixingFFTQuda share the similar issue.

weinbe2 commented 7 months ago

Hi @SaltyChiang

PPPS: Do you think regenerating sloppy fields at the end of updateGaugeFieldQuda is necessary? As they are freed here:

https://github.com/lattice/quda/blob/fd50676db06fc36efb3a791a3059c57cca70bb55/lib/interface_quda.cpp#L4677-L4681

This code has been around for quite a while, I just refactored it recently. This code likely dates back to some initial implementation with MILC, where only sloppy fat and long links are needed, not "regular" gauge links. I'm guessing that's why sloppy fields weren't created, they were never needed for a solve.

That said, as you noted, it could be an issue for apps that use the original gauge links (anything not using HISQ...) since the sloppy fields will be inconsistent. I can add that, but after looking at the code in interface_quda.cpp it's a bit more than a one-line change. I'll need to take this chunk of code: https://github.com/lattice/quda/blob/develop/lib/interface_quda.cpp#L638-L748 and put it in a separate utility function (which isn't the worst idea) so it can be easily called multiple places.

I can take care of this next week (it'll just require a good amount of testing, since that's pretty fundamental code...), if that's alright.

maddyscientist commented 7 months ago

Given @weinbe2's comment, we can move on to merge this branch I think.