lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda
Other
287 stars 94 forks source link

Feature/gaugefield unity #1384

Closed maddyscientist closed 10 months ago

maddyscientist commented 1 year ago

This PR's primary purpose is to unify the cpuGaugeField and cudaGaugeField classes, along with an assortment of other cleanups.

Outstanding items

weinbe2 commented 1 year ago

I've reviewed every file not called interface_quda.cpp and I'm a big fan of all of the cleanup. I'll need to go through interface_quda.cpp with a fine-tipped comb before I feel good about it!

weinbe2 commented 1 year ago

Why was timing QUDA_PROFILE_INIT removed in a handful of interface functions? (I know I mentioned this in one review comment, but it seems to appear in multiple places.)

weinbe2 commented 1 year ago

I've completed my visual review of interface_quda.cpp and left a handful of comments/questions, most of which probably only beg clarification. I'll start doing some explicit tests soon.

maddyscientist commented 12 months ago

@weinbe2 Ok, all your comments should be addressed now.

maddyscientist commented 11 months ago

This has now been tested with MILC RHMC, and things are looking good.

weinbe2 commented 10 months ago

Tentative approve, I noticed a few cosmetic things, and this needs a fresh merge of develop + clang-format. After that it's good to go!

maddyscientist commented 10 months ago

@SaltyChiang I have merged in your changes into this branch, that will shortly be merged into develop.

One line I have not included is this one:

gauge[d] = (param.gauge) ? ((void **)param.gauge)[d] : nullptr;

as this line doesn't make sense to me. If it's a reference to a pre-existing field, why should the value be a nullptr? Can you explain why this change is needed?

Thanks

SaltyChiang commented 10 months ago

Hi @maddyscientist, there are some if statements about the input gauge link, like https://github.com/lattice/quda/blob/6e90b49a2d12891f2f68e35b7cfa7defbd506526/lib/interface_quda.cpp#L3873 and https://github.com/lattice/quda/blob/6e90b49a2d12891f2f68e35b7cfa7defbd506526/lib/interface_quda.cpp#L3903. I believe the logic here is originally used for QUDA_MILC_GAUGE_ORDER, but this will fail if QUDA_QDP_GAUGE_ORDER is used here. For example, if the ulink is nullptr here, the unitarized fat7 link will not be calculated, which is right; but cpuUnitarizedLink defined in https://github.com/lattice/quda/blob/6e90b49a2d12891f2f68e35b7cfa7defbd506526/lib/interface_quda.cpp#L3847 fails now as you cannot get ((void **)nullptr)[d]. A null QUDA_QDP_GAUGE_ORDER field should be something like void *ulink[4] = {nullptr, nullptr, nullptr, nullptr} but now the if (ulink) will return true which we do not expect.

So I made all four reference fields nullptr when the input value itself is nullptr and I think this will not affect the existing logic.

maddyscientist commented 10 months ago

@SaltyChiang thanks for the explanation. I have fixed my branch accordingly (https://github.com/lattice/quda/pull/1384/commits/4c308f6f72a2239e97f3423137459e049dc56b03), so it should work for your usecase. If you could test this branch, that would be appreciated, thanks.

SaltyChiang commented 10 months ago

Hi @maddyscientist, very sorry for the late feedback. I think it should be !unitarizedLink.StaggeredPhaseApplied() && param->staggered_phase_applied here. https://github.com/lattice/quda/blob/0c016af709ad6a3c249540fdad809c8d18abc226/lib/interface_quda.cpp#L3748-L3749

Edit: I believe gauge_param.location should be changed to QUDA_CUDA_FIELD_LOCATION here. https://github.com/lattice/quda/blob/0c016af709ad6a3c249540fdad809c8d18abc226/lib/interface_quda.cpp#L765-L774

maddyscientist commented 10 months ago

@SaltyChiang thanks for pointing put my bug. I've pushed a fix to my next PR that will be merged (#1393). This is a fairly simple PR, so will likely be merged soon.

SaltyChiang commented 10 months ago

@maddyscientist thanks for the fix. You might miss another issue that will happen while saving the smeared gauge through saveGaugeQuda(). I updated the comment above and I repeat it here.

Edit: I believe gauge_param.location should be changed to QUDA_CUDA_FIELD_LOCATION here.

https://github.com/lattice/quda/blob/0c016af709ad6a3c249540fdad809c8d18abc226/lib/interface_quda.cpp#L765-L774

maddyscientist commented 10 months ago

Thanks again @SaltyChiang. Fix is here 44c41006158e493517452210a7a1d1e55412d63f (in #1393).