Closed agrebe closed 4 months ago
This is a great contribution, thanks @agrebe. Aside from my comment, please clang-format the diff, and feel free to add your name to the contributors in the README.
@Jenkins ok to test
Thank you for reviewing this so quickly -- I'm glad to be able to contribute given how useful this codebase has been for me!
I have been computing two-link smearing with QUDA and have made two optimizations to the smearing interface:

1) In `lib/milc_interface.cpp`, skip the call to `loadGaugeQuda` if the `useResidentGauge` flag is set (since in this case, the 2-link field is already computed)
2) Simplify the four linear algebra operations in the loop in `performTwoLinkGaussianSmearNStep` in `lib/interface_quda.cpp` to a single operation

I have tested that this agrees with the previous implementation (using QUDA driven by MILC on a single GPU), and overall these changes increase performance by a factor of about 2 in our project.
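
For readers unfamiliar with the two patterns described above, here is a minimal, self-contained sketch of the general idea. It is not the code from this PR: `SmearParams`, `GaugeLoader`, `prepare_gauge`, `two_pass_axpby` and `fused_axpby` are hypothetical names invented for illustration, and the fused example combines only two vector operations because the thread does not spell out which four operations were merged.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not QUDA types or calls.
struct SmearParams {
  bool use_resident_gauge; // plays the role of a "reuse what is already loaded" flag
};

struct GaugeLoader {
  int load_count = 0;
  void load() { ++load_count; } // stands in for an expensive field upload
};

// Pattern 1: skip the load when the caller says a resident field can be reused.
inline void prepare_gauge(const SmearParams &params, GaugeLoader &loader)
{
  if (!params.use_resident_gauge) loader.load();
}

// Unfused version: two separate passes over the data (y = a*x, then y += b*z).
inline void two_pass_axpby(double a, const std::vector<double> &x, double b,
                           const std::vector<double> &z, std::vector<double> &y)
{
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = a * x[i];
  for (std::size_t i = 0; i < x.size(); ++i) y[i] += b * z[i];
}

// Pattern 2: the same result computed in a single pass, so each element is
// read and written once instead of twice.
inline void fused_axpby(double a, const std::vector<double> &x, double b,
                        const std::vector<double> &z, std::vector<double> &y)
{
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = a * x[i] + b * z[i];
}
```

Both changes reduce work per smearing step without changing the result: the guard avoids a redundant load, and the fused loop touches memory fewer times than a sequence of separate vector operations would.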