lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Constrain the tuning space for host <-> device MILCSiteOrder gauge field copies #1451

weinbe2 closed this issue 3 months ago

weinbe2 commented 3 months ago

At larger local volumes, tuning the host <-> device MILCSiteOrder gauge field copies can take a long time, since at best each copy is limited by raw PCIe bandwidth. Trimming the tuning space would give a non-negligible reduction in autotuning time.
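
To put rough numbers on this, here is a back-of-envelope sketch, not taken from QUDA itself. It assumes an uncompressed double-precision SU(3) field (4 links per site, 18 reals per link) and assumes every timed candidate moves the whole field across PCIe once. The function names, the 25 GB/s bandwidth, and the count of 100 candidates are all hypothetical placeholders.

```python
# Back-of-envelope estimate only. Every number here is an illustrative
# assumption, not a measurement from QUDA or from any specific machine.

def gauge_field_bytes(local_dims, bytes_per_real=8):
    """Size of an uncompressed SU(3) gauge field.

    Assumes 4 links per site, each a 3x3 complex matrix (18 reals).
    """
    volume = 1
    for d in local_dims:
        volume *= d
    return volume * 4 * 18 * bytes_per_real


def tuning_time_lower_bound(local_dims, n_candidates, pcie_bytes_per_s,
                            launches_per_candidate=1):
    """Lower bound (seconds) on tuning time for a PCIe-bound copy.

    Assumes each timed launch transfers the full field once, so no
    candidate can run faster than bytes / bandwidth.
    """
    per_copy = gauge_field_bytes(local_dims) / pcie_bytes_per_s
    return n_candidates * launches_per_candidate * per_copy


if __name__ == "__main__":
    dims = (32, 32, 32, 32)   # hypothetical local volume
    bandwidth = 25e9          # assumed PCIe bandwidth in bytes/s
    for n in (100, 10):       # hypothetical tuning-space sizes
        t = tuning_time_lower_bound(dims, n, bandwidth)
        print(f"{n:4d} candidates: at least {t:.2f} s")
```

Under these assumptions the lower bound scales linearly with the number of candidates, so the time saved is proportional to how much of the tuning space is removed.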

maddyscientist commented 3 months ago

#1441 will fix this (it disables shared memory tuning for the copy gauge kernels)

weinbe2 commented 3 months ago

Closed by #1441