lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Allow `copy_<to/from>_buffer` to copy to/from device buffer #1502

Open hummingtree opened 1 month ago

hummingtree commented 1 month ago

The copy_<to/from>_buffer methods of the various field types assume that the buffer is on the host. This prevents splitting and joining fields directly from device buffers, which would avoid the additional host-device memory copies when GPU-aware MPI is available.

Currently, the usage looks something like this (in include/split_grid.h and lib/gauge_polyakov_loop.cu):

size_t bytes = field.TotalBytes();
void *buffer = pinned_malloc(bytes); // pinned (page-locked) host memory
field.copy_to_buffer(buffer);        // buffer is assumed to be on the host

One could modify the copy_<to/from>_buffer methods to allow copying to/from device buffers when QUDA_ENABLE_GDR=1, and change the usage above to call device_malloc instead of pinned_malloc where possible.
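
To make the proposal concrete, here is a minimal host-only sketch of how the buffer location could be selected. It is not QUDA code: BufferLocation, gdr_flag_set, choose_buffer_location, StagingBuffer, allocate_staging and free_staging are all hypothetical names, and the allocator is a mock that uses std::malloc for both branches so the sketch runs without a GPU. It also assumes that reading the QUDA_ENABLE_GDR environment variable directly is an adequate stand-in for however QUDA itself decides that GPU-aware MPI is usable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical tag for where a staging buffer lives; not a QUDA type.
enum class BufferLocation { Host, Device };

// Returns true only when the flag value is exactly "1".
// Assumption: this mirrors the QUDA_ENABLE_GDR=1 setting named in the issue.
inline bool gdr_flag_set(const char *value)
{
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Choose the staging-buffer location from the environment.
inline BufferLocation choose_buffer_location()
{
  return gdr_flag_set(std::getenv("QUDA_ENABLE_GDR")) ? BufferLocation::Device
                                                      : BufferLocation::Host;
}

// Hypothetical staging buffer that remembers where it was allocated,
// so a copy routine could later dispatch on the location.
struct StagingBuffer {
  void *ptr;
  std::size_t bytes;
  BufferLocation location;
};

// Mock allocator: both branches call std::malloc so this runs on any host.
// In QUDA, the Host branch would correspond to pinned_malloc and the
// Device branch to device_malloc.
inline StagingBuffer allocate_staging(std::size_t bytes, BufferLocation location)
{
  return StagingBuffer {std::malloc(bytes), bytes, location};
}

// Release the mock buffer and clear the pointer.
inline void free_staging(StagingBuffer &buffer)
{
  std::free(buffer.ptr);
  buffer.ptr = nullptr;
}
```

A call site could then do `StagingBuffer buf = allocate_staging(field.TotalBytes(), choose_buffer_location());` and hand `buf.location` to the copy routine. Passing a location argument to copy_<to/from>_buffer is itself only one possible design, not something the current methods accept.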