Recent CUDA versions provide the CUDA_MEMCPY3D_PEER struct, which is quite flexible. We also have a large number of copy functions, 40 all told, almost all of which are covered by a single copy function taking this struct. And we probably don't even cover everything the struct allows, particularly w.r.t. inter-context copying.
We have (at least) two choices:
1. Make sure the individual copy functions also cover the inter-context case, or
2. Reduce, perhaps drastically, the number of copy functions, in favor of a copy builder.
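To make the second option more concrete, here is a rough sketch of what a copy builder might look like. Everything in it is hypothetical: `copy_builder`, `copy_params`, `endpoint` and `is_inter_context` are names invented for illustration, not existing library API, and `copy_params` is a stand-in that mirrors only a small subset of CUDA_MEMCPY3D_PEER so that the sketch compiles without the CUDA toolkit. A real version would presumably fill in a CUDA_MEMCPY3D_PEER and pass it to `cuMemcpy3DPeer()`.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins so the sketch compiles without <cuda.h>. In real code these
// would be CUcontext and (a wrapper around) CUDA_MEMCPY3D_PEER.
using context_handle = void*;  // hypothetical stand-in for CUcontext

enum class memory_kind { host, device };

struct endpoint {
    memory_kind    kind    = memory_kind::host;
    void*          address = nullptr;
    context_handle context = nullptr;  // only meaningful for device memory
    std::size_t    pitch   = 0;        // bytes per row; 0 means "not set yet"
};

// Hypothetical parameter block, mirroring a small subset of CUDA_MEMCPY3D_PEER
struct copy_params {
    endpoint    source, destination;
    std::size_t width_in_bytes = 0, height = 1, depth = 1;
};

class copy_builder {
public:
    copy_builder& from_host(void* p) {
        params_.source = endpoint{memory_kind::host, p, nullptr, 0};
        return *this;
    }
    copy_builder& from_device(void* p, context_handle ctx) {
        params_.source = endpoint{memory_kind::device, p, ctx, 0};
        return *this;
    }
    copy_builder& to_host(void* p) {
        params_.destination = endpoint{memory_kind::host, p, nullptr, 0};
        return *this;
    }
    copy_builder& to_device(void* p, context_handle ctx) {
        params_.destination = endpoint{memory_kind::device, p, ctx, 0};
        return *this;
    }
    // A 1D copy is just the special case height == depth == 1
    copy_builder& extent(std::size_t width_in_bytes,
                         std::size_t height = 1, std::size_t depth = 1) {
        params_.width_in_bytes = width_in_bytes;
        params_.height = height;
        params_.depth  = depth;
        return *this;
    }
    copy_params build() const {
        assert(params_.width_in_bytes > 0 && "extent() must be called first");
        copy_params result = params_;
        // Default to tightly-packed rows when no pitch was specified
        if (result.source.pitch == 0)      { result.source.pitch = result.width_in_bytes; }
        if (result.destination.pitch == 0) { result.destination.pitch = result.width_in_bytes; }
        return result;
    }
private:
    copy_params params_;
};

// True when both endpoints are device memory belonging to different contexts
inline bool is_inter_context(const copy_params& p) {
    return p.source.kind == memory_kind::device
        && p.destination.kind == memory_kind::device
        && p.source.context != p.destination.context;
}
```

With something along these lines, an inter-context copy would read `copy_builder{}.from_device(src, src_ctx).to_device(dst, dst_ctx).extent(num_bytes).build()`, and the inter-context case needs no extra overloads. The obvious cost is that simple copies become more verbose than a dedicated function call.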
I'd appreciate some input from users of the library who have opined on design questions in the past, or have contributed code.