KhronosGroup / OpenCL-Docs

OpenCL API, OpenCL C, Extensions, SPIR-V Environment Specs, Ref page, and C++ for OpenCL doc sources.

OpenCL 3.0 extensions for remote DMA? #281

Open eladmaimoni opened 4 years ago

eladmaimoni commented 4 years ago

In many scenarios, GPUs and other accelerators are used in combination with each other and communicate with non-accelerator devices such as NICs, proprietary FPGAs, frame grabbers, etc.

Currently, memory transfers between devices have to travel through host memory unless special extensions are used.

Both NVIDIA and AMD offer extensions that support remote DMA with other devices sharing the same PCIe bus: GPUDirect RDMA (CUDA on Linux only) and DirectGMA (AMD).

These extensions are almost trivial to implement at the driver level: they merely expose a physical bus address to which another device can potentially perform DMA.

Direct memory access between devices, without having to go through host memory, is enormously useful, and in my opinion it is a natural feature for an API such as OpenCL, which strives to be available on a broad variety of device configurations.

Has the working group considered such extensions? (It is my understanding that the new async-dma extensions are something completely different.)

ayalz commented 4 years ago

The new async_copy and async_fence extensions continue to address data transfers between global and local memories, within the context of a single work-group.

It would be interesting to see if/how they could be further extended.