microsoft / mscclpp

MSCCL++: A GPU-driven communication stack for scalable AI applications
MIT License

Double buffering for NCCL APIs #324

Closed caiomcbr closed 3 months ago

caiomcbr commented 4 months ago

This change uses two scratch buffers on each peer to exchange data (double buffering).
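
As a rough illustration of the general double-buffering idea, here is a minimal host-side C++ sketch. It is not the code from this PR: `DoubleScratch` and all of its members are hypothetical names, and it assumes that consecutive operations simply alternate between the two buffers so that a peer can write into one while the other is still being read from the previous operation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names come from MSCCL++.
// Each peer owns two scratch buffers and alternates between them per operation.
class DoubleScratch {
 public:
  explicit DoubleScratch(std::size_t bytes)
      : buffers_{{std::vector<char>(bytes), std::vector<char>(bytes)}} {
    assert(bytes > 0);
  }

  // Index (0 or 1) of the buffer used by the current operation.
  std::size_t currentIndex() const { return opCount_ % 2; }

  // Buffer that remote peers write into during the current operation.
  std::vector<char>& current() { return buffers_[currentIndex()]; }

  // Buffer used by the previous operation; a reader may still be draining it.
  std::vector<char>& previous() { return buffers_[1 - currentIndex()]; }

  // Call once per completed operation so the next one uses the other buffer.
  void advance() { ++opCount_; }

 private:
  std::array<std::vector<char>, 2> buffers_;
  std::size_t opCount_ = 0;
};
```

In this sketch a caller writes into `current()`, calls `advance()` when the operation finishes, and the next operation then lands in the other buffer while the earlier data stays readable through `previous()`. Under that assumption, the only remaining hazard is between operations that are two apart, which is where synchronization would still be needed.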