NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

What will happen if I call ncclCommAbort after ncclSend? #1481

Open freshduer opened 1 month ago

freshduer commented 1 month ago

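If we initialize the communicator in non-blocking mode (ncclCommInitRankConfig with `blocking = 0` in the config), one rank may be aborted after it has already issued ncclSend.

In that case, will the NCCL send/recv kernels still occupy GPU resources?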
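To make the scenario concrete, here is a minimal, library-agnostic sketch of the poll-then-abort loop that a rank might run after issuing a send on a non-blocking communicator. Everything in it is hypothetical: `op_state_t`, `wait_or_abort`, the callback types, and the `fake_*` stubs are illustrative names, not NCCL API. In a real program the poll callback would presumably wrap `ncclCommGetAsyncError` and the abort callback would wrap `ncclCommAbort`. The sketch only shows the control flow being asked about; it does not say anything about whether the kernels keep holding GPU resources.

```c
#include <stdio.h>

/* Hypothetical status codes, loosely modeled on NCCL's
   ncclSuccess / ncclInProgress / error results. Not NCCL types. */
typedef enum { OP_SUCCESS = 0, OP_IN_PROGRESS = 1, OP_ERROR = 2 } op_state_t;

/* Hypothetical callbacks. With NCCL, poll would presumably wrap
   ncclCommGetAsyncError(comm, &state) and abort would wrap
   ncclCommAbort(comm). */
typedef op_state_t (*poll_fn)(void *comm);
typedef void (*abort_fn)(void *comm);

/* Poll until the operation leaves the in-progress state or max_polls
   polls have been made. On error or timeout, abort the communicator.
   Returns 0 if the operation completed, -1 if the communicator was aborted. */
int wait_or_abort(void *comm, poll_fn poll, abort_fn abort_comm, int max_polls)
{
    op_state_t state = OP_IN_PROGRESS;
    for (int i = 0; i < max_polls && state == OP_IN_PROGRESS; ++i) {
        state = poll(comm);
    }
    if (state == OP_SUCCESS) {
        return 0;
    }
    abort_comm(comm);  /* peer is gone or the operation failed */
    return -1;
}

/* Stub communicator used only to exercise the loop without a GPU. */
typedef struct {
    int polls_until_done;  /* remaining polls before "completion" */
    int aborted;           /* set to 1 once fake_abort has been called */
} fake_comm_t;

static op_state_t fake_poll(void *p)
{
    fake_comm_t *c = (fake_comm_t *)p;
    if (c->polls_until_done > 0) {
        c->polls_until_done--;
    }
    return c->polls_until_done == 0 ? OP_SUCCESS : OP_IN_PROGRESS;
}

static void fake_abort(void *p)
{
    ((fake_comm_t *)p)->aborted = 1;
}
```

With the stubs, a communicator that completes within the poll budget makes `wait_or_abort` return 0 without aborting, while one whose peer never responds is aborted once the budget runs out and the function returns -1. The second case is the one this issue asks about.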