Open freshduer opened 1 month ago
If we set ncclCommInitConfig to non-blocking mode and one rank is then aborted, will the NCCL send/recv kernel still occupy GPU resources?