Closed by joerowell 2 months ago
It looks like this was removed because we no longer had access to comm,
but we forgot to put it back. We'll fix that.
Here is a patch in the meantime:
Thank you!
Actually, after reviewing the code again, it looks like the patch is not necessary. The affinity is set in the main thread during init, and the service and progress threads should both be launched at that point, while that affinity is still in effect, so they inherit it. There should therefore be no need to set the affinity again inside the thread main function.
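For anyone who wants to check the inheritance behaviour this reasoning relies on, here is a minimal Linux-only sketch. It is not NCCL code: `childInheritsAffinity` is a hypothetical helper, and it uses `std::thread` plus `sched_getaffinity`/`sched_setaffinity` only to show that a newly created thread starts with a copy of its creator's CPU mask. It assumes the machine has no more CPUs than a fixed-size `cpu_set_t` can describe.

```cpp
// Linux-only sketch (not NCCL code): a new thread inherits its creator's CPU affinity.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for the CPU_* macros when compiled as C; g++ defines it already
#endif
#include <sched.h>   // sched_getaffinity, sched_setaffinity, cpu_set_t, CPU_* macros
#include <thread>

// Hypothetical helper: pins the calling thread to one CPU it is already allowed
// to use, starts a child thread, and returns true if the child observed exactly
// the pinned mask. The caller's original mask is restored before returning.
bool childInheritsAffinity() {
  cpu_set_t original;
  CPU_ZERO(&original);
  if (sched_getaffinity(0, sizeof(original), &original) != 0) return false;

  // Pick the first CPU in the current mask so the pin is always permitted.
  int cpu = -1;
  for (int i = 0; i < CPU_SETSIZE; ++i) {
    if (CPU_ISSET(i, &original)) { cpu = i; break; }
  }
  if (cpu < 0) return false;

  // Set the affinity in the "main" (creating) thread only.
  cpu_set_t pinned;
  CPU_ZERO(&pinned);
  CPU_SET(cpu, &pinned);
  if (sched_setaffinity(0, sizeof(pinned), &pinned) != 0) return false;

  // The child never sets its own affinity; it only reads it.
  cpu_set_t seenByChild;
  CPU_ZERO(&seenByChild);
  bool childReadOk = false;
  std::thread child([&] {
    childReadOk = (sched_getaffinity(0, sizeof(seenByChild), &seenByChild) == 0);
  });
  child.join();

  // Undo the pin so the caller is left as it was.
  sched_setaffinity(0, sizeof(original), &original);

  return childReadOk && CPU_EQUAL(&pinned, &seenByChild);
}
```

Note that this only holds for threads created after the affinity is set. A thread that was already running when the creator changed its mask keeps its old one, which would be the case where setting the affinity inside the thread main function makes a difference.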
Did you see any difference with the patch applied?
I see that in nccl 2.18.1-1 the following line was commented out:
https://github.com/NVIDIA/nccl/blob/178b6b759074597777ce13438efb0e0ba625e429/src/proxy.cc#L1400
Other than the fact that the argument to ncclProxyService changed, may I ask why this change was made?