NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

How to understand "bank" in net.cc? #1293

Closed dearsxx0918 closed 6 months ago

dearsxx0918 commented 6 months ago

Hi sjeaugey, I'm reading the latest NCCL code and found that transport/net.cc now classifies the different cases to make the code simpler. But it leaves me more confused, especially about the "bank" in the code. Can you help explain why we can do things like the following?

struct ncclSendMem* sendMem = (struct ncclSendMem*) ((((map)->offsets.sendMem >> 29) == 0) ? __null : (map)->mems[((map)->offsets.sendMem >> 30)].gpuPtr + ((map)->offsets.sendMem & 0x1fffffff));

It seems we add an offset to the "gpuPtr", which was allocated or reserved through the CUDA APIs.
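To make the question concrete, here is how I read the bit layout implied by the expression above: the top two bits (`>> 30`) pick an entry in `mems[]` (the "bank"), bit 29 marks the offset as in use (so `>> 29 == 0` means "null"), and the low 29 bits (`& 0x1fffffff`) are a byte offset into that bank. This is only a sketch of my understanding, not NCCL code: `Mem`, `Offsets`, `Map`, `encode`, and `gpuPointer` are hypothetical stand-ins, and the pointers are ordinary host buffers rather than CUDA allocations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: only the fields used by the quoted expression.
struct Mem { char* cpuPtr; char* gpuPtr; };
struct Offsets { uint32_t sendMem; };
struct Map { Mem mems[4]; Offsets offsets; };

// Constants taken from the shifts and mask in the quoted expression.
constexpr uint32_t kUsedBit    = 0x20000000u;  // bit 29: offset is in use
constexpr uint32_t kOffsetMask = 0x1fffffffu;  // bits 0..28: byte offset
constexpr int      kBankShift  = 30;           // bits 30..31: index into mems[]

// Pack a bank index and a byte offset into one 32-bit value (assumed encoding).
uint32_t encode(uint32_t bank, uint32_t offset) {
  assert(bank < 4 && offset <= kOffsetMask);
  return (bank << kBankShift) | kUsedBit | offset;
}

// Same logic as the quoted expression, written out step by step.
char* gpuPointer(const Map& map) {
  uint32_t v = map.offsets.sendMem;
  if ((v >> 29) == 0) return nullptr;        // no bank bits, not marked used
  uint32_t bank = v >> kBankShift;           // which mems[] entry
  uint32_t offset = v & kOffsetMask;         // where inside that entry
  return map.mems[bank].gpuPtr + offset;
}
```

With this reading, `encode(1, 16)` would resolve to `mems[1].gpuPtr + 16`, and a value of 0 would resolve to a null pointer.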

Best Regards, -Edda