LLNL / Aluminum

High-performance, GPU-aware communication library
https://aluminum.readthedocs.io/en/latest/

Improve NCCL buffer registration #231

Closed ndryden closed 3 months ago

ndryden commented 3 months ago

This registers memory on a per-communicator basis rather than globally. The prior approach was based on a misunderstanding of the NCCL docs.

This does not deregister memory when the communicator is freed; we might add that in the future, but it should not be necessary. It also does not check whether the same memory is registered multiple times.
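
For illustration only, the two gaps noted above (no deregistration when a communicator is freed, no check for repeated registration) could be closed with bookkeeping along the following lines. This is a minimal sketch and not what this PR or Aluminum implements. `PerCommRegistry`, `CommId`, `RegHandle`, and the two callbacks are hypothetical names; the callbacks stand in for NCCL's `ncclCommRegister` and `ncclCommDeregister` so the example builds without NCCL or a GPU.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical stand-ins: a real version would use ncclComm_t and the
// opaque handle that ncclCommRegister returns.
using CommId = int;
using RegHandle = void*;

class PerCommRegistry {
public:
  // Callbacks standing in for ncclCommRegister / ncclCommDeregister.
  using RegisterFn = std::function<RegHandle(CommId, void*, std::size_t)>;
  using DeregisterFn = std::function<void(CommId, RegHandle)>;

  PerCommRegistry(RegisterFn reg, DeregisterFn dereg)
      : register_(std::move(reg)), deregister_(std::move(dereg)) {}

  // Register buf with comm unless this exact (comm, buf) pair is already
  // tracked. Returns true only if a new registration was made.
  // Note: this compares base pointers only and does not detect
  // overlapping ranges.
  bool register_buffer(CommId comm, void* buf, std::size_t size) {
    auto& per_comm = handles_[comm];
    if (per_comm.count(buf) != 0) {
      return false;  // Already registered with this communicator.
    }
    per_comm[buf] = register_(comm, buf, size);
    return true;
  }

  // Deregister every buffer tracked for comm, e.g. just before the
  // communicator is destroyed. Returns how many were released.
  std::size_t release_comm(CommId comm) {
    auto it = handles_.find(comm);
    if (it == handles_.end()) {
      return 0;
    }
    const std::size_t released = it->second.size();
    for (auto& entry : it->second) {
      deregister_(comm, entry.second);
    }
    handles_.erase(it);
    return released;
  }

private:
  RegisterFn register_;
  DeregisterFn deregister_;
  // Handles are stored per communicator, so the same buffer can be
  // registered once with each communicator.
  std::map<CommId, std::map<void*, RegHandle>> handles_;
};
```

Keying the table by communicator first mirrors the per-communicator registration described above: the same buffer can be registered once per communicator, and everything belonging to one communicator can be released in a single pass.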