LLNL / Aluminum

High-performance, GPU-aware communication library
https://aluminum.readthedocs.io/en/latest/

User buffer registration #224

Closed ndryden closed 7 months ago

ndryden commented 7 months ago

This adds support for user buffer registration to the NCCL API.

Because registration is only a performance optimization (a hint), every backend implements the registration interface to keep user code simple, though a backend may treat it as a no-op.
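For illustration, the "every backend implements it, some do nothing" pattern can be sketched as below. All names here (`NoOpBackend`, `TrackingBackend`, `register_buffer`, `unregister_buffer`, `with_registered_buffer`) are hypothetical and are not taken from Aluminum's actual interface; the tracking backend merely stands in for one that would call into a real registration mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical names throughout; this is not Aluminum's actual interface.

// A backend with no registration mechanism: both calls succeed and do nothing.
struct NoOpBackend {
  static void register_buffer(void* /*buf*/, std::size_t /*bytes*/) {}
  static void unregister_buffer(void* /*buf*/) {}
};

// A stand-in for a backend that does track registrations. A real backend
// would call into its communication library here instead of updating a map.
struct TrackingBackend {
  static std::unordered_map<void*, std::size_t>& registered() {
    static std::unordered_map<void*, std::size_t> table;
    return table;
  }
  static void register_buffer(void* buf, std::size_t bytes) {
    registered()[buf] = bytes;
  }
  static void unregister_buffer(void* buf) { registered().erase(buf); }
};

// Caller code is identical regardless of which backend is selected.
template <typename Backend>
void with_registered_buffer(void* buf, std::size_t bytes) {
  Backend::register_buffer(buf, bytes);
  // ... communication using buf would go here ...
  Backend::unregister_buffer(buf);
}
```

With this shape, user code never needs to check whether the selected backend supports registration; the cost on a backend without support is an empty function call.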

Two points for future improvements:

  1. Right now the NCCL backend does not use registered memory for its internal memory allocations (which show up in custom collective implementations). This is mainly because memory needs to be registered with a communicator, and I am not sure whether it can be shared across communicators.
  2. Our internal interface to the NCCL registration system does not really handle the case where the same buffer is registered with different communicators. This is basically for the same reason.

I intend to fix these once I better understand how registration works with multiple communicators.
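As a rough sketch of one possible direction for the second point, and assuming registrations turn out to be strictly per-communicator, the bookkeeping could key each entry on the (communicator, buffer) pair rather than on the buffer alone. Everything below is hypothetical: `CommId`, `Handle`, and `RegistrationTable` are placeholder names, and the integer handle stands in for whatever the underlying library would return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical sketch; the types and names are not from Aluminum or NCCL.
using CommId = int;          // stand-in for a communicator
using Handle = std::size_t;  // stand-in for a registration handle

class RegistrationTable {
public:
  // Registers buf with comm, or returns the existing handle if that
  // (comm, buf) pair is already registered.
  Handle register_buffer(CommId comm, void* buf) {
    const Key key = make_key(comm, buf);
    auto it = handles_.find(key);
    if (it != handles_.end()) return it->second;
    const Handle h = next_handle_++;
    handles_.emplace(key, h);
    return h;
  }

  // Removes one (comm, buf) registration; returns false if it was absent.
  bool unregister_buffer(CommId comm, void* buf) {
    return handles_.erase(make_key(comm, buf)) > 0;
  }

  std::size_t size() const { return handles_.size(); }

private:
  using Key = std::pair<CommId, std::uintptr_t>;

  // Store the address as an integer so map ordering is well defined.
  static Key make_key(CommId comm, void* buf) {
    return {comm, reinterpret_cast<std::uintptr_t>(buf)};
  }

  std::map<Key, Handle> handles_;
  Handle next_handle_ = 0;
};
```

Under this scheme the same buffer registered with two communicators yields two independent entries, so unregistering it from one communicator leaves the other registration intact. If registrations can in fact be shared across communicators, a simpler buffer-only key would suffice.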