Memory registration is costly when using InfiniBand (IB), so we implemented UCXBufferCommunicator to avoid paying that cost. However, UCXBufferCommunicator introduces additional overhead of its own. A better approach would be to preregister the whole RMM memory pool and use the regular UCX communicator.
One potential way to do this is to implement a new memory resource class, similar to the default CUDA memory resource currently in RMM but with its allocations registered, and to use this new class as the upstream allocator for pool_memory_resource.
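
To illustrate the layering, here is a minimal, self-contained sketch in plain C++ that uses neither RMM nor UCX. Every name in it (`RegistrationCounter`, `memory_resource`, `registered_memory_resource`, `bump_pool_resource`, `registrations_for`) is hypothetical. `std::malloc` stands in for the CUDA allocation, and `register_region` stands in for whatever registration call a real implementation would make (presumably something like UCX's `ucp_mem_map`, but that is an assumption). The point is only that when the pool's upstream registers each block it hands out, the pool's single large slab is registered once, and later sub-allocations trigger no further registration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the registration layer: it only counts calls.
struct RegistrationCounter {
  int registrations   = 0;
  int deregistrations = 0;
  void register_region(void* /*ptr*/, std::size_t /*bytes*/) { ++registrations; }
  void deregister_region(void* /*ptr*/) { ++deregistrations; }
};

// Minimal allocator interface, loosely modelled on a memory resource.
class memory_resource {
 public:
  virtual ~memory_resource() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) = 0;
};

// Upstream resource: every block it hands out is registered on allocation
// and deregistered on release. std::malloc stands in for the CUDA allocation.
class registered_memory_resource final : public memory_resource {
 public:
  explicit registered_memory_resource(RegistrationCounter& reg) : reg_(reg) {}

  void* allocate(std::size_t bytes) override {
    void* ptr = std::malloc(bytes);
    if (ptr == nullptr) { throw std::bad_alloc{}; }
    reg_.register_region(ptr, bytes);
    return ptr;
  }

  void deallocate(void* ptr, std::size_t /*bytes*/) override {
    reg_.deregister_region(ptr);
    std::free(ptr);
  }

 private:
  RegistrationCounter& reg_;
};

// Toy pool: takes one slab from the upstream at construction and carves
// sub-allocations out of it with a bump pointer (no reuse of freed blocks).
class bump_pool_resource final : public memory_resource {
 public:
  bump_pool_resource(memory_resource& upstream, std::size_t pool_bytes)
    : upstream_(upstream),
      size_(pool_bytes),
      base_(static_cast<char*>(upstream.allocate(pool_bytes))) {}

  ~bump_pool_resource() override { upstream_.deallocate(base_, size_); }

  bump_pool_resource(bump_pool_resource const&)            = delete;
  bump_pool_resource& operator=(bump_pool_resource const&) = delete;

  void* allocate(std::size_t bytes) override {
    std::size_t const aligned = (bytes + 255) / 256 * 256;  // 256-byte granularity
    if (aligned > size_ - offset_) { throw std::bad_alloc{}; }
    void* ptr = base_ + offset_;
    offset_ += aligned;
    return ptr;
  }

  void deallocate(void* ptr, std::size_t /*bytes*/) override {
    // Only checks that the pointer belongs to the slab; nothing is recycled.
    char* p = static_cast<char*>(ptr);
    assert(p >= base_ && p < base_ + size_);
    (void)p;
  }

 private:
  memory_resource& upstream_;
  std::size_t size_;
  char* base_;
  std::size_t offset_ = 0;
};

// Number of registrations performed while serving n_allocs pool allocations.
int registrations_for(int n_allocs) {
  RegistrationCounter reg;
  registered_memory_resource upstream{reg};
  bump_pool_resource pool{upstream, std::size_t{1} << 20};  // 1 MiB slab
  for (int i = 0; i < n_allocs; ++i) { pool.allocate(1024); }
  return reg.registrations;
}
```

In this sketch, `registrations_for(100)` performs one registration for the slab, however many buffers are carved out of it. A real version would derive from RMM's device memory resource interface and would need to handle pool growth, since each additional upstream allocation would be registered separately.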