rapidsai / rmm

RAPIDS Memory Manager
https://docs.rapids.ai/api/rmm/stable/
Apache License 2.0
478 stars 195 forks source link

Update RMM adaptors, containers and tests to use get/set_current_device_resource_ref() #1661

Closed harrism closed 3 weeks ago

harrism commented 1 month ago

Description

Closes #1660.

This adds a constructor to each MR adaptor to take a resource_ref rather than an Upstream*. It also updates RMM to use get_current_device_resource_ref() everywhere: in containers, in tests, in adaptors, Thrust allocator, polymorphic allocator, execution_policy, etc.

Importantly, this PR also modifies set_current_device_resource() to basically call set_current_device_resource_ref(). This is necessary, because while RMM C++ uses get_current_device_resource_ref() everywhere, the Python API still uses the raw pointer API set_current_device_resource(). So we need the latter to update the state for the former. This is a temporary bootstrap to help with the refactoring.

Checklist

wence- commented 1 month ago

On the Python side, we need someone to own concrete objects (we can't just provide classes that wrap device_async_resource_ref since that is non-owning).

Adapters also need a way to keep their upstream alive.

We are removing the device_memory_resource inheritance on the C++ side, but I think until we have an owning any_resource available, we probably need to keep something on the Cython side.

Here is a very slide-ware sketch approach for discussion:

cdef extern from * nogil:
    """
    struct hack {
        std::function<rmm::device_async_resource_ref> _to_ref;
        hack() = default; // unsafe, but this is all hacky anyway
        rmm::device_async_resource_ref to_ref() { return _to_ref(); }
        template<typename T>
        hack(std::unique_ptr<T> obj) : _to_ref{[&obj]() -> rmm::device_async_resource_ref { return *obj; }} {};
    };
    """

cdef class DeviceMemoryResource:
   cdef hack obj;
   cdef device_async_resource_ref as_ref(self):
       return self.obj.to_ref()

cdef class CudaMemoryResource(DeviceMemoryResource):
   def __cinit__(self):
       self.obj = hack(move(make_unique[cuda_memory_resource]()))

cdef class PoolMemoryResource(DeviceMemoryResource):
   cdef DeviceMemoryResource upstream
   def __cinit__(self, DeviceMemoryResource upstream):
      self.upstream = upstream
      self.obj = hack(move(make_unique[pool_memory_resource](upstream.to_ref())))

This is effectively working around not (yet) having any_resource available by doing a minimal type-erased owning wrapper.

harrism commented 1 month ago

On the Python side, we need someone to own concrete objects (we can't just provide classes that wrap device_async_resource_ref since that is non-owning).

Probably should have this discussion somewhere else? (Opened #1662 in discussions) I was planning for this to be a C++-only PR, replacing #1634.

harrism commented 1 month ago

Let's not merge until I can do some more downstream testing, including JNI.

harrism commented 3 weeks ago

/merge