Open mfoerste4 opened 2 months ago
Thanks @divyegala for the rewiew. I have adressed or commented on your immediate suggestions.
I see that the C and Python APIs need a copy and ultimately, this is because
rmm::device_buffer
in this structure as well as the model is owned by the struct/class. Instead, if we usestd::weak_ptr<rmm::device_buffer>>
or plainrmm::device_buffer&
then we can avoid copies and ownership issues by letting the C and Python API own the data but still allowing the C++ API to resize it.The C API can also release the shared_ptr/device_buffer to transfer ownership to the user, thus removing the need for a copy there.
The Python API is owned by us and so, we can create shared_ptr/device_buffer in the Cython layer and store them as class members.
I don't quite understand how this can be accomplished with the given device_buffer
implementation. Even if we pass the storages as unique pointers - we then can pass ownership of the whole device-buffer, but we will not be able to pass in/out actual memory allocations within the device_buffer.
For the python API we retrieve data as CumlArray
and want to pass the ptr into the device_buffer
. For the C-API we have fixed pointer style I/O where cannot simply switch to device_buffer*
. IIUC we would need a class like the device_buffer
but with additional capabilities to switch in between owning and not-owning.
@mfoerste4
Let me write an example for Python API and how to handle ownership, assuming the C++ API does not own the memory and instead holds onto references of the form rmm::device_buffer&
. As for the C API, I think you are right and there's no way for us to release the ownership of memory, but at least we can avoid copies in the Python world:
Python API:
As mentioned in the original issue, instead of CumlArray
we can use DeviceBuffer
from RMM Python
As seen in the .pxd
definition, the Python DeviceBuffer
class holds an underlying reference to the C++ object device_buffer
in the form of the property unique_ptr<device_buffer> c_obj
. Thus, the code looks like:
class SVM:
def __init__(self):
self.some_array = rmm.DeviceBuffer(...) # instead of CumlArray
def some_func(self):
cpp_func(deref(some_array.c_obj.get()), ...)
@divyegala , thanks for your suggestions. I have refactored the SvmModel to work with device_buffer*
in order to paste in the buffers after (de-)serialization without a copy.
As discussed in #6057 this is a PR to refactor the raw memory allocations in the svm model which are passed back to the caller.
I have removed all raw allocation pointers from the SVM model and replaced them by device_buffers. Since ownership is more restrictive now we do have additional data copies when:
CC @tfeher, @divyegala