shwina opened this issue 4 years ago
Granted it's on the Python level where I'm not too concerned, but this kinda sounds like scope creep to me.
My hope would be to keep RMM as narrowly focused as we can.
Maybe a little, though I'm less worried about that personally. Where else would you imagine it living (if not RMM)?
I'm a big fan of "Do One Thing". Scope creep is how libraries become complicated behemoths that are difficult to maintain and change.
This issue is similar to why I pushed back on https://github.com/rapidsai/rmm/pull/220. RMM shouldn't be concerned about device memory allocated by anything other than RMM. Anything beyond that is the concern of another layer or library.
I don't have sufficient Python expertise to know where `Buffer` should live, but I'm not a big fan of just tacking it onto RMM simply because it is convenient.
From my view, I'm not tacking this onto RMM because it's convenient; if anything, it's a little inconvenient. I view this as managing memory that RAPIDS libraries could potentially use; it just happens that the memory wasn't allocated by the RMM allocator.
This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.
Can you just replace `cudf.Buffer` with `rmm.DeviceBuffer`?
Unfortunately no, because a `cudf.Buffer` could contain arbitrary device-memory-backed objects underneath it, such as CuPy/Numba arrays, whereas an `rmm.DeviceBuffer` is owning.
> Unfortunately no, because a `cudf.Buffer` could contain arbitrary device-memory-backed objects underneath it, such as CuPy/Numba arrays, whereas an `rmm.DeviceBuffer` is owning.
This sounds like what in C++ we call `span`.
Yeah that seems similar.
Basically `cudf.Buffer` is a non-owning view onto the memory. It holds a reference to the object that owns the memory.
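To make the "non-owning view that holds a reference to the owner" idea concrete, here's a minimal pure-Python sketch (the `View` class and its fields are hypothetical, not cudf's actual implementation; a `bytearray` stands in for a device allocation):

```python
# Hypothetical sketch of a non-owning buffer, loosely modeled on how
# cudf.Buffer keeps the owning object alive via a reference.
class View:
    """Records a pointer and size, and holds a reference to the owner
    so the underlying memory outlives the view."""

    def __init__(self, ptr, size, owner):
        self.ptr = ptr      # raw address; this object does not free it
        self.size = size    # length in bytes
        self.owner = owner  # reference that keeps the allocation alive

    def __len__(self):
        return self.size


# Any object can play the owner; a bytearray stands in for device memory.
data = bytearray(64)
view = View(ptr=id(data), size=len(data), owner=data)
```

The key design point: as long as `view` is alive, Python's reference counting keeps `data` alive too, so the pointer stays valid.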
I hadn't seen this issue before, but I recently started thinking about this after a conversation with @shwina and @jrhemstad about GPU memory management in Python. It seems like what we're looking for is a generic `gpumemoryview` with similar semantics to Python's `memoryview`, i.e. a standard representation for a non-owning view into memory. This object would be agnostic to the source of the underlying allocation. Most libraries could operate on the view alone (similar to libcudf C++ algorithms) and Python's reference counting logic would handle issues of scope. If a library needed to allocate memory internally (e.g. upon creation of a `cudf.DataFrame`) it could create the required rmm buffers then immediately generate the view and store it under the dataframe's columns exactly as `cudf.Buffer` is used right now. This approach would allow us to think bigger than RAPIDS alone so that we could also naturally consume data owned by CuPy, PyTorch, TensorFlow, and others.
If we went this route there would be a number of open questions to address here. Such an object could work directly with `__cuda_array_interface__`, but that would differ from how Python currently works: a `memoryview` is designed to work with the Python buffer protocol at the C level, while numpy (and numpy-like array libraries) are what consume `__array_interface__`. On the other hand, developing a GPU analog to the buffer protocol is probably unnecessary given the degree to which the ecosystem has centralized around CAI, so that semantic inconsistency is probably preferable to trying to overengineer a C API. There's also the original question on this thread of where such an object might live. I would probably advocate for a standalone package at least to start with, but I could see a case for rolling it into CUDA Python or something related to it.
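For reference, the host-side semantics being used as the model here are standard Python: `memoryview` is a non-owning, zero-copy view backed by the buffer protocol, and it keeps its exporter alive:

```python
# memoryview shares memory with its exporter rather than copying it.
data = bytearray(b"gpu-memory")
view = memoryview(data)

view[0:3] = b"GPU"       # writes through to the underlying bytearray
assert bytes(data) == b"GPU-memory"
assert view.obj is data  # the view holds a reference to its exporter
```

A `gpumemoryview` would aim for the same ownership semantics, just with CAI rather than the buffer protocol describing the memory.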
CC @gmarkall @leofang
Yeah we already did this in ucx-py with `Array`. We could refactor this out into a separate thing we depend on.
cc @pentschev @madsbk @quasiben
I have been devising an interface for C/C++ libraries to share a memory pool, and I would need a way to expose that interface all the way up so users can set it from Python. This sounds like a good idea worth exploring.
> i.e. a standard representation for a non-owning view into memory.

I thought that was the purpose of CAI? What's the difference between `gpumemoryview` and CAI?
It can help to have something in Cython, as it can expose both Python and C APIs.
That's true, but in my view the distinction is a little more fundamental. CAI is purely descriptive, so objects implementing the CAI must always be converted into some internal representation like `cudf.Buffer` that can be operated on programmatically. A `gpumemoryview` would replace that so that various libraries don't need to reinvent the wheel. To @jrhemstad's (much earlier) point about scope creep, I agree that the idea of a generic view into memory is out of scope for RMM, because the underlying allocation need not come from RMM. CAI was developed so that different libraries don't need to implement a standard API in order to be interoperable. Having a standalone object representation of the allocation described by CAI would improve that interoperability and provide a standard interchange object rather than a format alone. As long as the object itself supported the CAI, we could pass it to libraries that support CAI even if they were unaware of `gpumemoryview` and it would just work, but libraries could optionally build around `gpumemoryview` (instead of e.g. `cudf.Buffer`) and avoid needing to perform any preprocessing when provided one.
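The pass-through behavior described here can be sketched in a few lines. This is a hypothetical illustration (the `GpuMemoryView` class name and a fake CAI exporter are assumptions, and no GPU is involved): the wrapper re-exposes its source's `__cuda_array_interface__`, so any CAI-consuming library accepts it unchanged.

```python
class GpuMemoryView:
    """Hypothetical interchange object: wraps any CAI exporter and
    forwards its interface so CAI-aware libraries accept it as-is."""

    def __init__(self, obj):
        if not hasattr(obj, "__cuda_array_interface__"):
            raise TypeError("object does not expose __cuda_array_interface__")
        self._obj = obj  # owner reference keeps the allocation alive

    @property
    def __cuda_array_interface__(self):
        # Forward the exporter's description unchanged.
        return self._obj.__cuda_array_interface__


# A stand-in for a CuPy/Numba array; the pointer value is fabricated.
class FakeDeviceArray:
    __cuda_array_interface__ = {
        "shape": (16,),
        "typestr": "<f4",
        "data": (0xDEADBEEF, False),
        "version": 3,
    }


gmv = GpuMemoryView(FakeDeviceArray())
```

Because `gmv` itself exposes CAI, it can be handed to CAI-only consumers; libraries that know about the wrapper can skip re-parsing the dict entirely.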
Yep, I understand. The `Array` object in UCX-Py does both those things. We can rip it out into a standalone library.
@jakirkham Sorry, didn't mean that as a correction, more as an extended explanation to answer @jrhemstad's question.
> Yep, I understand. The `Array` object in UCX-Py does both those things. We can rip it out into a standalone library.

I am very much in favor of a standalone library implementing something like `Array` and all its nice-to-have utility functions.
Happy my comment helped kickstart this discussion again! Sounds like there is agreement on a direction forward.
I'm going to work on documenting next steps for this sometime soon.
Question regarding moving `cudf.Buffer` to `rmm`:

`rmm.DeviceBuffer` is a Cython wrapper around the C++ class `rmm::device_buffer`. `cudf.Buffer` more generally represents an untyped device memory allocation: a `Buffer` can be constructed from any object exposing `__array_interface__` or `__cuda_array_interface__`, e.g., CuPy arrays, numpy arrays, etc.

Does it make sense for `Buffer` to be moved to RMM?
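The construction behavior described above can be sketched roughly as follows. This is a hypothetical simplification, not cudf's actual code: accept any object exposing either interface, and record its pointer, byte size, and a reference to the owner (a fake host array stands in for numpy, so the pointer value is fabricated).

```python
import functools
import operator


def _nbytes(iface):
    """Total bytes = product(shape) * itemsize; typestr like '<f8' -> 8."""
    itemsize = int(iface["typestr"][2:])
    return functools.reduce(operator.mul, iface["shape"], 1) * itemsize


class Buffer:
    """Hypothetical sketch: an untyped view over any array-interface
    exporter, host or device."""

    def __init__(self, obj):
        iface = getattr(obj, "__cuda_array_interface__", None) \
            or getattr(obj, "__array_interface__", None)
        if iface is None:
            raise TypeError("no array interface found")
        self.ptr, _readonly = iface["data"]
        self.size = _nbytes(iface)
        self.owner = obj  # keep the source allocation alive


# Stand-in for a numpy array: 4x2 float64 -> 64 bytes.
class FakeHostArray:
    __array_interface__ = {
        "shape": (4, 2),
        "typestr": "<f8",
        "data": (0x1000, False),
        "version": 3,
    }


buf = Buffer(FakeHostArray())
```

Note how nothing in `Buffer` depends on who allocated the memory, which is the crux of the thread: that generality is exactly what makes it feel out of place inside an allocator library like RMM.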