rapidsai / rmm

RAPIDS Memory Manager
https://docs.rapids.ai/api/rmm/stable/
Apache License 2.0
472 stars 192 forks source link

[FEA] Add `host_buffer` class #260

Open jrhemstad opened 4 years ago

jrhemstad commented 4 years ago

Is your feature request related to a problem? Please describe.

It has come up in several independent conversations w/ @jakirkham and others that it would be nice to have RMM provide a corollary to device_buffer for host allocated memory. See conversation in https://github.com/rapidsai/rmm/pull/141 and https://github.com/rapidsai/rmm/issues/216

In short, it would be convenient to have a common abstraction for host memory allocations used by RAPIDS libraries. This would allow for things like having pinned host memory allocations for use in more performant device to host memory spilling.

Describe the solution you'd like

Add a rmm::host_buffer class to act as a host memory corollary to device_buffer, i.e., untyped, uninitialized host memory allocation.

Additional context

Note that this opens a (few) sizable can of worms of questions that will eventually need to be answered. From my comment here

Here's what I think the simplest and least effort path forward is:

jakirkham commented 4 years ago

cc @kkraus14 @pentschev

kkraus14 commented 4 years ago

If a host memory pool is required, only support it in C++17 and beyond.

I assume this would be a crazy effort to backport to C++14? We typically get pushback from requiring newer compiler versions / system libraries.

jrhemstad commented 4 years ago

I assume this would be a crazy effort to backport to C++14? We typically get pushback from requiring newer compiler versions / system libraries.

Yeah, it's not really possible to backport. You'd be better off just re-implementing all of the memory pool logic or using some other open source host memory pool.

That said, we can probably insulate user libraries from needing C++17. We can wrap the C++17 bits in include guards and throw an error if someone tries to use the host memory pools pre-c++17.

Note that providing a host memory pool is orthogonal to providing a host_buffer.

harrism commented 4 years ago

I would leave host memory pools to future work.

kkraus14 commented 4 years ago

Yeah, it's not really possible to backport. You'd be better off just re-implementing all of the memory pool logic or using some other open source host memory pool.

That said, we can probably insulate user libraries from needing C++17. We can wrap the C++17 bits in include guards and throw an error if someone tries to use the host memory pools pre-c++17.

Note that providing a host memory pool is orthogonal to providing a host_buffer.

Yea I understand, just that there likely will be enterprise customers that want the memory pool and C++14 support, but beggars can't be choosers πŸ˜„.

jakirkham commented 4 years ago

Would using Boost help?

harrism commented 4 years ago

We don't want RMM to depend on Boost.

jakirkham commented 4 years ago

Yeah that makes sense. Just thinking about what alternatives we might have πŸ™‚

jakirkham commented 4 years ago

What if we take this as an opportunity to start using the Conda compilers? That would put us on GCC 7.3.0, which has C++17 support (unless I've missed something). Besides this was something we were planning to do anyways. ( https://github.com/rapidsai/cudf/issues/1210 )

harrism commented 4 years ago

I think we should just stick to C++14, and leave host memory pools for future work.

kkraus14 commented 4 years ago

What if we take this as an opportunity to start using the Conda compilers? That would put us on GCC 7.3.0, which has C++17 support (unless I've missed something). Besides this was something we were planning to do anyways. ( rapidsai/cudf#1210 )

That's only for conda packages though. We have people wanting to build from source themselves without conda and can't necessarily guarantee a new enough compiler for C++17. We've already had pushback for C++14 πŸ˜„.

jrhemstad commented 4 years ago

Agreed host memory pool is future work, just wanted to bring it up since we're talking about adding host memory management to RMM.

jakirkham commented 4 years ago

Ok makes sense. Thanks for the context and thanks for putting up with suggestions you likely have already considered πŸ™‚

harrism commented 4 years ago

Can this issue be closed now that host_memory_resource exists and rmm::alloc/free are going to be dropped?

jrhemstad commented 4 years ago

We still need a host_buffer analog to device_buffer.

harrism commented 4 years ago

Ah yes, I got that confused.

github-actions[bot] commented 3 years ago

This issue has been marked rotten due to no recent activity in the past 90d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

github-actions[bot] commented 3 years ago

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

randerzander commented 3 years ago

Still desired

jakirkham commented 3 years ago

(for context this came up today when discussing how to improve spilling and serialization performance)

cc @quasiben (for awareness)

jrhemstad commented 3 years ago

There's unlikely to be any movement here until https://github.com/NVIDIA/libcudacxx/pull/158 is complete in the next month or so.

jakirkham commented 3 years ago

Thanks for the update Jake πŸ™‚

jrhemstad commented 6 months ago

This conversation is quite stale now, so I'll give a quick update on the status quo:

harrism commented 6 months ago

Would you distinguish between async and synchronous host_buffer and device_buffer classes (so there would be 4 classes)? Or would you have two classes with both sync and async methods? Or would host_buffer always be synchronous and device_buffer always async?