rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
https://docs.rapids.ai/api/raft/stable/
Apache License 2.0
699 stars 183 forks source link

[FEA] Introduce `universal_mdspan/mdarray` vocabulary to RAFT #1819

Open divyegala opened 10 months ago

divyegala commented 10 months ago

Working on https://github.com/rapidsai/raft/pull/1748, I realized the need to introduce types that allow for memory access from host and device both, as more algorithms need to support allocating either managed or pinned memory.

How I came up with an RAII solution to allocating pinned host memory:

using pinned_memory_resource = thrust::universal_host_pinned_memory_resource;
template <typename T>
using pinned_memory_allocator = thrust::mr::stateless_resource_allocator<T, pinned_memory_resource>;

thrust::host_vector<int, pinned_memory_allocator<int>> vec;

RMM

  1. Issue with using an RMM based allocator (rmm/mr/host/pinned_memory_resource.hpp): -thrust allocates elements by keeping track of a template type T for the element to be allocated whereas RMM allocates bytes directly

    • RMM has thrust_allocator_adaptor.h which accounts for the above conversion, but unfortunately derives from thrust::device_malloc_allocator
    • Furthermore, rmm::device_uvector only allows an allocator derived from a device_memory_resource
  2. Fix

    • Modify thrust_allocator_adaptor.h to allow rmm::thrust_allocator for inheriting from any arbitrary thrust::allocator and using it with thrust::host_vector
  3. Consequences

    • Still relying on thrust::host_vector when RAFT should be able to natively support these solutions

RAFT

  1. Why a universal_* vocabulary makes the most sense

    • The standard for https://en.cppreference.com/w/cpp/container/vector defines the type for allocator to be similar to thrust's type based allocator of the form std::allocator<ElementType
    • In raft::host_mdarray we ourselves use std::allocator<ElementType> for allocations with std::vector under the hood
  2. Fix

    • Adding a pinned_memory_allocator and managed_memory_allocator to RAFT does not make much sense in terms of using simple factories to create RAFT types
    • Keeping in spirit with the current design, adding a pinned_container_policy and managed_container_policy in raft/core/universal_container_policy.hpp and creating equivalent universal_mdspan/mdarray types gives us the most flexibility and removes confusion for users and devs of having to work with allocators and memory resources
  3. Consequences

    • Simple factories raft::make_pinned_host_mdarray and raft::make_managed_mdarray in universal_mdarray.hpp
    • universal_mdarray::view() produces universal_mdspan
    • Factories raft::make_host_mdspan(universal_mdspan) and raft::make_device_mdspan(universal_mdspan) to let dev allow the type to go through generic host/device paths when needed

Conclusion We'll need updates to both RMM and RAFT as discussed above, but it will ultimately end up giving us more flexibility while letting us define an increasingly important type.

wphicks commented 10 months ago

The mdbuffer PR already includes managed and pinned mdarray support. We needed a couple workarounds for the missing RMM functionality you mentioned, but it's there. Depending on whether the mdbuffer work gets deprioritized again or not, I can split it off into a separate PR as I did for the mdspan copy utility. We can follow up with cleaner versions once those changes are in RMM.

divyegala commented 10 months ago

@wphicks did you include new verbiage for managed and pinned mdarray support in the mdbuffer PR like this issue mentions or are they a part of host_mdarray?

Please keep me in the loop when mdbuffer is higher priority on your task-list. I'd like to help you avoid any workarounds by adding/modifying code upstream to RMM

wphicks commented 9 months ago

I added new verbiage. We now have managed_mdarray and pinned_mdarray (though I'm not married to those names). I'll go ahead and post the mdbuffer draft soon and link to the specific implementation details. It would be awesome if we could avoid the current workarounds for sure! I'll touch base with you on details ASAP.

tfeher commented 8 months ago

Linking disscussion https://github.com/rapidsai/raft/pull/1896#discussion_r1387814866