rapidsai / rmm

RAPIDS Memory Manager
https://docs.rapids.ai/api/rmm/stable/
Apache License 2.0

[FEA] Python bindings for `rmm::mr::pinned_host_memory_resource` #1429

Open harrism opened 6 months ago

harrism commented 6 months ago

Is your feature request related to a problem? Please describe.

Add Python bindings for pinned_host_memory_resource added in #1392 .

willtryagain commented 6 months ago

Hi, I would like to work on this issue, but I don't fully understand how to solve it yet. I am going through the Cython documentation, and I have located the file I need to change: https://github.com/rapidsai/rmm/blob/branch-24.02/python/rmm/_lib/memory_resource.pyx

Would you mind sharing the steps to take? Here is my (maybe incorrect) understanding so far:

  1. Create a class definition like https://github.com/rapidsai/rmm/blob/branch-24.02/python/rmm/_lib/memory_resource.pyx#L126
  2. Create a Python class PinnedHostMemoryResource and define its methods
  3. Add an entry to python/rmm/mr.py
  4. Add tests to python/rmm/tests/test_rmm.py
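
A rough sketch of what steps 1 and 2 might look like (names and signatures here are my guesses, not the final API; the existing classes in memory_resource.pyx are the pattern to follow):

```cython
# Sketch only: names and signatures are guesses, to be checked against
# rmm/mr/pinned_host_memory_resource.hpp and the existing bindings.
cdef extern from "rmm/mr/pinned_host_memory_resource.hpp" \
        namespace "rmm::mr" nogil:
    cdef cppclass pinned_host_memory_resource:
        pinned_host_memory_resource() except +

cdef class PinnedHostMemoryResource:
    cdef pinned_host_memory_resource c_obj
    # methods wrapping allocate/deallocate would go here
```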
harrism commented 6 months ago

@willtryagain that sounds about right! Thanks, and we look forward to your contribution. Please make sure you also look at the RAPIDS contributing docs. This page is also useful for dates; we are just starting 24.04, so make sure you target branch-24.04 with your PR.

willtryagain commented 5 months ago

Hey @harrism ,

Can you direct me on how to debug this error, please?

I got it when I ran:

    cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX

Sorry for the late response; I was busy with other deadlines.

wence- commented 5 months ago

> Hey @harrism ,
>
> Can you direct me on how to debug this error, please?
>
> I got it when I ran:
>
>     cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX
>
> Sorry for the late response; I was busy with other deadlines.

Something looks very strange with your environment from that error message. What did you do to set up the development environment for RMM?

willtryagain commented 5 months ago

Hello @wence- ,

I followed the README instructions:

  1. git clone the latest version
  2. create a conda env using conda env create --name rmm_dev --file conda/environments/all_cuda-122_arch-x86_64.yaml. I have nvcc 12.3, so this may be a problem.
  3. follow the build instructions:

         mkdir build                                    # make a build directory
         cd build                                       # enter the build directory
         cmake .. -DCMAKE_INSTALL_PREFIX=/install/path  # configure cmake; use $CONDA_PREFIX if you're using Anaconda

Thanks.

wence- commented 5 months ago

OK, thanks. I have a CTK 12.3 install here, so maybe I can reproduce.

wence- commented 5 months ago

Hmm, I could not reproduce. On ce3af2c46b8b I did the following (I used mamba rather than conda, but everything else is the same):

git clean -fdx # there's no magic behind the curtain
mamba env create --name rmm-dev --file conda/environments/all_cuda-122_arch-x86_64.yaml
mamba activate rmm-dev
mkdir build-test
cd build-test
cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX
make -j

It completed without any issues.

Can you show the output (in the activated rmm_dev environment) of conda list?

An alternative to setting up the development environment manually is to use the devcontainer setup we have in the repository (easiest if you use VS Code, but possible without it). This needs some advance setup to install the NVIDIA Container Toolkit locally: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

willtryagain commented 5 months ago

Hey @wence- ,

Thanks. I also wanted to mention that while running cmake, the CMAKE_HAVE_LIBC_PTHREAD check failed.

Here is the conda list output.

wence- commented 5 months ago

Hmm, that is identical to mine, so I am somewhat at a loss as to what broke.

harrism commented 4 months ago

Highly recommend using dev containers to avoid problems.

willtryagain commented 4 months ago

Hi guys, I was able to build the library after switching to a different OS, and it passed all the tests. Can you tell me which methods to include in the PinnedHostMemoryResource class? I am guessing allocate and deallocate.

Also, is this correct for declaring the C++ class?

cdef extern from "rmm/mr/pinned_host_memory_resource.hpp" \
        namespace "rmm::mr" nogil:
    cdef cppclass pinned_host_memory_resource:
        pinned_host_memory_resource() except +
Thanks

harrism commented 4 months ago

This is probably a bit harder than I originally thought, because pinned_host_memory_resource is not a device_memory_resource. It's the first example of an MR that implements the cuda::mr::memory_resource concept. Therefore there will need to be new Cython bindings added.

On further thought perhaps this is not a good first issue, so I'm removing that label. Still up to you if you want to try @willtryagain, but for me to explain it I might have to figure out how to write the code. :)

willtryagain commented 4 months ago

OK @harrism, can you give me pointers to resources that would help me understand what to do? I am comfortable in C++ and Python but not very familiar with Cython bindings. I will study Cython further.

harrism commented 4 months ago

As with the extern cdef cppclass declarations in memory_resource.pyx, I think you would need an extern cppclass for pinned_host_memory_resource, and then a cdef class PinnedHostMemoryResource that calls its functions. The analogy would be device_memory_resource and DeviceMemoryResource.

But there is a problem. Unlike all the other MRs in RMM Python, PinnedHostMemoryResource is not a DeviceMemoryResource. So we need to rethink our inheritance.
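
To make the analogy concrete, here is a rough sketch of what such a pair might look like, under assumptions: it assumes an instance-callable allocate(bytes)/deallocate(ptr, bytes) pair (the actual header from #1392 may use static methods, an alignment parameter, or stream-ordered variants), and the wrapper deliberately does not inherit from DeviceMemoryResource:

```cython
# Sketch only -- not the final design. Verify signatures against
# rmm/mr/pinned_host_memory_resource.hpp before relying on this.
from libc.stdint cimport uintptr_t

cdef extern from "rmm/mr/pinned_host_memory_resource.hpp" \
        namespace "rmm::mr" nogil:
    cdef cppclass pinned_host_memory_resource:
        pinned_host_memory_resource() except +
        void* allocate(size_t bytes) except +
        void deallocate(void* ptr, size_t bytes) except +

cdef class PinnedHostMemoryResource:
    # Intentionally NOT a subclass of DeviceMemoryResource, since the
    # C++ class is not a device_memory_resource.
    cdef pinned_host_memory_resource c_obj

    def allocate(self, size_t nbytes):
        # Returns the pinned host pointer as an integer address.
        cdef void* ptr
        with nogil:
            ptr = self.c_obj.allocate(nbytes)
        return <uintptr_t>ptr

    def deallocate(self, uintptr_t ptr, size_t nbytes):
        with nogil:
            self.c_obj.deallocate(<void*>ptr, nbytes)
```

The design question raised above remains: where this class sits relative to the existing DeviceMemoryResource hierarchy is exactly what needs rethinking.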