kokkos / pykokkos-base

Python bindings for data interoperability with Kokkos (View, DynRankView)

Cupy/Numpy arrays for CudaSpace views #3

Closed NaderAlAwar closed 3 years ago

NaderAlAwar commented 4 years ago

Is there a way to get a cupy or numpy array from a CudaSpace view? This code results in a segfault when I try to create the cupy array. I also tried variations with numpy arrays and copy=True, but that didn't work either.

import cupy as cp
import kokkos

def test():
    view = kokkos.array("view", [5], dtype=kokkos.double, space=kokkos.CudaSpace)
    arr = cp.array(view, copy=False)

if __name__ == "__main__":
    kokkos.initialize()
    test()
    kokkos.finalize()

Python version: 3.8.3
Numpy version: 1.18.5
Cupy version: 7.8.0
CUDA version: 11.0

I also built the project with -DKokkos_ENABLE_CUDA=ON.

jrmadsen commented 4 years ago

Repost from Slack discussion:

So the only thing that does not work is:

    view = kokkos.array("view", [5], dtype=kokkos.double, space=kokkos.CudaSpace)
    arr = cp.array(view, copy=False)

because the constructor isn’t initializing the memory. This works:

    view = kokkos.array("view", [5], dtype=kokkos.double, space=kokkos.CudaUVMSpace)
    arr = cp.array(view, copy=False)

Or if you create Kokkos::View<double**, Kokkos::CudaSpace> in C++ code and initialize the memory, e.g.

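#include <Kokkos_Core.hpp>

// View type matching the description above: 2-D doubles in CUDA device memory.
using view_type = Kokkos::View<double**, Kokkos::CudaSpace>;

// Functor that writes i into column (i % 2) of row i.
struct InitView {
  explicit InitView(view_type _v) : m_view(_v) {}
  KOKKOS_INLINE_FUNCTION
  void operator()(const int i) const {
    m_view(i, i % 2) = i;
  }
 private:
  view_type m_view;
};

// Allocate an n x 2 view and run InitView over its rows on the Cuda execution space.
view_type generate_view(size_t n) {
  view_type _v("user_view", n, 2);
  Kokkos::RangePolicy<Kokkos::Cuda, int> range(0, n);
  Kokkos::parallel_for("generate_view", range, InitView{_v});
  return _v;
}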

and create a binding to generate_view, then this works fine:

    view = generate_view(args.ndim)
    print("Kokkos View : {} (shape={})".format(type(view).__name__,
          view.shape))
    arr = np.array(view, copy=False)
    print("Numpy Array : {} (shape={})".format(type(arr).__name__, arr.shape))

Your test case is basically something that (I don’t think) anybody would ever really do in a real-life scenario, i.e. create a non-UVM CUDA allocation, do nothing with it, and then convert it to numpy/cupy. But the other cases, where you actually do something with it in C++ and return it to Python, or create a UVM allocation and do something with it in Python before copying it over to numpy/cupy, etc., are working.

But I will get that test case working; I just hadn’t ever thought to test out that scenario.
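
As a rough illustration of what copy=False is asking for in the snippets above, the sketch below uses a plain NumPy array as a hypothetical stand-in for a host-accessible view. It does not use pykokkos-base or a GPU, so it only shows the aliasing behavior, not the CudaSpace case. The names wrap_without_copy and host_buffer are illustrative assumptions. np.asarray is used here as the "no copy when possible" counterpart of np.array(..., copy=False) in the NumPy 1.x versions discussed in this thread.

```python
import numpy as np


def wrap_without_copy(buffer):
    """Return a NumPy array that aliases `buffer` when no copy is needed."""
    # For an input that is already an ndarray, np.asarray does not copy the data.
    return np.asarray(buffer)


# Hypothetical stand-in for a host-accessible view: 5 doubles, explicitly initialized.
host_buffer = np.zeros(5, dtype=np.float64)
arr = wrap_without_copy(host_buffer)

# A write through the wrapper lands in the original allocation.
arr[2] = 42.0
shares = np.shares_memory(arr, host_buffer)
print(shares, host_buffer[2])
```

Because the wrapper and the original share one allocation, the array is only usable if that memory is readable from wherever the array library touches it, which is why the memory space of the view matters in the cases discussed above.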