NVIDIA / grcuda

Polyglot CUDA integration for the GraalVM

Add possibility to copy data from off-heap memory pointer into DeviceArray and vice versa #12

Closed muellren closed 5 years ago

muellren commented 5 years ago

This issue gives programmers who use native code (e.g., Python programmers using NumPy) a backdoor for exchanging data with DeviceArrays.

Example Use Case

A NumPy array is created in a Python script by some ML package. Its content should then be copied into a DeviceArray, processed by a kernel, and the result written back to the NumPy array for further processing.

```python
import numpy as np
import polyglot

# 1-dimensional NumPy and device arrays
arr = np.array([42, 2, 3, 4, 5], dtype=np.int32)
dev_arr = polyglot.eval(language='grcuda', string='int[5]')

# copy content from NumPy array to DeviceArray
dev_arr.copyFrom(int(np.int64(arr.ctypes.data)), len(arr))

# process device array in-place by GPU kernel
source = """__global__
void incr(int *arr, int n) {
  int idx = threadIdx.x + blockIdx.x * blockDim.x;
  if (idx < n) {
    arr[idx] += 1;
  }
}
"""
buildkernel = polyglot.eval(language='grcuda', string='buildkernel')
kernel = buildkernel(source, 'incr', 'pointer, sint32')
kernel(20, 256)(dev_arr, len(arr))

# copy DeviceArray back to NumPy array
dev_arr.copyTo(int(np.int64(arr.ctypes.data)), len(arr))
print(arr)
```
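The launch above hardcodes a grid of 20 blocks with 256 threads each. A common pattern is to derive the block count from the array length with ceiling division; a minimal sketch (plain Python, not part of the original example, with `launch_config` as a hypothetical helper name):

```python
# Ceiling-division launch configuration: enough blocks of
# `threads_per_block` threads to cover all n elements. The kernel's
# bounds check (idx < n) handles the excess threads in the last block.
def launch_config(n, threads_per_block=256):
    num_blocks = (n + threads_per_block - 1) // threads_per_block
    return num_blocks, threads_per_block

print(launch_config(5))     # (1, 256)
print(launch_config(5000))  # (20, 256)
```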

Note: accessing raw pointers is inherently unsafe. Invalid pointers that are passed to grCUDA lead to undefined behavior, e.g., a crash of the VM. This is a stopgap solution until there exists a safe way to exchange pointers to native arrays used by the different GraalVM languages, e.g., through higher-level abstractions.
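To illustrate what the integer argument to `copyFrom`/`copyTo` denotes, here is a pure-`ctypes` sketch (no NumPy or grCUDA involved) that obtains a raw address as a Python int and dereferences it; `arr.ctypes.data` on a NumPy array yields the same kind of value:

```python
import ctypes

# Allocate a native int32 buffer, analogous to the memory a NumPy
# array wraps.
buf = (ctypes.c_int32 * 5)(42, 2, 3, 4, 5)

# ctypes.addressof returns the raw address as a plain Python int --
# the kind of value copyFrom/copyTo expect as their first argument.
ptr = ctypes.addressof(buf)

# Reading back through the address shows it refers to the same memory;
# this is also why a stale or wrong address is undefined behavior.
view = (ctypes.c_int32 * 5).from_address(ptr)
print(list(view))  # [42, 2, 3, 4, 5]
```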

muellren commented 5 years ago

This temporarily addresses issue #6.

muellren commented 5 years ago

Pull request #13 provides a solution.

muellren commented 5 years ago

closed as resolved