Conflict with GPUArray get() function and overloaded Array get() function for CUDA api during asynchronous copies

Hey Bogdan,

When I use the CUDA API and try to copy memory asynchronously from the device back to the host, I get a TypeError saying that get() does not accept the ary keyword argument. The following code reproduces the problem.

# Imports
import pycuda.driver as drv
import pycuda.autoinit

import numpy as np
import reikna.cluda as cluda

shape = (1024, 1024)
float_type = np.float32

api = cluda.cuda_api()
thr = api.Thread.create()

# Create a Reikna array
array_d = thr.array(shape, dtype=float_type)

# Create a pagelocked (pinned) NumPy array for the asynchronous copy
array_h = drv.pagelocked_empty(shape, float_type)

# Try two methods to asynchronously copy memory to the pinned NumPy array
thr.from_device(array_d, dest=array_h, async_=True)
array_d.get_async(ary=array_h)

# Wait for the asynchronous copies to finish
thr.synchronize()

This raises the following error:

  File "reikna_array.py", line 21, in <module>
    thr.from_device(array_d, dest=array_h, async_=True)
  File "/path_to_python/lib/python3.7/site-packages/reikna/cluda/cuda.py", line 200, in from_device
    arr_cpu = arr.get_async(stream=self._queue, ary=dest)
  File "/path_to_python/lib/python3.7/site-packages/pycuda/gpuarray.py", line 311, in get_async
    return self.get(ary=ary, async_=True, stream=stream)
TypeError: get() got an unexpected keyword argument 'ary'

I suspect the problem arises because get_async() in the PyCUDA GPUArray class calls self.get(ary=..., async_=True, stream=...), and that call resolves to the get() method redefined in the Reikna Array class. Reikna's Array is a subclass of GPUArray, and its get() apparently does not accept the ary keyword.
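To illustrate the suspected mechanism without needing a GPU, here is a minimal pure-Python sketch. The classes BaseArray, NarrowArray and CompatibleArray are hypothetical stand-ins, not the real PyCUDA or Reikna classes; only the get_async() to get() call mirrors the traceback above, and I am assuming the overriding get() takes no extra arguments.

```python
class BaseArray:
    """Hypothetical stand-in for pycuda.gpuarray.GPUArray (heavily simplified)."""

    def get(self, ary=None, async_=False, stream=None):
        # The real method copies device memory; here we just echo the arguments.
        return ("base.get", ary, async_)

    def get_async(self, stream=None, ary=None):
        # Same call shape as in the traceback. self.get is looked up
        # dynamically, so a subclass override is what actually runs.
        return self.get(ary=ary, async_=True, stream=stream)


class NarrowArray(BaseArray):
    """Hypothetical stand-in for a subclass whose get() has a narrower signature."""

    def get(self):
        return BaseArray.get(self)


class CompatibleArray(BaseArray):
    """One possible shape for a compatible override (an assumption, not Reikna's fix)."""

    def get(self, ary=None, async_=False, stream=None):
        return BaseArray.get(self, ary=ary, async_=async_, stream=stream)


def reproduce():
    """Return the TypeError message triggered by the narrow override, or None."""
    try:
        NarrowArray().get_async(ary=[0.0])
    except TypeError as exc:
        return str(exc)
    return None


message = reproduce()
buffer = [0.0]
compatible_result = CompatibleArray().get_async(ary=buffer)
```

If this is what happens, either keeping the base class keyword arguments in the overriding get() or overriding get_async() as well should avoid the error, though I have not checked which would suit Reikna better.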

fjarri / reikna, issue #55