inducer / pycuda

CUDA integration for Python, plus shiny features
http://mathema.tician.de/software/pycuda
Other
1.82k stars 286 forks source link

broadcasting for distance calculation #196

Open kylemcdonald opened 5 years ago

kylemcdonald commented 5 years ago

I would like to compute squared euclidean distance between all (m x n) combinations of two lists with length m and n using pycuda. Here is some numpy-compatible code:

def sqdist(a, b):
    m = len(a)
    n = len(b)
    diff = a.reshape(m,1,2) - b.reshape(1,n,2)
    diff **= 2
    return diff.sum(axis=2)

This doesn't work with pycuda, so I wrote a similar chunk of test code using pycuda:

import numpy as np
import pycuda.gpuarray as gpuarray
import pycuda.autoinit
a = np.random.random((10, 1, 2)).astype(np.float32)
b = np.random.random((1, 10, 2)).astype(np.float32)
a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)
out = a_gpu - b_gpu

The error in both cases is as follows:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-2-22a0d9afa207> in <module>()
     12 a_gpu = gpuarray.to_gpu(a)
     13 b_gpu = gpuarray.to_gpu(b)
---> 14 out = a_gpu - b_gpu
     15 
     16 # sqdist(a_gpu, b_gpu).shape

~/anaconda3/lib/python3.6/site-packages/pycuda/gpuarray.py in __sub__(self, other)
    449         if isinstance(other, GPUArray):
    450             result = self._new_like_me(_get_common_dtype(self, other))
--> 451             return self._axpbyz(1, other, -1, result)
    452         else:
    453             if other == 0:

~/anaconda3/lib/python3.6/site-packages/pycuda/gpuarray.py in _axpbyz(self, selffac, other, otherfac, out, add_timer, stream)
    331         """Compute ``out = selffac * self + otherfac*other``,
    332         where `other` is a vector.."""
--> 333         assert self.shape == other.shape
    334         if not self.flags.forc or not other.flags.forc:
    335             raise RuntimeError("only contiguous arrays may "

AssertionError: 

Does pycuda not support broadcasting? I checked the FAQ and searched the documentation for "broadcasting" and couldn't find any clear statement on this. Or is it necessary to write a custom kernel to handle this? Thank you :)

inducer commented 5 years ago

I'd be happy to take a PR to realize this.

zzjjbb commented 3 years ago

I just write a package to hack the code in Elementwise to support basic broadcasting and also extend GPUArray https://github.com/zzjjbb/broadcasting_pycuda Currently, it's experimental and I don't have time to merge it in pycuda. If you want to use some other functions, you can just copy & paste that part from pycuda. Maybe also need to modify it a little by yourself. Use at your own risk...