inducer / pyopencl

OpenCL integration for Python, plus shiny features
http://mathema.tician.de/software/pyopencl

New np.isscalar checks in array arithmetic break operations with unit-length arrays #663

Closed zachjweiner closed 1 year ago

zachjweiner commented 1 year ago

#620 (11100df4f3d8a01b85d1667a3cc344176a98f814) causes operations between CL arrays and unit-sized numpy arrays to fall back to numpy to perform the loop. For example:

import numpy as np
import pyopencl as cl
import pyopencl.array as cla

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
ary = cla.zeros(queue, (10**5,), "float64")

ary += 1.  # first run

from time import time

s = time()
ary += 1.
e = time()
print(e-s)

s = time()
ary += np.array([1.])
e = time()
print(e-s)
print(type(ary), type(ary[0]))

yields, e.g.,

5.054473876953125e-05
7.800684928894043
<class 'numpy.ndarray'> <class 'pyopencl.array.Array'>

Namely, each element of the CL array is added to the unit-sized numpy array one by one, which of course is extremely slow. The result also rebinds ary to a numpy array of CL scalar arrays!
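The fallback can be reproduced without an OpenCL device. The sketch below uses a hypothetical stand-in class, DeviceArray, which is not pyopencl's implementation; it only assumes that the arithmetic methods return NotImplemented for operands that fail np.isscalar, and that indexing returns another instance of the class. Under those assumptions, Python hands the operation to numpy's reflected operator, which treats the object as a sequence and loops over it element by element.

```python
import numpy as np


class DeviceArray:
    """Hypothetical stand-in for a device array (not pyopencl's actual code)."""

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float64)

    def __len__(self):
        # Raises TypeError for 0-d data, so numpy treats single
        # elements as opaque objects instead of nested sequences.
        return len(self._data)

    def __getitem__(self, index):
        # Indexing yields another DeviceArray; out-of-range raises IndexError.
        return DeviceArray(self._data[index])

    def __add__(self, other):
        if isinstance(other, DeviceArray):
            return DeviceArray(self._data + other._data)
        if np.isscalar(other):
            return DeviceArray(self._data + other)
        # Anything else, including a length-1 ndarray, is handed back to Python.
        return NotImplemented

    __radd__ = __add__

    def __iadd__(self, other):
        result = self.__add__(other)
        if result is NotImplemented:
            return NotImplemented
        self._data = result._data
        return self


# A length-1 ndarray is not a scalar as far as np.isscalar is concerned.
print(np.isscalar(1.0), np.isscalar(np.array([1.0])))  # True False

ary = DeviceArray(np.zeros(4))
ary += 1.0                 # handled by DeviceArray itself
ary += np.array([1.0])     # DeviceArray declines, numpy's reflected add takes over
print(type(ary).__name__, type(ary[0]).__name__)  # ndarray DeviceArray
```

The second in-place addition silently rebinds ary to an object-dtype numpy array whose entries are zero-dimensional DeviceArray instances, which mirrors the output reported above.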

Before, unit-sized arrays were simply passed along to kernels. (I can't see precisely why that used to work, but it did!) I wouldn't think it's ever desirable to fall back on numpy to perform the loops. I would think incompatible operations should fail ungracefully instead: array operations that would be valid if both operands were numpy arrays, or both were CL arrays, would then crash at kernel invocation when trying to pass a numpy array pointer.
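For illustration only, here is one way a class can make the mixed operation fail immediately instead of looping in numpy. This is a different route from crashing at kernel invocation, and nothing in this thread says pyopencl does or should do this. The sketch relies on numpy's documented opt-out, setting `__array_ufunc__ = None`, on a hypothetical stand-in class, StrictDeviceArray.

```python
import numpy as np


class StrictDeviceArray:
    """Hypothetical stand-in; the `__array_ufunc__ = None` line is the point."""

    # Documented numpy opt-out: ufuncs raise TypeError when given this type,
    # so numpy never converts it to an object array and loops over it.
    __array_ufunc__ = None

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float64)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return StrictDeviceArray(self._data[index])

    def __add__(self, other):
        if isinstance(other, StrictDeviceArray):
            return StrictDeviceArray(self._data + other._data)
        if np.isscalar(other):
            return StrictDeviceArray(self._data + other)
        return NotImplemented

    __radd__ = __add__


ary = StrictDeviceArray(np.zeros(4))
try:
    ary += np.array([1.0])
    outcome = "fell back to numpy"
except TypeError:
    outcome = "TypeError"
print(outcome)
```

With the opt-out in place the statement raises TypeError and ary keeps its original type, so the slow path cannot be entered by accident.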

zachjweiner commented 1 year ago

It seems I managed to duplicate one of my own issues, #505 🤦 Somehow I had residual usage of __iadd__ with unit-sized arrays, which didn't have the isscalar check added until #620. I guess the conversation can continue in #505.