zachjweiner closed this issue 1 year ago
It seems I managed to duplicate one of my own issues, #505 🤦 Somehow I had residual usage of `__iadd__` with unit-sized arrays, which didn't have the `isscalar` check added until #620. I guess the conversation can continue in #505.
#620 (11100df4f3d8a01b85d1667a3cc344176a98f814) causes operations between CL arrays and unit-sized numpy arrays to fall back to numpy to perform the loop. For example:
yields, e.g.,
Namely, each element of the CL array is added to the unit-sized numpy array one at a time, which of course is superbly slow. The result turns `ary` into a numpy array of CL scalar arrays!

Before, unit-sized arrays were simply passed along to kernels. (I can't see precisely why that used to work, but it did!) I wouldn't think it's ever desirable to fall back on numpy to perform the loops. I would think incompatible operations should fail ungracefully: array operations that would be valid if both operands were numpy arrays, or both were CL arrays, should simply crash at kernel invocation when trying to pass a numpy array pointer.
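For illustration, here is a minimal sketch of one way an array type can make mixed operations with numpy arrays fail loudly instead of letting numpy loop over the elements. It relies on NumPy's documented `__array_ufunc__ = None` opt-out. `DeviceArray` and `fails_loudly` are hypothetical stand-ins made up for this example; this is not PyOpenCL's actual implementation, and the "device" data here is just a host-side numpy buffer.

```python
import numpy as np


class DeviceArray:
    """Hypothetical stand-in for a device array (not pyopencl.array.Array)."""

    # Setting this to None tells NumPy not to handle binary operations
    # involving this type, so ndarray.__add__/__radd__ return NotImplemented
    # rather than falling back to an element-by-element object loop.
    __array_ufunc__ = None

    def __init__(self, data):
        # Assumption: a host-side buffer stands in for device memory.
        self._data = np.asarray(data, dtype=np.float64)

    def __add__(self, other):
        if np.isscalar(other):
            return DeviceArray(self._data + other)
        if isinstance(other, DeviceArray):
            return DeviceArray(self._data + other._data)
        # Anything else, including numpy arrays, is rejected.
        return NotImplemented

    __radd__ = __add__


def fails_loudly(lhs, rhs):
    """Return True if lhs + rhs raises TypeError."""
    try:
        lhs + rhs
    except TypeError:
        return True
    return False


ary = DeviceArray([1.0, 2.0, 3.0])
unit = np.array([1.0])

print(fails_loudly(ary, unit))   # device array on the left
print(fails_loudly(unit, ary))   # numpy array on the left
print((ary + 1.0)._data)         # plain scalars still work
```

With both operands declining the operation, Python raises `TypeError` right away, so the mixed case never reaches a slow per-element loop. Whether unit-sized numpy arrays should instead be accepted and treated as scalars is a separate design question.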