Open fcharras opened 2 years ago
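Minimal reproducer using latest syntax:
import math

import dpctl
import dpnp
import numba_dpex as dpx  # alias assumed from the `dpx.` prefix used below


def test_minimal():
    N = 10
    device = dpctl.select_default_device()

    @dpx.kernel
    def func(a):
        i = dpx.get_global_id(0)
        # math.ceil yields a float inside the kernel, so `i` cannot index `a`
        i = math.ceil(i + 0.5) - 1
        a[i] = a[i] + 1

    a = dpnp.ones(N, dtype=dpnp.float32, device=device)
    func[dpx.Range(N)](a)
    print(a.asnumpy())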
@fcharras The issue here is that the `math.ceil` and `math.floor` functions are replaced by the SYCL equivalents, which only support floating-point values. We are looking at a solution where the code generator adds a cast so that the functions behave the same way as the Python `math` functions.
Just to add that all C/C++ implementations out there do `float --> float`, `double --> double` for `ceil()` and `floor()`, including SYCL and OpenCL.
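To make the type difference concrete, here is a minimal CPython-only sketch. `float_preserving_ceil` is a hypothetical stand-in for a C/SYCL-style `ceil` (float in, float out), and `python_style_ceil` shows the kind of cast described above. Neither name comes from numba-dpex, and this is not the actual code generator change.

```python
import math


def float_preserving_ceil(x: float) -> float:
    """Hypothetical stand-in for a C/SYCL-style ceil: float in, float out."""
    return float(math.ceil(x))


def python_style_ceil(x: float) -> int:
    """Apply the float-preserving ceil, then cast to int like CPython's math.ceil."""
    return int(float_preserving_ceil(x))


raw = float_preserving_ceil(2.5)
cast = python_style_ceil(2.5)
print(type(raw).__name__, raw)    # float 3.0
print(type(cast).__name__, cast)  # int 3
```

Only the cast result can be used directly as an array index, which is what the reproducer relies on.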
(Sorry for the lack of feedback this week, I took some time off.) Practically speaking, I wouldn't say this issue is too bothersome; it's more a matter of clarity. Python users will refer to the Python documentation of the `math` module, or will open an interpreter and check the output type, and expect it to behave the same in a kernel.
Maybe it's fine to keep SYCL-like behavior, if the documentation gives the differences with the Python implementation. Or maybe instead of reusing the `math` namespace, it could be exposed directly in a `numba_dpex.math` module? Users will be less likely (and less right to do so) to assume that `dpex.math.floor` calls within a kernel should behave like `math.floor` behaves in CPython.
Hi @fcharras could you please test this branch https://github.com/chudur-budur/numba-dpex/tree/github-759 and verify if they are returning `int`s? I think this issue is fixed in #960.
For me, #960 indeed fixes the issue.
Just want to add that I realized this mistake I made in the OP: when saying

> it returns a `float64`

I didn't realize that the output type depends on the input type, and thought it would always invoke `float64` compute and thus crash GPUs that do not have the `float64` aspect. In fact it seems that `math.ceil(x)` works fine on those GPUs anyway if `x` is `float32` or `int32`, so casting both beforehand and afterwards (to `int`) would have solved this issue.
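A rough sketch of that cast-before, cast-after workaround, written as plain CPython so it runs without a device. `ceil_index` is a hypothetical helper name. Inside a real kernel the input cast would presumably target `float32` rather than Python's `float`, which is an assumption here.

```python
import math


def ceil_index(i: int) -> int:
    """Index arithmetic from the reproducer, with explicit casts on both sides."""
    x = float(i) + 0.5            # cast beforehand (float32 in a kernel, by assumption)
    return int(math.ceil(x)) - 1  # cast afterwards so the result can index an array


a = [1.0] * 10
for i in range(len(a)):
    a[ceil_index(i)] += 1.0       # same update as `a[i] = a[i] + 1` in the kernel
print(a)                          # every element is incremented exactly once
```

For non-negative integer `i`, `ceil(i + 0.5) - 1` is `i` again, so each work item updates its own element.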
`math.ceil` and `math.floor` are supported within a numba kernel, and when used within a `CPython` interpreter they return an `int`. But within a `dpex.kernel` they return a `float64`, which can be the source of several issues:
- `math` functions cannot be used to index a `dpex.local.array`
- `float64` support

Here is a toy example to reproduce the issue:
It fails both on Intel DevCloud and in a custom environment with more up-to-date versions