Open antonwolfy opened 6 months ago
The rounding mode is not exactly at fault here. Per the array API:
"Rounds the result of dividing each element x1_i of the input array x1 by the respective element x2_i of the input array x2 to the greatest (i.e., closest to +infinity) integer-value number that is not greater than the division result."
In this case, 1.0 > 0.99999994, so 0.0 is the appropriate result. So the behavior checks out per the array API. The surprising result is caused by the division itself being inaccurate, possibly due to lower precision on GPU devices.
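To make the point above concrete, here is a minimal sketch in plain Python (no dpctl or SYCL involved). The helper `to_float32` is a hypothetical name introduced only for this illustration; it rounds a Python float to single precision so that the quotient quoted in this thread can be reproduced on any machine.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Quotient reported for the GPU device in this thread, stored as float32.
gpu_quotient = to_float32(0.99999994)

# Flooring follows the array API wording: the greatest integer that is
# not greater than the division result.
floor_gpu = math.floor(gpu_quotient)  # quotient is just below 1
floor_exact = math.floor(1.0)         # quotient is exactly 1

print(gpu_quotient < 1.0, floor_gpu, floor_exact)
```

The flooring step behaves as specified in both cases; the two results differ only because the quotient handed to it differs.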
@ndgrigorian, now I see, thank you for the clarification. Would it be worth adding special handling in the code then? Something like:
if (sycl::fmod(in1, in2) == 0) {
    return resT(std::rint(in1 / in2));
}
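The idea behind this proposal can be sketched in plain Python. This is only an illustration under stated assumptions, not dpctl code: `floor_divide_with_guard` and `lossy_divide` are hypothetical names, and `lossy_divide` merely imitates a division whose result lands slightly below the exact quotient, as reported for the GPU device. A zero divisor is not handled here.

```python
import math


def lossy_divide(a: float, b: float) -> float:
    """Hypothetical stand-in for an inexact division (assumption):
    the result is scaled down by a float32-sized relative error."""
    return (a / b) * (1.0 - 2.0 ** -24)


def floor_divide_with_guard(in1: float, in2: float, divide=lossy_divide) -> float:
    """Floor division that rounds to nearest when in1 is a multiple of in2."""
    if math.fmod(in1, in2) == 0:
        # The exact quotient is an integer, so round the computed
        # quotient to the nearest integer instead of flooring it.
        return float(round(divide(in1, in2)))
    return float(math.floor(divide(in1, in2)))


plain = math.floor(lossy_divide(3.0, 3.0))     # flooring the inexact quotient
guarded = floor_divide_with_guard(3.0, 3.0)    # guard takes the rint-like path
print(plain, guarded)
```

The guard relies on the remainder being computed exactly; whether that holds for `sycl::fmod` on every device, and what the extra call costs, would need to be checked separately.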
In the example below, the behavior differs between CPU and GPU devices:
So we have 0 as 7th element of the result array for GPU device and 1 on CPU and in numpy.
If we look into divide function output:
there will be the value 0.99999994 < 1. for GPU device.
Based on the code:
dpctl uses sycl::floor() function, which is intended to return
And I guess this is the reason why 0.99999994 rounds to 0 here.
While in Python array API it states that:
Thus I wonder if it is expected dpctl behavior or an issue.