Open b-sumner opened 3 months ago
The new correctly rounded divide test for half precision
Which reminds me, we should probably disable the correctly rounded tests for fp16 and fp64; see #1901 .
However, for fp16 I believe the divide_cr
and divide
tests are identical, since fp16 division must be correctly rounded (at least for the full profile). So if I'm not mistaken this issue also affects the regular fp16 divide
test?
When running the new fp16 divide test I get this kind of error:
ERROR: divide: 0.500000 ulp error at {0x1.428p-13 (0x090a), -0x1.ep+6 (0xd780)}
Expected: -0x1.6p-20 (half 0x8016)
Actual: -0x1.5p-20 (half 0x8015) at index: 61714
As the specification states that fp16 precision for divide is <= 1 ulp
, I'm not sure if this is a real error, or a bug somewhere in the CTS.
Alright, it feels like the the half_ulp is set to
My mistake, I was looking at the embedded profile. For the full profile, the division needs to be correctly rounded.0.0f
here. I'll open another issue and will make a PR for it.
~Alright, it feels like the the half_ulp is set to
0.0f
here. I'll open another issue and will make a PR for it.~ My mistake, I was looking at the embedded profile. For the full profile, the division needs to be correctly rounded.
For the embedded profile we need to account for the <= 1 ulp
, so we'll still need a separate issue I think (edit: we actually have #1685 for this already).
@b-sumner are you seeing any failures related to the excess roundings? If so, could you share some of the problematic input values?
I haven't studied it closely, but this suggests some theorems for safe double rounding that could be helpful: https://dl.acm.org/doi/abs/10.1145/221332.221334
I thought I saw another patch about dropping the correctly rounded tests for non-fp32 floating point types, so perhaps this should be closed.
But while I'm here I want to ask a related question about the max ulp error for half division. What I see in function_list.cpp is:
412 { "divide", 413 "/", 414 { (void)reference_divide }, 415 { (void)reference_dividel }, 416 { (void*)reference_relaxed_divide }, 417 2.5f, 418 0.0f, 419 0.0f, 420 3.0f, 421 2.5f, 422 INFINITY, 423 FTZ_OFF, 424 RELAXED_ON, 425 binaryOperatorF },
Line 419 sets the half_ulps to 0.0. I would like to know where in the spec this requirement of correct rounding for half division appears or if the specified error is higher but simply not properly reflected in the test.
I would like to know where in the spec this requirement of correct rounding for half division appears
Table 69 in https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#relative-error-as-ulps
Thank you.
The new correctly rounded divide test for half precision, located in binary_operator_half.cpp is using an fptr for its reference function and computing the reference like this:
Here func.f_ff works out to reference_divide(). So r[j] starts with the double precision rounded result of the divide, rounds it to single precision and then rounds that to half. That's 3 roundings instead of the required single rounding.
Shouldn't this test be disabled until a correct reference is used?