dhermes opened 7 years ago
Heart of the problem:
```
(s^2 + 60 s) / 512 = 3 / 128 = '0x1.8000000000000p-6'
                             = '0x0.030000000000000p+1'
1076 / 512 = 269 / 128 = '0x1.0d00000000000p+1'
```
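These identities can be spot-checked in Python with the standard library (a quick sketch, not part of the original issue):

```python
from fractions import Fraction

# 3 / 128 is exactly representable; check its hex form.
assert (3 / 128).hex() == "0x1.8000000000000p-6"
assert float.fromhex("0x1.8000000000000p-6") == Fraction(3, 128)

# 1076 / 512 reduces to 269 / 128, also exactly representable.
assert (1076 / 512).hex() == "0x1.0d00000000000p+1"
assert Fraction(1076, 512) == Fraction(269, 128)
```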
For example:

```
s = s* + 2^(-55)
  = '0x1.983e62b67adefp-3'
==> s (s + 60) / 512 = 3/128 + 2^(-58)
                     = '0x1.8000000000001p-6'
                     = '0x0.030000000000002p+1'
```
----
```
s = s* + 2^(-54)
  = '0x1.983e62b67adf0p-3'
==> s (s + 60) / 512 = 3/128 + 2^(-57)
                     = '0x1.8000000000002p-6'
                     = '0x0.030000000000004p+1'
```
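Treating the rounded value `0x1.983e62b67adeep-3` as the reference, the perturbed hex strings above can be confirmed directly (a sketch; `math.ulp` needs Python 3.9+, and 2^(-55) is one ULP at this exponent):

```python
import math

s_star = float.fromhex("0x1.983e62b67adeep-3")

# One ULP at exponent -3 is 2^(-3 - 52) = 2^-55.
assert math.ulp(s_star) == 2.0 ** -55
# Adding one and two ULPs produces the perturbed values quoted above.
assert (s_star + 2.0 ** -55).hex() == "0x1.983e62b67adefp-3"
assert (s_star + 2.0 ** -54).hex() == "0x1.983e62b67adf0p-3"
```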
So anything that differs in the last 6-7 bits from the true value of s
ends up contributing to bits of (s^2 + 60 s) / 512
that just get dropped.
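A sketch of the dropped bits (my own illustration, not code from the issue): evaluated exactly with `Fraction`, nearby inputs give distinct outputs, but float64 arithmetic collapses them all to the same double:

```python
from fractions import Fraction

def poly_float(s):
    # Horner evaluation in float64.
    return ((s + 60.0) * s + 1076.0) / 512.0

def poly_exact(s):
    # The same polynomial evaluated in exact rational arithmetic.
    s = Fraction(s)
    return ((s + 60) * s + 1076) / 512

s0 = float.fromhex("0x1.983e62b67adeep-3")
nearby = [s0 + k * 2.0 ** -55 for k in range(-4, 5)]

# Exactly, every input gives a different rational output ...
assert len({poly_exact(s) for s in nearby}) == len(nearby)
# ... but in float64 the differing low-order bits are dropped.
assert {poly_float(s) for s in nearby} == {2.125}
```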
If we replace s with S - 30 (i.e. S = s + 30), then (s^2 + 60 s + 1076) / 512 becomes (S^2 + 176) / 512, and literally only '0x1.e3307cc56cf5cp+4' (S = 30.199337741083) makes that expression equal to 2.125. This is because we have less wiggle room between the values:
```
176 / 512 = 11 / 32 = 0.34375 = '0x1.6000000000000p-2'
                              = '0x0.58000000000000p+0'
S^2 / 512 = 57 / 32 = 1.78125 = '0x1.c800000000000p+0'
```
and the non-constant part has the bigger exponent, meaning the "wrong" bits don't get dropped.
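By contrast with the s-form, the recentered expression is sensitive to the last bit of S. A sketch of this (the neighbor values are one ULP up and down; `math.nextafter` needs Python 3.9+):

```python
import math

S = float.fromhex("0x1.e3307cc56cf5cp+4")
# Only this exact double produces 2.125.
assert (S * S + 176.0) / 512.0 == 2.125

# One ULP up or down already changes the computed output.
up = math.nextafter(S, math.inf)
down = math.nextafter(S, -math.inf)
assert (up * up + 176.0) / 512.0 != 2.125
assert (down * down + 176.0) / 512.0 != 2.125
```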
Post-script:
This S isn't that helpful: we can shift back to s = S - 30 (even without round-off), but we end up with the wrong answer:
```python
>>> S = float.fromhex('0x1.e3307cc56cf5cp+4')
>>> s = S - 30.0
>>> s.hex()
'0x1.983e62b67ae00p-3'
>>> from sympy import Rational as R
>>> R(S) - 30 == R(s)  # Check that there is no round-off in `s`
True
```
It is just "missing" the last 8 bits / 2 hex digits, which corresponds to an 18 ULP error:
```python
>>> import numpy as np
>>> expected_s = float.fromhex('0x1.983e62b67adeep-3')
>>> (s - expected_s) / np.spacing(s)
18.0
>>> 0xe00 - 0xdee
18
```
Starting with our target value, we want to find the "exact" interval which must round to that value:
```python
>>> import numpy as np  # 1.12.0
>>> import sympy  # 1.0
>>>
>>> correct_output = 2.125
>>> output_spacing = sympy.Rational(np.spacing(correct_output))
>>> correct_output = sympy.Rational(correct_output)
>>>
>>> half_left = correct_output - output_spacing / 2
>>> half_right = correct_output + output_spacing / 2
```
These values are half a ULP away from the desired output. Then we can evaluate our polynomial exactly (using `sympy.Rational` objects):
```python
>>> def poly_eval(val):
...     return ((val + 60) * val + 1076) / 512
...
```
We want to consider exact floating point values near our expected `t`:
```python
>>> correct_input = float.fromhex('0x1.983e62b67adeep-3')
>>> input_spacing = sympy.Rational(np.spacing(correct_input))
>>> correct_input = sympy.Rational(correct_input)
```
Considering 200 ULPs around the expected `t`, we track the ones that end up in our interval that must round to 2.125:
```python
>>> import six  # 1.10.0
>>>
>>> accepted = []
>>> for delta in six.moves.xrange(-200, 200 + 1):
...     curr_input = correct_input + delta * input_spacing
...     curr_output = poly_eval(curr_input)
...     if half_left <= curr_output <= half_right:
...         accepted.append(delta)
...
```
It turns out this amounts to the interval of values within 67 ULPs (in either direction) of the expected `t`:
```python
>>> accepted == list(six.moves.xrange(-67, 67 + 1))
True
```
The "best" answer is not exactly correct, but its residual is less than 0.05% of a ULP:
```python
>>> error = poly_eval(correct_input) - correct_output
>>> float(error / output_spacing)
-0.0004519444865033115
```
As an aside, a method to find a valid parameterization is described in a technical report of Manocha and Canny. (Other related reports are also useful: citation 1, citation 2, citation 3.) The technical report also references a paper of Sederberg.
E.g., with curves this leads to a larger-than-expected loss of accuracy when computing intersections. As a (somewhat contrived) example encountered in the wild:
This comes from the bad parameterization of `curve2`:

which should instead be:

Same root, but non-"bad" parameterization.
However, moving the y-values so that `curve2` is no longer badly parameterized doesn't fix things:

NOTE: `curve3` lies on the algebraic curve given by x = (256 y^2 + 480 y + 929) / 2048, so the quadratic parameterization is appropriate.

**Newton's Method**
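This containment is easy to verify symbolically, assuming the parameterizations given later in the issue, x3(t) = (t^2 + 60 t + 1076) / 512 and y3(t) = (2 t + 45) / 16:

```python
import sympy

t = sympy.Symbol("t")
x3 = (t**2 + 60 * t + 1076) / sympy.Integer(512)
y3 = (2 * t + 45) / sympy.Integer(16)

# x = (256 y^2 + 480 y + 929) / 2048 holds identically in t.
residual = x3 - (256 * y3**2 + 480 * y3 + 929) / 2048
assert sympy.expand(residual) == 0
```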
What happens if we keep going with Newton's method on the "bad" intersection:

Newton's method terminates because there is a fairly large band where B1(s) = B2(t) numerically (with parameterizations):

This is due to the relative size mismatch in the coefficients of t^2 + 60 t + 1076. Centering a band of 401 ULPs around `expected_t` (200 on either side), we find that >= 135 of them evaluate `x2(t)` to exactly 2.125 using 5 different methods. Using the Bernstein basis and the de Casteljau algorithm, 196 of them evaluate to 2.125. Using Horner's method, an entire contiguous band of 68 ULPs on either side of `expected_t` evaluates to exactly 2.125.
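The Horner count can be spot-checked with a small sweep (my sketch, with `expected_t` as defined above; `math.ulp` needs Python 3.9+):

```python
import math

expected_t = float.fromhex("0x1.983e62b67adeep-3")
ulp = math.ulp(expected_t)

def horner(t):
    # Horner evaluation of (t^2 + 60 t + 1076) / 512 in float64.
    return ((t + 60.0) * t + 1076.0) / 512.0

# Sweep the 401-ULP band and count values that collapse to 2.125.
hits = [d for d in range(-200, 201)
        if horner(expected_t + d * ulp) == 2.125]
assert 0 in hits
assert len(hits) >= 135  # at least the exact-rounding band
```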
In particular:
And it is the same for the non-"bad" one:
Since `x3(t) = x2(t)` and `y3(t) = (2 t + 45) / 16 = y1(t)`, once `s = t` we are subject to the same issues as in the intersection of `curve1` and `curve2`.

**"Exact" values**
However, this may just be an issue with polynomials? Or with Newton's method? Computing the roots of s^2 + 60 s - 12 via the quadratic formula even yields a decent amount of error (unless cancellation is avoided):

**Algebraic**
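As a sketch of the quadratic-formula point just above (the names `naive`/`stable` are mine, not from the issue): the small root of s^2 + 60 s - 12 is -30 + sqrt(912), which subtracts two nearly equal numbers and cancels about 7 leading bits; multiplying through by the conjugate gives the cancellation-free form 12 / (30 + sqrt(912)).

```python
import math

expected_s = float.fromhex("0x1.983e62b67adeep-3")

# Naive quadratic formula: the subtraction cancels leading bits,
# amplifying the rounding error already present in sqrt(912).
naive = -30.0 + math.sqrt(912.0)

# Rationalized form avoids the cancellation entirely.
stable = 12.0 / (30.0 + math.sqrt(912.0))

err_naive = abs(naive - expected_s) / math.ulp(expected_s)
err_stable = abs(stable - expected_s) / math.ulp(expected_s)
assert err_stable <= err_naive  # stable form is at least as accurate
assert err_stable <= 2.0        # and lands within a couple of ULPs
```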
Though the algebraic approach doesn't (necessarily) suffer from the same issue: