MikeMcl / decimal.js

An arbitrary-precision Decimal type for JavaScript
http://mikemcl.github.io/decimal.js
MIT License

Compute acos with less cancellation near x=1 #217

Open AldenMB opened 1 year ago

AldenMB commented 1 year ago

Previously decimal.js used the formula acos(x) = pi/2 - asin(x), which has significant cancellation near x=1. I propose that instead we use the formula acos(x)=2*atan(sqrt( (1-x)/(1+x) )), which has less cancellation.
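For reference, this is what the proposed identity looks like when written out against decimal.js's public API. It is only an illustration of the formula, not the patch itself: the PR changes the internal inverseCosine implementation, which also adds guard digits, so a direct evaluation like this can still differ in the last digit.

```js
// Sketch of the proposed identity via the public API:
//   acos(x) = 2 * atan( sqrt( (1 - x) / (1 + x) ) )
const Decimal = require('decimal.js');

function acosViaAtan(x) {
  x = new Decimal(x);
  const one = new Decimal(1);
  return one.minus(x).div(one.plus(x)).sqrt().atan().times(2);
}

Decimal.set({ precision: 25 });
console.log(acosViaAtan('0.9999999999999999995').toString());
console.log(Decimal.acos('0.9999999999999999995').toString()); // current implementation, for comparison
```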

In addition, I provide two test cases where the old formula gives an incorrect digit, and where the new formula gives the correct answer.

I found these test cases using the Python tool Hypothesis, comparing against the Python library mpmath. I independently verified the test cases against Mathematica. I have also included the code used to generate those cases, in /test/hypothesis. This may be useful in the future to generate more test cases.

MikeMcl commented 1 year ago

Interesting. I'll take a look.

I wonder if the new formula is faster or slower?

I wonder if there are any instances where the old formula is correct and the new formula is off by one unit in the last place (as the old formula is in your two new tests)?

Can you provide any further evidence that the new formula is less prone to cancellation?

AldenMB commented 1 year ago

I will gladly respond to these questions. Here are my best answers at the moment:

I wonder if the new formula is faster or slower?

I would expect this method to be about the same, possibly slightly faster. It has an add, a subtract, a divide, a square root, an arctangent, and an integer multiplication. The old method, using arcsine, uses one more subtract and one more multiplication, since it relies on the formula asin(x) = 2*atan( x / (1+sqrt(1-x**2)) ). The asin computation also happens at a precision that has been increased twice over, since acos and asin each increase the working precision by 6 for intermediate values. The difference should be quite small, though.

Since you asked, I compared how each does on my machine (just the version of Firefox I am using, 109.0.1 I suppose). I ran each version of the code for 5 trials, doing the full set of 979 tests that pass under both versions, including those commented out in the distributed version of the module but not including the two new cases. The old version took 3147, 3148, 3176, 3188, and 3236 milliseconds. The new version took 3086, 3088, 3119, 3123, and 3145 milliseconds. Comparing the medians, we find a typical improvement of 57 milliseconds over 979 tests. That is small compared to the variation from one trial to the next, but every trial of the new version was faster than every trial of the old one, so there does appear to be a modest but consistent speed improvement.
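For what it's worth, a much cruder micro-benchmark can also be run through the public API. This is not the test-suite timing quoted above, and going through the public API adds extra rounding steps, so any numbers it prints are indicative only.

```js
// Rough micro-benchmark of the two identities, expressed via the public API.
const Decimal = require('decimal.js');
Decimal.set({ precision: 40 });

const one = new Decimal(1);
const halfPi = Decimal.acos(0); // pi/2 at the current precision
const inputs = [];
for (let i = 1; i < 1000; i++) inputs.push(new Decimal(i).div(1000));

function time(label, fn) {
  const start = Date.now();
  for (const x of inputs) fn(x);
  console.log(label, Date.now() - start, 'ms');
}

time('pi/2 - asin(x)            ', x => halfPi.minus(x.asin()));
time('2*atan(sqrt((1-x)/(1+x))) ', x => one.minus(x).div(one.plus(x)).sqrt().atan().times(2));
```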

I wonder if there are any instances where the old formula is correct and the new formula is off by one unit in the last place (as the old formula is in your two new tests)?

I am not sure. I have had Hypothesis running for a few hours searching for any cases where the new formula gives an incorrect digit, and I have yet to find one, although I have found a similar off-by-one error in the sine function, which I have yet to explain. Currently my tool only searches with one precision setting and one rounding mode, so it is reasonable to suspect there may be errors under other rounding modes. The new formula does pass all the existing tests, however, so if it introduces a regression, it is not one that the existing tests document.

Perhaps it will be fruitful to expand this tool to search for errors for other configurations of rounding mode and precision as well. This should be fairly straightforward to set up, though it will take some time.
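In the meantime, something along these lines could sweep rounding modes and precisions from the JavaScript side, using a recomputation at much higher precision as a heuristic reference instead of mpmath. It is only a self-consistency check, not an authoritative one.

```js
// Heuristic sweep over rounding modes and precisions: recompute at a much
// higher precision, round back to the target settings, and flag mismatches.
const Decimal = require('decimal.js');

const input = '0.9999999999999999995'; // example input near 1

for (let rounding = 0; rounding <= 8; rounding++) {
  for (let precision = 20; precision <= 40; precision += 5) {
    Decimal.set({ precision, rounding });
    const direct = Decimal.acos(input);

    Decimal.set({ precision: precision + 20, rounding });
    const reference = Decimal.acos(input).toSD(precision, rounding);

    if (!direct.eq(reference)) {
      console.log(`mismatch at precision=${precision} rounding=${rounding}`);
      console.log('  direct   ', direct.toString());
      console.log('  reference', reference.toString());
    }
  }
}
```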

Can you provide any further evidence that the new formula is less prone to cancellation?

The cancellation in the old method happens mostly at the last step, subtracting asin(x) from pi/2. This subtraction happens at the increased working precision of pr+6, so we can never get an answer with an absolute error much below 10**-(pr+6).

The following example illustrates why this is a problem. If x = 1-e for some small e, the Taylor approximation to cosine gives a reasonable estimate of acos(x) ≈ sqrt(2*e). For convenience, say e = 5*10**-(2k+1) for some k; to make sure x is representable we would naturally have 2k+1 <= pr. With these values, acos(x) will be about 10**-k. Assuming asin is correct, i.e. only giving rounding error at the last step, that still leaves us with a relative error in our answer of 10**-(pr+6-k). That is, we only have pr+6-k correct digits. Because 2k+1 <= pr, the number of correct digits is bounded below by 6+(pr+1)/2, but that is about it. By taking 2k+1 = pr we end up losing (pr-13)/2 digits of precision. The example given is quite adversarial, i.e. not likely to happen "in the wild", but it illustrates a pattern which applies in general near x=1.
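Instantiating that argument with the first case from the table below (pr = 25, k = 9, i.e. x = 1 - 5*10**-19):

$$
\arccos(x) \approx \sqrt{2 \cdot 5\times 10^{-19}} = 10^{-9}, \qquad
\text{absolute error of the final subtraction} \approx 10^{-(25+6)} = 10^{-31},
$$

$$
\text{relative error} \approx \frac{10^{-31}}{10^{-9}} = 10^{-22},
$$

so only about pr+6-k = 22 of the requested 25 digits can be trusted, which is consistent with the old-method entries in the table below going wrong well before the last digit.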

In contrast, the new method gets all cancellation over with early, entirely in the 1-x step. This is not even as bad as it may seem. Because we are working at slightly increased precision, we can actually represent 1-x and 1+x exactly, with no error. Note we are assuming the input is exact, which would not be valid if we were doing interval arithmetic. From then on, we get some error from doing the division, square root, and arctangent, but these errors are much more controlled, and not subject to the loss of significance from cancellation. Really, the only thing we sacrifice is a little bit of precision near x=0. This is no trouble for us, because acos(0)=pi/2 is not close to zero, so a bit of absolute error still gives small relative error.

To make things more explicit, I have added the following test cases based on the example I gave above. I will add these to the pull request momentarily.

acos(x) computed by...    | pr=25, x=0.9999999999999999995      | pr=30, x=0.99999999999999999999995 
--------------------------|-------------------------------------|-------------------------------------------
...old method (incorrect) | 0.000000001000000000000000000077    | 0.00000000001000000000000000000000023
...new method (correct)   | 0.000000001000000000000000000041667 | 0.0000000000100000000000000000000000416667
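The table was produced with the test suite, but the two computations can also be approximated from the outside as sketched below, with both identities evaluated at 6 extra working digits and then rounded back. The library's internal guard digits and intermediate rounding differ slightly, so the last digit printed may not agree with the table exactly.

```js
// Approximate reproduction of the comparison above via the public API.
const Decimal = require('decimal.js');

function compare(pr, xStr) {
  Decimal.set({ precision: pr + 6 });
  const x = new Decimal(xStr);
  const one = new Decimal(1);
  const halfPi = Decimal.acos(0); // pi/2 at the working precision

  const oldStyle = halfPi.minus(x.asin());                               // pi/2 - asin(x)
  const newStyle = one.minus(x).div(one.plus(x)).sqrt().atan().times(2); // 2*atan(sqrt((1-x)/(1+x)))

  console.log(`pr=${pr}  x=${xStr}`);
  console.log('  old-style:', oldStyle.toSD(pr).toString());
  console.log('  new-style:', newStyle.toSD(pr).toString());
}

compare(25, '0.9999999999999999995');
compare(30, '0.99999999999999999999995');
```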
AldenMB commented 1 year ago

For those interested: you can make the example given more rigorous by noting acos(1-x**2/2) = x + x**3/24 + O(x**5) and taking x=10**-k. This also explains the factor of 1/24=0.0416666 which appears in the examples. You could prove that identity by taking the derivative, using a binomial series, and integrating term by term.
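A sketch of that computation (chain rule, binomial series, then term-by-term integration, valid for small x > 0):

$$
\frac{d}{dx}\,\arccos\!\left(1-\tfrac{x^2}{2}\right)
= \frac{x}{\sqrt{1-\left(1-\tfrac{x^2}{2}\right)^{2}}}
= \frac{x}{\sqrt{x^{2}-\tfrac{x^{4}}{4}}}
= \left(1-\tfrac{x^{2}}{4}\right)^{-1/2}
= 1+\frac{x^{2}}{8}+O(x^{4}),
$$

$$
\arccos\!\left(1-\tfrac{x^2}{2}\right)
= \int_{0}^{x}\left(1+\tfrac{t^{2}}{8}+O(t^{4})\right)dt
= x+\frac{x^{3}}{24}+O(x^{5}).
$$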