Bugfix inaccurate powl - Githubissues

Finii commented 9 months ago

[why] For to_number<long double>() to work we need a std::powl() that is accurate - its errors directly influence our conversion output, usually amplified. In the past we just knew about Apple clang that had a problem with std::powl() being off a bit, but now also Alpine has the same problem via musl that it uses.

[how] On parsing a string a possible number is expressed with scientific notification, for example 123.4E5 which means 123.4 * (10^5).

The mantissa is converted to a normalized value as floating point number, (1.234 in the example) and the exponent is corrected for the shift of decimal places (5 becomes 7). So the example is transformed to 1.234 * (10^7).

To calculate the exponent the call pow(10, 7) is used. Any inaccuracy in the calculation directly influences the result. For very big (or small) exponents musl is off by more than 1000 ULPs.

But in fact a much simpler function than pow() would be sufficient. The base is always 10 and the exponent is always integer. Usually pow() calculates the result with log2() and exp() under the hood. But given the constraints we can get away with a much simpler approach.

The new function pow10(integer) uses a table of precalculated results for the exponents -16 to 16. Values for the other (bigger/smaller) exponents are calculated with the right to left binary exponentiation. The table needed for the binary approach is reasonably small while giving good enough accuracy. Despite the fact that several multiplications are chained the result is usually still within 1 ULP to the correct result, only exponents that have almost all bits set (i.e. need a lot of multiplications (7 or more)) have an error of 2 ULP.

As we now have an exact enough pow() substitude the platform specific exceptions and fallback to std::strtold() (that needs allocations) can be removed.

The code assumes that

double is: IEEE 754 binary64 (11 bit exponent)
long double is:
- just the same as double or
- IEEE 754 Intel extended precision (15 bit exponent) or the same representation padded (on ARM)

If long double has exponents with not exactly 11 or 15 bits the code and/or tests can break (fail).

There is no additional correction step to get the error in ULP down; but we never guaranteed a better conversion than 3 ULP off. That still holds. With one test we need to allow 1 ULP error where before it was 0 ULPs.

(*) The assumption is that floating point literals are handled correctly; rounded to the nearest representation = 0 ULP off of the ideal result.

[note] Here some examples of reported std::powl() accuracy:

Ubuntu 23.10 (GLIBC 2.38)

powl(10, -4000) off by 0 ULP
powl(10, -2000) off by 0 ULP
powl(10, -1000) off by 0 ULP
powl(10, -500) off by 0 ULP
powl(10, -250) off by -1 ULP
powl(10, -100) off by 1 ULP
powl(10, -10) off by 0 ULP

Alipne 3.19.1 (musl 1.2.4)

powl(10, -4000) off by 26 ULP
powl(10, -2000) off by 11 ULP
powl(10, -1000) off by 5 ULP
powl(10, -500) off by 1 ULP
powl(10, -250) off by 1 ULP
powl(10, -100) off by 1 ULP
powl(10, -10) off by 0 ULP

Sonoma 14.3 on M2 (libSystem 1336.61.1)

powl(10, -250) off by 0 ULP
powl(10, -100) off by 0 ULP
powl(10, -10) off by 0 ULP

On Sonoma it seems double == long double.

Github MacOS CI machine (x86?) (libSystem 1319.0.0)

powl(10, -4000) off by more than 1000 ULP
powl(10, -2000) off by 830 ULP
powl(10, -1000) off by 396 ULP
powl(10, -500) off by 191 ULP
powl(10, -250) off by 133 ULP
powl(10, -100) off by 66 ULP
powl(10, -10) off by 6 ULP

which is unexpected as the double performance is better

pow(10, -250) off by 0 ULP
pow(10, -100) off by 0 ULP
pow(10, -10) off by 0 ULP

[note] Some useful links:

About floating point representations: https://en.wikipedia.org/wiki/Double-precision_floating-point_format https://en.wikipedia.org/wiki/Extended_precision

David M. Gay's strtod() (Expat) for example here: https://portal.ampl.com/~dmg/netlib/fp/dtoa.c

Rick Regan's Exploring Binary blog, for example https://www.exploringbinary.com/fast-path-decimal-to-floating-point-conversion/ https://www.exploringbinary.com/strtod-initial-decimal-to-floating-point-approximation/ https://www.exploringbinary.com/why-powers-of-ten-up-to-10-to-the-22-are-exact-as-doubles/

Fixes: #69

Finii commented 9 months ago

If I remember correctly other projects have for that exact reason their own limited pow()-like function that is exact (not even off by 1 ULP), for example Ryu and the more modern variants. We could in principle also use that approach, but that would make it all more complicated. I guess we do not want that kind of tables.

Finii commented 9 months ago

Configure time decisions and header only files do not mix (so easily). Took me some time to realize the problem.

Finii commented 9 months ago

Tried to address nitpicks.

Finii commented 9 months ago

Start from scratch, force push. Updated the description in the top of this page.

Tested on Ubuntu, Alpine, MacOs (M2, x86), Windows

Finii commented 9 months ago

We could make pow10 part of GUL. And maybe error_in_ulp(). Both in a later PR.

Invite to discuss @alt-graph

gul-cpp / gul14

Bugfix inaccurate powl #78