shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0
661 stars 131 forks source link

FP32 `hypotf_u05` and FP64 `hypot_u05` are inaccurate near underflow #469

Open ebavier opened 1 year ago

ebavier commented 1 year ago

The current Sleef_hypotf_u05 and Sleef_hypot_u05 functions are less accurate than the advertised 0.5001 ulp for input pairs whose absolute maximum approaches the underflow threshold.

Based on random testing, the maximum ulps error follows roughly:

0.5 + 2^min(-1, 1 - ilogb(amin(max(|x|,|y|))) - bias)

that is, the maximum error climbs to 1.0 ulp exponentially as the maximum of x and y approaches the subnormals.

E.g. we can find inputs that produce errors greater than the 0.5001 ulp advertised accuracy when the maximum absolute value of the inputs is less than and 0x1p-112 for fp32:

Sleef_hypotf_u05 (0x1.701acp-113, 0x1.3e8184p-113) = 0x1.e6c5ap-113   (0.500121759 ulp)
Sleef_hypotf_u05 (0x1.6548bp-124, 0x1.39c64cp-124) = 0x1.db81ap-124   (0.749980856 ulp)
Sleef_hypotf_u05 (0x1.2bca56p-125, 0x1.8fb876p-125) = 0x1.f3a694p-125 (0.999999970 ulp)

and less than 0x1p-1008 for fp64:

Sleef_hypot_u05 (0x1.1b6637b8c3bfbp-1009, 0x1.07d30419629b6p-1009) = 0x1.833167786cbbcp-1009 (0.500121710 ulp)
Sleef_hypot_u05 (0x1.78530f71a242dp-1021, 0x1.0687b83795c9ep-1021) = 0x1.cad947cd94c98p-1021 (0.999996123 ulp)
blapie commented 1 year ago

Hello! We are in the process of changing maintainers for SLEEF, pending issues and PRs will be addressed in timely manner. Apologies for any further delay, but this issue will be addressed. See discussion here #472