LLNL / zfp

Compressed numerical arrays that support high-speed random access
http://zfp.llnl.gov
BSD 3-Clause "New" or "Revised" License
768 stars 155 forks source link

Tiny but above subnormal numbers not handled correctly #209

Open lindstro opened 1 year ago

lindstro commented 1 year ago

As mentioned in #119, blocks with all subnormal but nonzero numbers are not encoded correctly because of reciprocal overflow, which can be avoided using the ZFP_WITH_DAZ compile-time macro. However, there are even larger, normal numbers that cause problems for zfp due to how they are currently converted to integers here: https://github.com/LLNL/zfp/blob/c1845815ef1068b578b32823fb22310bd0b62b56/src/template/encodef.c#L42-L59 When the largest (in magnitude) normal value in a block is strictly smaller than 2-98 ≈ 3.2e-30 for floats or 2-962 ≈ 2.6e-290 for doubles, overflow in the computation of s occurs (even if ZFP_WITH_DAZ is enabled). While very small, both of these numbers are well above the smallest normal numbers FLT_MIN = 2-126 and DBL_MIN = 2-1022, respectively.

While less performant, a potential solution is to make use of division by 1 / s instead of multiplication by s for numbers in this range by computing 1 / s via ldexp directly, which is guaranteed not to underflow. Note that s (and 1 / s) is always an integer power of two, so the same result is obtained whether division or multiplication is used (when no overflow occurs). This solution is more general than ZFP_WITH_DAZ, as it also correctly handles all-subnormals.