As mentioned in #119, blocks with all subnormal but nonzero numbers are not encoded correctly because of reciprocal overflow, which can be avoided using the ZFP_WITH_DAZ compile-time macro. However, there are even larger, normal numbers that cause problems for zfp due to how they are currently converted to integers here:
https://github.com/LLNL/zfp/blob/c1845815ef1068b578b32823fb22310bd0b62b56/src/template/encodef.c#L42-L59
When the largest (in magnitude) normal value in a block is strictly smaller than 2-98 ≈ 3.2e-30 for floats or 2-962 ≈ 2.6e-290 for doubles, overflow in the computation of s occurs (even if ZFP_WITH_DAZ is enabled). While very small, both of these numbers are well above the smallest normal numbers FLT_MIN = 2-126 and DBL_MIN = 2-1022, respectively.
While less performant, a potential solution is to make use of division by 1 / s instead of multiplication by s for numbers in this range by computing 1 / s via ldexp directly, which is guaranteed not to underflow. Note that s (and 1 / s) is always an integer power of two, so the same result is obtained whether division or multiplication is used (when no overflow occurs). This solution is more general than ZFP_WITH_DAZ, as it also correctly handles all-subnormals.
As mentioned in #119, blocks with all subnormal but nonzero numbers are not encoded correctly because of reciprocal overflow, which can be avoided using the
ZFP_WITH_DAZ
compile-time macro. However, there are even larger, normal numbers that cause problems for zfp due to how they are currently converted to integers here: https://github.com/LLNL/zfp/blob/c1845815ef1068b578b32823fb22310bd0b62b56/src/template/encodef.c#L42-L59 When the largest (in magnitude) normal value in a block is strictly smaller than 2-98 ≈ 3.2e-30 for floats or 2-962 ≈ 2.6e-290 for doubles, overflow in the computation of s occurs (even ifZFP_WITH_DAZ
is enabled). While very small, both of these numbers are well above the smallest normal numbersFLT_MIN
= 2-126 andDBL_MIN
= 2-1022, respectively.While less performant, a potential solution is to make use of division by 1 / s instead of multiplication by s for numbers in this range by computing 1 / s via
ldexp
directly, which is guaranteed not to underflow. Note that s (and 1 / s) is always an integer power of two, so the same result is obtained whether division or multiplication is used (when no overflow occurs). This solution is more general thanZFP_WITH_DAZ
, as it also correctly handles all-subnormals.