Closed mtanneau closed 3 years ago
Hey @mtanneau , thank you for your interest in my project! You are absolutely correct that the types provided by MultiFloats.jl do not have proper IEEE Inf and NaN propagation semantics. Unfortunately, this is a rather difficult issue to fix. You see, arithmetic operations on MultiFloat values are implemented using indeterminate forms such as ((a + b) - a) - b, which would evaluate to zero in exact arithmetic. When evaluated in floating-point arithmetic with round-to-nearest mode, these expressions allow me to compute the round-off error incurred in (say) the floating-point addition a + b, so that the error can be propagated from one component of a MultiFloat to the next.
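To illustrate the idea, here is a sketch of the classic Knuth TwoSum error-free transformation in Python (whose floats are the same IEEE 754 doubles as Julia's Float64). The name two_sum is illustrative only, not MultiFloats.jl's internal API:

```python
def two_sum(a: float, b: float):
    """Knuth's TwoSum: return (s, err) such that s = fl(a + b)
    and a + b == s + err exactly, for finite inputs."""
    s = a + b
    bb = s - a                        # portion of b absorbed into s
    err = (a - (s - bb)) + (b - bb)   # exact round-off of a + b
    return s, err

# 1e-16 is below half an ulp of 1.0, so the sum rounds to 1.0
# and the entire 1e-16 is recovered in the error term
s, err = two_sum(1.0, 1e-16)
print(s, err)  # 1.0 1e-16
```

Note that every intermediate expression here, such as s - a, is exactly the kind of "evaluates to zero in exact arithmetic" form described above.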
These expressions are inherently unfriendly to Inf values, since they produce indeterminate forms like Inf - Inf and Inf / Inf all over the place. Therefore, I completely ignored the issue of correctly propagating these values, in the interest of making the common case (arithmetic with finite values) as fast as possible. If you disassemble, for example, the implementation of the + operator for Float64x4, you'll notice that it's a straight line of native Float64 operations -- totally branchless and vectorizable. If we were to insert checks for Inf and NaN in all MultiFloat arithmetic operators, I fear we would lose much of the performance uplift that MultiFloats.jl was designed to provide.
This isn't to say the situation is hopeless, though! I am certainly interested in making MultiFloats.jl compatible with as many libraries as I can. Does Tulip.jl crucially depend on Inf/NaN propagation everywhere, or only in select parts of the codebase? If the latter, it would be possible for me to provide a macro, like
@MultiFloats.IEEE begin
    c = a * a + b * b
    # ... code that depends on Inf/NaN propagation
end
# ... code that doesn't depend on Inf/NaN propagation
where we could just take the performance hit for the bits that matter.
In any case, thanks for bringing this issue to my attention. Violation of IEEE semantics, even if intentional, is a design deficiency that should be documented loud and clear in the MultiFloats.jl README, so I'll go ahead and add a note about it.
Thanks for the answer :) TLDR: I'll handle the edge cases manually where they happen, to keep the arithmetic branch-free.
I dug into my code a bit more. As far as I can tell, my issue comes from updates of the form l += a, where l may be infinite (a is always finite). This is where NaNs appear and things break.
It's not too widespread (and not in performance-critical parts), so it's OK for me to manually handle the infinite cases.
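For instance, the manual handling could be as simple as skipping the addition when l is already infinite, sketched here in Python (in Julia this would be a one-line isinf check; guarded_add is a hypothetical helper name, not part of MultiFloats.jl or Tulip.jl):

```python
import math

def guarded_add(l: float, a: float) -> float:
    """Perform l + a, but short-circuit when l is already infinite,
    so the multi-term round-off bookkeeping never evaluates an
    Inf - Inf indeterminate form (a is assumed finite)."""
    if math.isinf(l):
        return l
    return l + a

print(guarded_add(math.inf, 3.0))   # inf
print(guarded_add(2.0, 3.0))        # 5.0
```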
(FYI, there are other parts where a division by zero might happen, but in that case the result would be multiplied by false afterwards, so the NaNs are not a problem, as long as the convention Inf * false == NaN * false == 0 holds.)
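The convention relied on here is Julia's treatment of Bool as a "strong zero" in multiplication. A Python sketch of that behavior for illustration (times_strong_zero is a hypothetical name; in Julia this is simply x * flag):

```python
import math

def times_strong_zero(x: float, flag: bool) -> float:
    """Mimic Julia's Bool multiplication, where false acts as a
    strong zero: Inf * false == NaN * false == 0.0."""
    return x * 1.0 if flag else 0.0

print(times_strong_zero(math.inf, False))  # 0.0
print(times_strong_zero(math.nan, False))  # 0.0
print(times_strong_zero(5.0, True))        # 5.0
```

Note that plain IEEE multiplication does not have this property: inf * 0.0 is nan, which is exactly why the strong-zero convention matters here.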
Excellent! I'm glad to hear that the issue could be handled on your side.
By the way, since you're using MultiFloats.jl for optimization, you'll be glad to hear that I fixed a bug today where needless destructive cancellation would occur when subtracting two MultiFloats that are very close in magnitude. I actually have my own collection of nonlinear optimization routines for use with MultiFloats.jl, and this was driving me crazy. ("I'm turning up the precision, why isn't the objective going down???")
I'll make the v1.0 release of MultiFloats.jl imminently, with this fix and a couple enhancements.
v1.0 of MultiFloats.jl is now released! Closing this issue, as I've added a "Caveats" section to the README which documents the violation of strict IEEE Inf/NaN semantics.
The convention for dividing by zero is not uniform across all levels of precision:

x      y      x/y
+Inf   +Inf   NaN
-Inf   +Inf

Is it possible to have Float64x2, etc. follow the same convention as Float64? I have parts of my code in Tulip.jl that rely on this convention.