Open kryptan opened 6 years ago
So according to what I could find about the definition of total order provided by IEEE 754, part of it would require that negative zero and positive zero compare differently from the default implementation. In other words, -0.0 < +0.0
returns false by default, but for total order it would have to be true. It also describes how different representations of the same floating point number would be ordered, and how NaNs with different payloads would compare with each other.
This looks like a big problem because Rust's documentation for Ord
and PartialOrd
explicitly say:
Implementations of
PartialEq
,PartialOrd
, andOrd
must agree with each other.
If IEEE-conformant total order is implemented using Ord
, then it would disagree with the implementation of PartialOrd
for f32
and f64
, likely violating assumptions made by the standard library. I'm not sure what the exact consequences of that would be, but they don't sound good.
The only way I can think of to avoid this would be to have a separate trait specifically for floating-point total order.
(Btw, this is all based on what I could find from the section I linked above. If I misunderstood something or got something wrong, feel free to correct me.)
If IEEE-conformant total order is implemented using Ord, then it would disagree with the implementation of PartialOrd for f32 and f64
If we would implement Ord
for f32
and f64
in standard library then yes, this would be a problem and PartialOrd
should be compatible with Ord
. But this library instead provides wrappers for f32
and f64
, implementation of Ord
and PartialOrd
for these wrappers can agree with each other regardless of what standard library does for f32
and f64
.
This is definition of total order from the standard:
5.10 Details of totalOrder predicate
For each supported arithmetic format, an implementation shall provide the following predicate that defines an ordering among all operands in a particular format:
boolean totalOrder(source, source)
totalOrder(x, y) imposes a total ordering on canonical members of the format of x and y:
a) If x<y, totalOrder(x, y) is true. b) If x>y, totalOrder(x, y) is false. c) If x=y: 1) totalOrder(−0, +0) is true. 2) totalOrder(+0, −0) is false. 3) If x and y represent the same floating-point datum: i) If x and y have negative sign, totalOrder(x, y) is true if and only if the exponent of x≥ the exponent of y ii) otherwise totalOrder(x, y) is true if and only if the exponent of x≤ the exponent of y.
d) If x and y are unordered numerically because x or y is NaN: 1) totalOrder(−NaN, y) is true where -NaN represents a NaN with negative sign bit and y is a floating-point number. 2) totalOrder(x, +NaN) is true where +NaN represents a NaN with positive sign bit and x is a floating-point number. 3) If x and y are both NaNs, then totalOrder reflects a total ordering based on: i) negative sign orders below positive sign ii) signaling orders below quiet for +NaN, reverse for -NaN iii) lesser payload, when regarded as an integer, orders below greater payload for +NaN,reverse for -NaN.
Neither signaling NaNs nor quiet NaNs signal an exception. For canonical x and y, totalOrder(x, y) and totalOrder(y, x) are both true if x and y are bitwise identical.
NOTE—totalOrder does not impose a total ordering on all encodings in a format. In particular, it does not distinguish among different encodings of the same floating-point representation, as when one or both encodings are non-canonical.
@kryptan because people (myself included) still expect these wrappers to behave like a number. The aim is to be as close as possible to how ordering works for the native float types, whilst still upholding the guarantees of Ord. If negative zero compared differently to positive zero it would be useless for me and many other people.
It may make sense to have another type that implements the IEEE-defined ordering, but I suspect it will see less usage...
Hi there!
I wanted to revive this topic. As I understand it, the main argument here is that Ord
and PartialOrd
may disagree in the case of following IEEE.
But I see a more important disagreement here between Eq
and Div
:
// they are equal, but why?
assert_eq!(OrderedFloat(f64::NAN), OrderedFloat(f64::NAN));
// but division returns NaN (and that's correct!), but if they were truly equal it should result in 1.0
assert_eq!(
OrderedFloat(f64::NAN) / OrderedFloat(f64::NAN),
OrderedFloat(f64::NAN)
);
I don't think it's really a good idea to deviate from the standard here 🤔
Moreover, in hash sets they will be represented as single identical element, and that’s wrong. Such expectations may lead to many hidden issues
I don't think it's really a good idea to deviate from the standard here
Right now, it looks like the question is "Is it possible to comply with one standard (IEEE) without breaking compliance with another standard (the invariants the Rust standard library and ecosystem were promised) and, if not, which standard has worse consequences for breakage?"
But current implementation breaks mathematical rules and contradicts rust standard library (Div
impl, as I mentioned above) 🤔
Then it's a question of whether the two invariants in different parts of the standard library can be reconciled and, if not, which invariant is worse to break.
In general, I think it can be interpreted in different ways:
Implementations of PartialEq, PartialOrd, and Ord must agree with each other.
https://github.com/rust-lang/rust/issues/57157 (see follow-up PR there)
That is,
a.cmp(b) == Ordering::Equal
if and only ifa == b
andSome(a.cmp(b)) == a.partial_cmp(b)
for alla
andb
.
However it looks like these mentions have been removed from documentation of Ord
.
UPD:
Ok, current documentation of PartialOrd
is pretty clear actually (updated here):
The methods of this trait must be consistent with each other and with those of PartialEq. The following conditions must hold:
- a == b if and only if partial_cmp(a, b) == Some(Equal).
- a < b if and only if partial_cmp(a, b) == Some(Less)
- a > b if and only if partial_cmp(a, b) == Some(Greater)
- a <= b if and only if a < b || a == b
- a >= b if and only if a > b || a == b
- a != b if and only if !(a == b). Conditions 2–5 above are ensured by the default implementation. Condition 6 is already ensured by PartialEq.
but division returns NaN (and that's correct!), but if they were truly equal it should result in 1.0
There is no such invariant in floating point arithmetic or in Rust’s Div
trait. For example, IEEE 754 requires that INFINITY
is equal to itself, but also INFINITY / INFINITY
is not 1.0.
Also, 0 == 0, but 0 / 0 is not 1.
For example, IEEE 754 requires that INFINITY is equal to itself, but also INFINITY / INFINITY is not 1.0. Also, 0 == 0, but 0 / 0 is not 1.
Hm, right, that's correct 🤔
But these ones follows the standard at least (and division results generally based on math, there is an explanation for that). While NaN == NaN
is very confusing part and breaks the standard and their nature in general.
In fact, NaNs are introduced by IEEE 754 (at least their wide usage). And from math perspective it's undefined (which can't be equal to undefined). However, according to this argument, infinity should not be equal to infinity either, but it is in IEEE 😄
But these ones follows the standard at least (and division results generally based on math, there is an explanation for that). While
NaN == NaN
is very confusing part and breaks the standard and their nature in general.
The standard IEEE 754 totalOrder
predicate (which this issue is about) also includes cases where two NaN values have equal ordering. Note that the standard totalOrder
predicate is not the same as the standard comparison predicates. That is necessary both for this crate and for IEEE, because the standard comparison predicates do not form a total order. I’m not sure exactly what you are proposing to solve this problem, since neither this crate’s ordering nor the IEEE standard ordering fulfill your requirements.
Since various requirements (standard comparisons, standard ordering, cmp::Eq
invariants, etc.) are contradictory, it is impossible to offer a single type that meets all of them. We need multiple types, so users can choose an appropriate one for each use case:
f64
and f32
types directly. totalOrder
behavior can use (for example) the total_cmp
methods from libcore.OrderedFloat
type offered by this crate.NotNan
type offered by this crate.First of all thanks for detailed answers!
The standard IEEE 754 totalOrder predicate (which this issue is about) also includes cases where two NaN values have equal ordering.
Could you point me to this part, please?
Note that the standard totalOrder predicate is not the same as the standard comparison predicates. That is necessary both for this crate and for IEEE, because the standard comparison predicates do not form a total order. I’m not sure exactly what you are proposing to solve this problem, since neither this crate’s ordering nor the IEEE standard ordering fulfill your requirements.
Actually, I'm referring to the comparison behavior. Because as I mentioned earlier, Eq
impl of this crate returns true
for two NaN
s.
And this is not something expected to me. At least, Comparison with NaN claims:
But when at least one operand of a comparison is NaN, this trichotomy does not apply, and a fourth relation is needed: unordered. In particular, two NaN values compare as unordered, not as equal.
And:
The IEEE floating-point standard requires that NaN ≠ NaN hold.
Thus, I believe IEEE fulfill my expectations.
But these are good points actually:
- Users who need standard IEEE 754 comparison behavior should use the primitive f64 and f32 types directly.
- Users who need IEEE 754 totalOrder behavior can use (for example) the total_cmp methods from libcore.
And due to the contradictions, this is probably the only way:
it is impossible to offer a single type that meets all of them. We need multiple types
The standard IEEE 754 totalOrder predicate (which this issue is about) also includes cases where two NaN values have equal ordering.
Could you point me to this part, please?
IEEE 754-2019 §5.10 states:
For canonical x and y, totalOrder(x, y) and totalOrder(y, x) are both true if x and y are bitwise identical.
In fact, OrderedFloat
documentation claims it clearly:
NaN is sorted as greater than all other values and equal to itself, in contradiction with the IEEE standard.
So, I think there is no any issue here. As it's expected and explicitly stated.
It was just not entirely clear why this was so. But these contradictions explain the chosen behavior.
Thank you for quick and detailed replies!
Btw, I'm not sure there would be a contradiction if OrderedFloat(f64::NAN) == OrderedFloat(f65::NAN)
would return false
, but Ord
would be Equal
.
Based on Rust's documentation (see comment above), there is no mention about transitivity. It's seems logical to apply, but might not be necessary.
a == b if and only if partial_cmp(a, b) == Some(Equal).
It doesn't mean that a == b
must be true
for all partial_cmp(a, b) == Some(Equal)
. It tells us that a
& b
can be equal only if their ordering is Equal
WDYT? 🤔
It doesn't mean that a == b must be true for all partial_cmp(a, b) == Some(Equal).
That's exactly what "if and only if" means.
That's exactly what "if and only if" means.
Oh, right, it's biconditional statement
Thanks, for some reason I totally ignored "iff"
But there is no transitivity claimed. Note, it's about PartialEq & PartialOrd. Thus, I see it like:
Btw, I think you're talking about symmetry here rather than transitivity. Symmetry is the property that a <op> b
implies b <op> a
, whereas transitivity is the property that a <op> b && b <op> c
implies a <op> c
, so whilst iff
is symmetric, transitive and reflexive (a <op> a == true
) the relevant property here is symmetry.
Yes, you're absolutely right! Accurately noted, here, of course, symmetry was implied and “iff” fully corresponds to this.
In general, I have no more questions. For now, I consider the current behavior as a feature that is explicitly stated, and I was just trying to understand it better
Thanks everybody for answers!
For
OrderedFloat
:IEEE standard requires different NaNs to be not equal, positive NaN to be greater than +∞ and negative NaN to be less than -∞. It also requires -0 to be less than +0. Why non-conformant behavior was chosen? Wouldn't it make more sense to implement total order as defined in standard instead?
This Stack Overflow answer is presumably providing a fast way to implement IEEE total order comparison.