flix / flix

The Flix Programming Language
https://flix.dev/
Other
2.17k stars 153 forks source link

List.memberOf returns incorrect results for NaN #3843

Open soc opened 2 years ago

soc commented 2 years ago
def main(): Unit & Impure =
     // prints `false`, should print `true`
    List.memberOf(0.0/0.0, (0.0/0.0) :: Nil) |> println

The root cause is that, despite Eq being declared a lawful typeclass with reflexivity/transitivity laws, the Eq instance for floats does not use the correct comparison operation.

$FLOAT64_EQ$/$FLOAT64_NEQ$/$FLOAT32_EQ$/$FLOAT32_NEQ$ use an operation similar or equal to the one described in IEEE754 §5.11, but for reflexivity/transitivity to hold, those need an implementation similar or equal to IEEE754 §5.10.

magnus-madsen commented 2 years ago

Agreed. This revives our discussion of whether to have PartialEq and Eq.

soc commented 2 years ago

I'd guess it's largely a solved problem these days.

I think going the Eq/Id / Equality/Identity naming route is helpful to get the mental model right for readers, i. e. no subtyping relationship between those two.

(PartialEq/Eq kinda implied such a relationship, with the result that languages got it wrong in the end (see Haskell, Rust)).

magnus-madsen commented 2 years ago

Could you expand on that or do you have a reference?

soc commented 2 years ago

Sure!

The core insight is that PartialEq/Eq usually implies some form of sub-typing relationship with the expectation that the partial versions are "law-reduced" versions of the total ones, but have compatible semantics. For floating points, this is not the case and it's intentionally designed this way.

So what I usually do is to cleanly separate a notion of equality that is domain-specific (defined by the type itself), and a notion of identity (that is solely defined by a value's bits).

Functions then need to decide which typeclass (or both) they are actually interested in to provide the semantics expected by users.

I documented this approach a while ago Part 1, Part 2, Part 3 and Part 4.

Here is another language that successfully uses this design.

magnus-madsen commented 2 years ago

Interesting. We should take a closer look.

Quick question: Which equality is then used for e.g. a Set?

soc commented 2 years ago

Which equality is then used for e.g. a Set?

Depends on the intended use-case.

In the past I provided

magnus-madsen commented 2 years ago

@mlutze @JonathanStarup We should discuss this at some point.

soc commented 2 years ago

In terms of code (excuse my Flix-like pseudo-code), the design looks somewhat like that:

pub lawful class Identity[a]
  fun id(x: a, y: a) ...
  fun nid(x: a, y: a) ...
  // laws: reflexivity, symmetry, transitivity

pub lawful class Equality[a] ...
  fun eq(x: a, y: a) ...
  fun neq(x: a, y: a) ...
  // laws: symmetry

pub lawful class Sorting[a] extends Identity[a]
  fun sortsAbove ...
  fun sortsBelow ...

pub lawful class Comparing[a] extends Equality[a]
  fun lessThan ...
  fun greaterThan ...

with instances as expected:

instance Identity[Float64] {
    pub def id(x: Float64, y: Float64): Bool = $FLOAT64_ID$(x, y)            // IEE754 §5.10
    pub override def nid(x: Float64, y: Float64): Bool = $FLOAT64_NID$(x, y) // IEE754 §5.10
}

instance Equality[Float64] {
    pub def eq(x: Float64, y: Float64): Bool = $FLOAT64_EQ$(x, y)            // IEE754 §5.11
    pub override def neq(x: Float64, y: Float64): Bool = $FLOAT64_NEQ$(x, y) // IEE754 §5.11
}

... // for most (non-float) types Identity and Equality have the same implementation

I ended up with this naming because I felt that it was more indicative of the intended use-case and less focused on the mechanics ("partial").