Open Alistair-Johnson opened 5 years ago
I disagree with this personally.
Float, Double, BigDecimal and other types are approximation types. Virtually everyone understands that they cannot be accurate representations of real numbers.
That said, they do fairly well. So, it isn’t that BigDecimal isn’t at all commutative: it is very nearly commutative up to the accuracy limits of the type.
We could define a weaker Eq instance for these types which would declare such values equivalent and the laws would pass, and it wouldn’t be inappropriate.
In many ways Eq is more problematic. Unless you are using these types as constant tokens, you very frequently won’t want to use exact equality.
Cc @non
@johnynek Sure I get those points....but it is "very nearly...", "almost" , "in the real world" would be another.
... but we do get failed tests. That's why i suggested a move to alleycats, as that's really why it exists.
I don't want to start anything "Big" here (yes, pun intended), I had thought of bringing this up before, when preparing for a talk on Laws in Berlin. But I decided to just leave it...until the broken test came up the other day.
In my talk I did indeed giver the example of an AlmostEq
instance, that I personally would (have) use: but it just feels wrong to have this in cats
How about we comment it out like we're already doing for double and float. That should be a backwards compatible hot fix. We can think about what to do long term afterwards :)
@LukaJCB This isn't, IMHO, super-urgent - nor do have any super-strong views on the matter. So yes, we could comment out the test but..,
As there is currently time and (hopefully) an abundance of goodwill, might be worth discussing the long term a bit more, first? Better might be to bundle this up with the more general cats/algebra discussion anyway
One last comment for now... by giving an AlmostEq
instance is both useful and at the same time illustrative that it's "not quite right but likely good enough".
Which is then a perfect opportunity to say something like "SO USE SPIRE", and getting more folk to do that would be even better! So... perhaps an opportunity for cross-selling?
Spire has these instances you are trying to push into alleycats:
https://github.com/non/spire/blob/master/core/src/main/scala/spire/std/double.scala#L11
yes, you could use computable real at great cost to performance. Is that what you meant?
I think it is more interesting going the other way: e.g. TPUs are all about being less accurate and going even faster: https://cloud.google.com/tpu/
or facebook's recent work on less expensive floating point: https://code.fb.com/ai-research/floating-point-math/
I personally have used Rational
with great success, mainly as a "drop in" for testing along with a AlmostEq[Double]
, and then using Double
in production. There's no doubting they are faster, are used in DB's and so on
Actually, all I'm suggesting is that we simply move the current instance to alleycats.
AlmostEq
is going to be difficult to define in a law-abiding way, because the loss of precision depends on the chain of computations.
No hard feeling about whether those instances should be in cats-kernel
or alleycats
; but those instances have a place -- without them, it is not possible to write generic code that works on exact and approximate number types.
Concerning binary compatibility, Spire defines its own additive group and multiplicative monoid instances for the number types, so it will not be affected.
Spire has also the same problem when using limited range primitive types such as Int
or Long
; we decided to consider them as law-abiding when the result of an operation is within the range, and having undefined behavior otherwise. Spire has ugly machinery to test laws for those types. I wonder if a similar construction could be done to test floating-point/decimal numbers (interval arithmetic?).
Found it... there is FPApprox
in algebra
Interesting the use in algebra - uses both a restrictive range and approximation on the full range...but scala.js turned of...oops
but those instances have a place -- without them,
Absolutely! So, as another example option, could be that we add something like FPEqual
to cats-laws or alleycats-laws, both for cats tests and external use. Or...??
So I have a suggestion:
I have added a class to the tests in KernalLaws, so is private for now. It could be made public. The tests are then modified so Float, Double, BigDecimal
all pass. Personally, I'm Ok with the the instances in kernel
, but TBH, they are only useful with some form of approximate equals. So we should supply something ...
What do you think, in principle? The details can be sorted later if agreed, it makes sense to agree the approach first
/**
* The mathematical definition of equivalence excludes
* 'the relation "is approximately equal to" between real numbers'
* https://en.wikipedia.org/wiki/Equivalence_relation#Relations_that_are_not_equivalences
*
* But in science and engineering, we have to deal with quantities that are not
* exact, but come with some margin of error, such as "98.2 plus or minus 0.6".
*
* Moreover, the engineering units often used to store such values are themselves not exact,
* such as IEEE numbers. This class allows us to compare such in-exact quantities
*/
case class Tolerance[@sp A](val epsilon: A)(implicit A: Numeric[A]) extends Eq[A] {
def eqv(x: A, y: A): Boolean = (x == y) || A.lt(A.abs(A.minus(x, y)), epsilon)
}
// In LawTests..
implicit val floatEq = Tolerance[Float](10f)
checkAll("CommutativeGroup[Float]", CommutativeGroupTests[Float].commutativeGroup)
How would you choose the "epsilon"? I have the impression that it would depend on the range returned by the Arbitrary[A]
in the scope.
@johnynek do you know how the reduced precision types you mentioned are validated? Maybe there is prior art we could use?
How would you choose the "epsilon"? I have the impression that it would depend on the range returned by the Arbitrary[A] in the scope.
Quite correct. I had to cut down the range to make this sensible in the tests:
implicit val arbFloat: Arbitrary[Float] = Arbitrary(arbitrary[Short].map(_.toFloat/3))
Really, as the simple but common implementation works, it is down to the user to supply a meaningful value based on usage. I've used such for maintaining an online average, so you are just combining values that are very close - so epsilon is very small. In the tests, random numbers that are huge...also have ridiculously huge epsilon, to the point they are meaningless.
So the goal here is to find something usable, but simple. I think the "flash" solution could come later if needed, and may be better in spire or algebra, I just don't know right now.
I like the idea of testing number types up to a certain precision, but I dislike the idea of making Eq
violate one of his laws: with the definition above it is no longer transitive.
It seems we are making Eq
lawless to force CommutativeGroup[Double]
to be law-abiding.
Well sure. But currently none of the Double
instances are law abiding. And that new Tolerance
is almost lawful - it passes the scalacheck tests even though we know they occasionally will fail. But if epsilon is set to zero, then it is lawful 😸
At the end of the day, perhaps approximate types just aren't lawful, so we have to pass on that responsibility to the user to ensure lawfulness within their context.
Really, that's why I originally suggested alleycats. But we can't do that now anyway.
FWIW this is still an issue, and tests are still occasionally failing because of this.
FWIW, this error now happens more often on 2.13-RC1
Yes...This will be a big issue for binary compatibility, so now that is out of the way...
Currently there is an instance for
CommutativeGroup[BigDecimal]
in kernel.We know that
Double
andFloat
are not associative.As
BigDecimal
is also not associative, even if it is more almost so thanDouble
andFloat
, the suggestion is that we deprecate this instance, and move it toAlleycats
.A recent travis-ci build from @barambani proved this:
This will also impact algebra, so /CC @denisrosset @johnynek, who could, of course, choose to use the proposed
alleycats
instance instead.