sorawee opened this issue 5 years ago
I think that silently throwing away information about inexactness should never be the default behavior. If contamination has occurred, then that is something that needs to be dealt with explicitly. Unsignaled loss of precision (i.e. implicitly converting an inexact number to an exact number) can be extremely dangerous and, IIRC, was one of the points of discussion during the original development of the numerical section of the Scheme standard. I'm not saying that the suggestion on the table is a bad idea, but the consequences of such a change would need to be very carefully considered across a wide range of use cases.
EDIT: I see I missed a slight subtlety, however, let me give a more annoying example.
What should (min 1.0 1) return?
I think (min 1.0 1) and (min 1 1.0) should both return 1.0, an inexact number. It should go like this:
(min x y)
case 1, x < y: return x
case 2, y < x: return y
case 3, x = y or incomparable: "contaminate" with inexactness, as the current version does
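A minimal two-argument sketch of this proposal (the name min* is hypothetical, and nan? from racket/math is assumed for the incomparable case):

```racket
#lang racket
(require racket/math) ; for nan?

;; Hypothetical sketch of the proposed min: a strictly smaller
;; argument is returned unchanged, and inexactness "contaminates"
;; the result only when the arguments are numerically equal.
(define (min* x y)
  (cond
    [(or (nan? x) (nan? y)) +nan.0]  ; incomparable: stay inexact
    [(< x y) x]
    [(< y x) y]
    [else (if (or (inexact? x) (inexact? y))
              (exact->inexact x)
              x)]))

(min* 1.0 1)    ; ==> 1.0
(min* 1 1.0)    ; ==> 1.0 (commutative)
(min* +inf.0 3) ; ==> 3  (the exact result survives)
```

Note that this stays commutative because contamination happens only in the symmetric "equal or incomparable" case.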
The reason I think this is better than (argmin identity xs) and (argmax identity xs) is that argmin and argmax depend on order. We should make sure min and max are still commutative: the expression (equal? (min x y) (min y x)) should always be true.
> (argmin identity (list 1.0 1))
1.0
> (argmin identity (list 1 1.0)) ; this use of argmin depends on order
1
> (min 1.0 1)
1.0
> (min 1 1.0) ; the min function should not depend on order
1.0
Is it possible to add +inf.0 and -inf.0 as exceptions to the coercion rule, so that:
(min 3 +inf.0) ; ==> 3
(min 3 -inf.0) ; ==> -inf.0
(min 3 999999.999) ; ==> 3.0
(min 3 1000000) ; ==> 3
(min 3 (expt 10. 300)) ; ==> 3.0
(min 3 (expt 10. 310)) ; ==> 3 ; weird
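A sketch of this exception rule for the two-argument case (the name min/inf is hypothetical): an inexact +inf.0 that loses the comparison is simply discarded rather than contaminating the result, while finite inexact numbers still coerce as today.

```racket
#lang racket
;; Hypothetical sketch: like min, but a losing +inf.0 argument does
;; not coerce the exact result. Finite inexact arguments still do.
(define (min/inf x y)
  (cond
    [(eqv? x +inf.0) y]
    [(eqv? y +inf.0) x]
    [else (min x y)]))

(min/inf 3 +inf.0)          ; ==> 3
(min/inf 3 -inf.0)          ; ==> -inf.0
(min/inf 3 999999.999)      ; ==> 3.0
(min/inf 3 (expt 10. 310))  ; ==> 3, the "weird" case above
```

The last result is "weird" precisely because (expt 10. 310) overflows to +inf.0, so the exception kicks in even though the caller wrote a finite-looking float.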
Another interesting case to remember:
(min 3 +nan.0) ; ==> +nan.0
Also
(min 0.0 -0.0) ; ==> -0.0
(min -0.0 0.0) ; ==> 0.0 ; weird
An alternative is to use #f instead of +inf.0 and define
(define (min? x y)
  (and x y (min x y)))
Getting the number +inf.0 as a result means there is "a number larger than any representable floating-point number". So +inf.0 needs to be inexact, and therefore the current convention of min/max is doing the right thing.
As I understand Sorawee, an exact version of infinity, say -inf, would work as the initial value when finding the maximum of a list of exact numbers. However, changing the number system to include exact versions of infinity seems like a bad idea: unless Chez Scheme changes too, it will be difficult (if not impossible) to implement efficiently.
A simpler, backwards-compatible solution is to introduce a second set of min/max functions, say exact-max and exact-min, that work on exact numbers and interpret ±inf.0 as sentinels. This will not slow down any existing programs using min/max.
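A sketch of what such a function could look like (the name exact-min and its sentinel treatment are this proposal's, not an existing Racket binding):

```racket
#lang racket
;; Hypothetical exact-min: works on exact numbers and treats +inf.0
;; purely as an identity/sentinel, so it never contaminates the result.
(define (exact-min . xs)
  (for/fold ([m +inf.0]) ([x (in-list xs)])
    (cond
      [(eqv? m +inf.0) x]  ; any real candidate replaces the sentinel
      [(eqv? x +inf.0) m]  ; the sentinel never wins
      [(< x m) x]
      [else m])))

(exact-min +inf.0 3 1) ; ==> 1, still exact
(exact-min)            ; ==> +inf.0, the identity
```

Since this is a separate function, existing callers of min pay no cost.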
Another backwards-compatible solution is (copying Gus's idea) to choose two values, say #f and #t, to mean -infinity and +infinity respectively, and extend min and max to handle them. A choice must be made about the value of (min #f -inf.0). If #f is interpreted as "exact negative infinity" and -inf.0 as "a very large negative number", then (min #f -inf.0) = #f makes sense. This solution will slow down existing programs.
I prefer
(min #f 1 2 3) ; ==> 1
(max #f 1 2 3) ; ==> 3
i.e. #f means "please ignore me". But I understand that in some cases it would be better to have the option to use #t and #f as +exact-infinity and -exact-infinity.
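Under the "#f means 'please ignore me'" reading, a minimal sketch (the names min/ignore and max/ignore are hypothetical) just drops #f before delegating to the ordinary functions:

```racket
#lang racket
;; Hypothetical variants where #f arguments are simply ignored.
;; Note that at least one real number must remain after filtering,
;; since min/max require at least one argument.
(define (min/ignore . xs) (apply min (filter values xs)))
(define (max/ignore . xs) (apply max (filter values xs)))

(min/ignore #f 1 2 3) ; ==> 1
(max/ignore #f 1 2 3) ; ==> 3
```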
Use of #t and #f in max and min is currently (correctly, imo) a contract violation. Implicit type conversion is already at issue here, and many people coming from other languages might expect the mapping #f -> 0, #t -> 1. There is also the nasty issue that the return type of max and min would depend on the exact values they receive, which is almost certainly undesirable behavior, since it would make the functions completely impossible to reason about.
The current behavior is arguably correct in its own way. Suppose we were computing (max 0.999 999999/1000000). The greater of these two numbers would be 999999/1000000 if we merely tested them with <, but in the bigger picture, it's quite likely that an inexact number like 0.999 is really supposed to represent 1, which would make it the greater number after all. Likewise, in a case like (min +inf.0 3), it's correct to return 3.0 rather than 3, because it's always possible this particular +inf.0 was just a really poorly rounded 2.
In the bigger picture, the best choice of identity element for min and max would be tailored to the type they're being used with. For instance, if people use max on exact nonnegative integers, then they'll expect its identity element to be 0, because that's the one exact nonnegative integer that's less than all the others.
Racket's numeric tower isn't conducive to talking about "all the others", because it's already extended with so many different things. If it's never extended again, then -inf.0 is a pretty clear choice of identity element for max, because it's the one non-NaN real number that's less than all the others. But it's all too easy to imagine that someday Racket might support an exact negative infinity, or even user-defined notions of negative infinity, at which point the choice of -inf.0 would be regrettable.
So users who do want to deal with universal properties like "the one number that's less than all the others" will really have a better time if they use types that are less extensible than Racket's "number" type. Once they're willing to take that approach, it's not hard to define specialized functions like exact-nonnegative-integer-max, where the identity is 0, and extended-exact-rational-max, where the identity is some user-defined representation of exact negative infinity.
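For illustration, minimal sketches of those two specialized folds (both function names come from the comment above, but these bodies and the 'neg-inf sentinel are hypothetical):

```racket
#lang racket
;; Identity 0: the one exact nonnegative integer below all the others.
(define (exact-nonnegative-integer-max . xs)
  (for/fold ([m 0]) ([x (in-list xs)])
    (if (> x m) x m)))

;; 'neg-inf is a user-defined stand-in for exact negative infinity.
(define (extended-exact-rational-max . xs)
  (for/fold ([m 'neg-inf]) ([x (in-list xs)])
    (cond
      [(eq? m 'neg-inf) x]  ; any rational beats the identity
      [(eq? x 'neg-inf) m]  ; the identity never wins
      [(> x m) x]
      [else m])))

(exact-nonnegative-integer-max)              ; ==> 0
(exact-nonnegative-integer-max 5 9 2)        ; ==> 9
(extended-exact-rational-max 'neg-inf -5 -7) ; ==> -5
```

Because each fold is restricted to one type, its identity element is unambiguous and the exact/inexact contamination question never arises.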
Just a side comment. Currently Racket also gives:
> (max 0 -0.0 )
-0.0
> (max 0 +0.0)
0.0
Even with float propagation, the first result in particular is slightly disturbing:
> (/ 1. (max 0 -0.))
-inf.0
@rocketnia
> it's quite likely that an inexact number like 0.999 is really supposed to represent 1
Comparisons should not be based on guesses but on rules. Rules make computation predictable, predictability reduces surprise and bugs.
If the first number ought to be 1. because some calculation has numerical errors, then the issue is with the upstream calculation, not with max
.
Floating-point numbers are a certain system of rules for guessing. Someone who doesn't consider any numerical error acceptable for their use case shouldn't use floats, and someone who does might find 0.999 an acceptable approximation of 1. Numerical error might be tricky to manage, but it isn't in itself a bug.
Floating-point numbers unfortunately carry no information about the amount of precision they have. When max receives 1.0 and 2, all it knows about the precision of 1.0 is that it's not dealing with something imprecise to the point of degeneracy (+nan.0), and it's not necessarily dealing with something that has infinite precision either (exact 1).
Maybe the caller knows that the 1.0 they're passing in is precise to within a tolerance of |e| <= 1/10, and thus that an exact result of 2 would make sense in this context, but the caller doesn't do anything to inform max of this. Instead, for all max knows, the tolerance might be |e| <= 100, in which case its result of 2.0 is only slightly more precise, to within |e| <= 99. That's still some amount of imprecision, so an exact result of 2 isn't justified.
@Metaxal: I just tried with the float-float case, and I got
(min 0.0 -0.0) ; ==> -0.0
(min -0.0 0.0) ; ==> 0.0
I expected min and max to be commutative.
@sorawee: Your initial use case just bit me. I needed something like
(define x (something))
(display (+ x 1))
(display (* x 2))
(display (max x 3))
and it was very handy to use +inf.0 and -inf.0 instead of #f and #t. My solution was to write my own versions of min and max that behave like what you suggested.
(max 1.0 2) results in 2.0. I propose that it should return 2.
For me, I mostly use max and min when I write dynamic programming algorithms. The initial value in these algorithms is usually +inf.0 or -inf.0. But then (min +inf.0 3) results in 3.0, which is a real hassle. It also causes check-equal? and equal? to fail.
Currently, I have been using
(define (max . xs) (argmax identity xs))
to work around this problem.