
= can be not transitive #812

Closed FrankHB closed 2 years ago

FrankHB commented 2 years ago

R7RS requires = to be transitive.

However, consider a case from this page: (= 9007199254740992.0 9007199254740993) => #t.

This is not conforming. The requirement is the same from R5RS to R7RS, and there is a note in R7RS stating "the implementation approach of converting all arguments to inexact numbers if any argument is inexact is not transitive".
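Spelled out as a three-way violation (my illustration, assuming binary64 flonums and the convert-to-inexact strategy; 9007199254740992 is 2^53, where consecutive integers start to collide in binary64):

> (= 9007199254740992 9007199254740992.0)
#t
> (= 9007199254740992.0 9007199254740993)
#t
> (= 9007199254740992 9007199254740993)
#f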

(I'm not sure whether this is intended in R6RS, since this ticket suggests inexact contagion is allowed generally.)

ashinn commented 2 years ago

Yes, that requirement dates back to at least R4RS, along with the note:

The traditional implementations of these predicates in Lisp-like languages are not transitive.

which is in fact still true today - Chez and Racket both return #t here. Gambit gets this right though.

mnieper commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

In R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

FrankHB commented 2 years ago

Yes, that requirement dates back to at least R4RS, along with the note:

The traditional implementations of these predicates in Lisp-like languages are not transitive.

which is in fact still true today - Chez and Racket both return #t here. Gambit gets this right though.

Yes, Chez and Racket both return #t, as well as some other implementations like Gauche.

I've also reported this to Chez, but there is no reply yet.

As for Racket, it seems OK because Racket does not conform to RnRS by default, and there is no similar guarantee in its documentation (its = can even accept a different number of arguments). It might be worth checking its implementations of the r5rs and r6rs modules, but I'm not sure how much conformance is actually guaranteed by those simulated environments.

OTOH, (as I've verified) Chicken 5.3.0, Guile 2.2.7 and Gambit 4.9.4 are conforming.

The divergence is not similar to the treatment of exact 0 divisors in the mentioned ticket, so there may be some implementation-strategy details I've missed.

(Interestingly, this note is not in R7RS. Has the tradition changed in the meantime?)

FrankHB commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

In R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

The caveat is about comparing inexact numbers with one another; it does not cover the case here of comparing an inexact number with an exact number.

I see the intention of the note as reflecting the precision problems of comparing small floating-point numbers. When practically dealing with such implementations of inexact numbers, it is often good to take things like the machine epsilon into account, to prevent accumulated errors over multiple steps of computation. However, this is not always the case, in particular where the floating-point number (not necessarily the result of some other computation) can be assumed to be precise because the representation is exact, e.g. floating-point zeroes (+0.0 and -0.0). I routinely turn off GCC's [Wfloat-equal] warning in such contexts for exactly this reason.
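(For illustration, the tolerance-based comparison alluded to above — a minimal sketch; the name approx=? and the choice of eps are mine:)

(define (approx=? x y eps)            ; e.g. eps = 1e-9, a relative tolerance
  (<= (abs (- x y))
      (* eps (max (abs x) (abs y)))))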

If that is not the intent, this seems strange. The requirement in R6RS (using the word "must") should be clear enough, and R7RS has even clearer wording. Even if there are general rules (if any), the specific requirement should override them, or it should simply not exist here (to prevent suspicious logical inconsistencies). But I have found no further explanation, e.g. in the errata.

Moreover, notes are not normative rules; they should not determine the criteria of conformance on their own.

mnieper commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless). In R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

The caveat is about comparing inexact numbers with one another; it does not cover the case here of comparing an inexact number with an exact number.

I don't see a meaningful difference here.

If that is not the intent, this seems strange. The requirement in R6RS (using the word "must") should be clear enough, and R7RS has even clearer wording. Even if there are general rules (if any), the specific requirement should override them, or it should simply not exist here (to prevent suspicious logical inconsistencies). But I have found no further explanation, e.g. in the errata.

Moreover, notes are not normative rules; they should not determine the criteria of conformance on their own.

The note just highlights a specific case of what is said on page 11 (R6RS) about inexact numbers.

The question of transitivity is a mathematical property like whether + is addition. The reports say that (+ z1 z2 ...) evaluates to the addition of the arguments. Addition is clearly associative, so the report implies that + is associative.

However, in practice, this is not the case for adding inexact (IEEE) numbers. So claimed mathematical properties in the reports when applied to inexact numbers must have been meant to be taken modulo inexactness limitations. There is no reason to treat transitivity differently than the addition (or associativity) property, etc. Other interpretations lead to inconsistencies.

I'm not fully satisfied with the wording in both reports, though.

ilammy commented 2 years ago

I see the intention of the note as reflecting the precision problems of comparing small floating-point numbers.

Magnitude of numbers affects the ideal, mathematical interpretation of comparison. Implementation practicalities limit significant digits of inexact numbers – that's where they are inexact.

With IEEE float64 you cannot compare 295147905179352825855.0 (2^68 − 1) exactly with integers, since its encoding can mean any integer in the [295147905179352809472, 295147905179352858624] range. They are indistinguishable, as far as IEEE is concerned, and this breaks transitivity.

Unless you're using arbitrary-precision inexact numbers and have a special category of "actually exact numbers that were read using inexact syntax", I don't think you can meaningfully achieve the behavior you seem to desire. While using something like MPFR for inexact numbers is a valid implementation approach, I doubt the reports have that in mind as a requirement.

mnieper commented 2 years ago

In fact, the sense of the implementation suggestion in R7RS saying that "converting inexact numbers to exact numbers that are the same (in the sense of =) to them will avoid this problem, ..." is not completely clear to me. Firstly, there seems to be a circular reference to =; secondly, there is, in general, more than one exact number to convert to, and = will then depend on that choice.

ilammy commented 2 years ago

It's more a question of how you interpret 1000.0 when it is read.

Is this 1000 exactly? Or is it 1000.0 with 5 significant decimal digits? So now 1000.00 and 1000.0 are different numbers? Should they be different from 1000.00000000000001 – which cannot be represented exactly with float64? How about 295147905179352858624.12321 and 295147905179352809472.85932?

I don't think an implementation can apply a blanket conversion of inexact numbers to exact, without user input, unless it can distinguish "inexact numbers that are actually exact" – which does not make much sense; those are exact numbers to begin with.

Or rather, you can approximate inexact numbers with the closest exact number to them, but this would still break transitivity of =, depending on what the inexact number implementation says about "closest".

It's all because of the confusion "non-integer numbers" = "IEEE floating-point" = "inexact" in a lot of implementations.

ashinn commented 2 years ago

Setting aside the unclear wording in the standards, it is certainly possible to implement = to always be transitive, and in fact several implementations get it right. There are a finite number of floating point numbers, just map each deterministically to a rational number (e.g. with (rationalize x ulp)).
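A minimal sketch of that strategy for real arguments (my illustration, not Chibi's actual implementation; uses nan? and infinite? from (scheme inexact)):

(define (transitive=? x y)
  (cond ((or (nan? x) (nan? y)) #f)                ; NaN equals nothing
        ((or (infinite? x) (infinite? y)) (eqv? x y)) ; only the same infinity
        (else (= (if (exact? x) x (exact x))       ; compare in exact space:
                 (if (exact? y) y (exact y))))))   ; each finite flonum denotes one rational

> (transitive=? 9007199254740992.0 9007199254740993)
#f

Since every finite flonum maps deterministically to the rational it denotes, transitivity follows from transitivity of exact rational equality.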

If I can fix this in Chibi without much performance hit (in size or speed) I will do so. Possibly speed isn't crucial since you tend not to compare inexact with exact? I'm busy lately though, and this is somewhat lower priority.

mnieper commented 2 years ago

I think the issue was not whether it is possible to implement = to be always transitive (after all, even + can be made strictly associative for inexact values), but whether it is a meaningful requirement. And whether it is even required by the wording of the standard. I have doubts about both.

ashinn commented 2 years ago

For practical applications I don't think this is a big deal. For pedagogic purposes it matters.

luke21923 commented 2 years ago

For learners, it is nice if the rules of automatic numerical conversion are consistent across arithmetic operations and arithmetic comparisons. It means that you don't need to know which arithmetic procedure is applied to find out how the arguments will be converted.

Automatically applying exact to the inexact arguments of = (when there is at least one exact argument) would break this consistency. This is what MIT Scheme does to make = transitive on mixed exact/inexact arguments, and it leads to this:

1 ]=> (< (+ 9007199254740993 0.01) 9007199254740993)

;Value: #t

(the bignum on the left is converted to a floating-point number and then converted back to a smaller bignum, just before the bignum comparison).

Whereas in Chibi Scheme you get:

> (< (+ 9007199254740993 0.01) 9007199254740993)
#f

(the bignum on the left is converted to a floating-point number and stays that way during the floating-point comparison).

mnieper commented 2 years ago

From the wording of the specifications of exact and inexact in the report (R7RS), it seems (unless an implementation restriction is hit) that the following should be true:

(= z (exact z))

and

(= z (inexact z))

for all finite numbers z. If = is implemented by converting all arguments to exact ones when applied to arguments of mixed exactness, the second assertion would only hold when (exact (inexact z)) for any exact z would evaluate to z. This is, in general, not the case for a typical implementation, which supports only a finite number of inexact values but a countably infinite number (at least in principle) of exact values.

On the other hand, a typical implementation can implement exact to be a section of inexact, that is (inexact (exact z)) evaluates to z for any inexact z. If = converts all arguments to inexact values in case of mixed exactness, the above two assertions will then hold, but not the transitivity.
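(Concretely, with IEEE binary64 flonums — my illustration:)

> (exact 0.1)
3602879701896397/36028797018963968
> (inexact (exact 0.1))    ; exact is a section of inexact
0.1
> (exact (inexact 1/3))    ; but inexact is not injective on exacts
6004799503160661/18014398509481984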

luke21923 commented 2 years ago

So, you imply that R7RS compliance (if transitivity of = applied to arguments of mixed exactness is required) cannot be achieved by either one of those two simple strategies (apply exact or inexact to all arguments of = in case of mixed exactness).

mnieper commented 2 years ago

So, you imply that R7RS compliance (if transitivity of = applied to arguments of mixed exactness is required) cannot be achieved by either one of those two simple strategies (apply exact or inexact to all arguments of = in case of mixed exactness).

It seems so (unless we have an order-preserving bijection between exact and inexact numbers). In fact, transitivity together with the two assertions about exact and inexact implies that both procedures are mutual inverses.
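(Spelled out: for exact z, transitivity applied to (= z (inexact z)) and (= (inexact z) (exact (inexact z))) gives (= z (exact (inexact z))); since = restricted to exact arguments is genuine equality, (exact (inexact z)) must be z. Symmetrically, (inexact (exact z)) must be z for inexact z. So exact and inexact would be mutually inverse bijections.)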

The way out is, in my opinion, to take the caveat about comparing inexact numbers as implying that transitivity in the case of inexact numbers may only be achieved approximately.

(I think there is an omission in the first sentence of your last paragraph. Did you mean: "that is (inexact (exact z)) evaluates to z for any inexact z"? You can just make an edit if that is the case.)

Thanks for the catch. I made the edit.

ashinn commented 2 years ago

The seeming paradox with (< (+ 9007199254740993 0.01) 9007199254740993) has nothing to do with transitivity or ordering.

The problem stems from (+ 9007199254740993 0.01) => 9007199254740992.0. You're adding a small, inexact amount to an exact number, creating an inexact result with a corresponding loss of precision (here small but you can give examples with larger loss). This kind of issue is expected in floating point arithmetic.

Where is it implied that (= z (inexact z)) be true? That clearly can't work.

I'm unconvinced ordering can't be made to work transitively. So far we have several implementations that seem to do it right and no counterexamples to prove them wrong.

ilammy commented 2 years ago

As an aside, I realized where my understanding was wrong.

Consider comparison between

  a = 295147905179352809472
  b = 295147905179352825855.0
  c = 295147905179352858624

If exact-inexact comparison is implemented by converting exact numbers into inexact representation (by deterministically choosing the closest inexact number), this breaks transitivity of equivalence because (= a b) and (= b c) are #t, but naturally (= a c) is not.

However, if exact-inexact comparison converts the inexact number b to some exact number deterministically, then transitivity is preserved, as the exact b cannot be equal to both a and c at the same time, which makes the result of (= a c) irrelevant for transitivity.
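(Concretely, assuming b reads as the nearest binary64 value, which is exactly 2^68 = 295147905179352825856:)

> (exact 295147905179352825855.0)
295147905179352825856
> (= 295147905179352809472 295147905179352825855.0)  ; under convert-to-exact
#f
> (= 295147905179352825855.0 295147905179352858624)  ; under convert-to-exact
#f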

The same reasoning applies to (total) ordering. Assuming conversion to inexact: if numbers are sufficiently close to be compared equal because of the imprecision of inexact numbers, then some of them are definitely not < or > than each other, and so no violation of transitivity occurs. Otherwise, if all numbers are distinguishable, they'd be properly ordered. Conversion to exact has this property as well; it does not matter for ordering.

Infinities and nans are not equal to any exact number, so again – no violation of transitivity here. < and > are always false if any argument is nan, so no violation here either. Infinities are distinguishable by < and > from any exact number, so transitivity will work for them as intended.

Since <= is effectively (lambda args (not (apply > args))), sans NaNs (and the same for >=), I believe the transitivity expectations would hold for them as well. At least, they should if exact-inexact comparison is handled by converting to exact.

mnieper commented 2 years ago

Where is it implied that (= z (inexact z)) be true? That clearly can't work.

In the specification of inexact, R7RS says that "If an exact argument has no reasonably close inexact equivalent (in the sense of =), then a violation of an implementation restriction may be reported."

Now, = is a binary-valued predicate, defining a discrete topology. "Reasonably close" can logically therefore only mean "equal (in the sense of =)" if the above sentence cited from the spec is supposed to have any meaning.

The following is subjective, of course, but I think that maintaining (= z (inexact z)) is more important and practically more useful (and also helps explain inexact) than forcing transitivity of =.

luke21923 commented 2 years ago

The seeming paradox with (< (+ 9007199254740993 0.01) 9007199254740993) has nothing to do with transitivity or ordering.

The problem stems from (+ 9007199254740993 0.01) => 9007199254740992.0. You're adding a small, inexact amount to an exact number, creating an inexact result with a corresponding loss of precision (here small but you can give examples with larger loss). This kind of issue is expected in floating point arithmetic.

My point is that the solution used by MIT Scheme to make = transitive when it is used with inexact arguments makes it harder to work with inexact numbers. Because when we work with both exact and inexact numbers, the arithmetic operations convert everything to inexact, while the arithmetic comparisons convert everything to exact. This solution buys the programmers working with inexact numbers nothing (when would they ever rely on transitivity over inexact numbers?), but it costs them: they have to protect every exact number from being converted back and forth to a floating-point value by encapsulating it inside an inexact form.

As a side note, if a number is inexact, it means that the comparisons applied to it yield inexact (wrong) results. If the procedure = does not maintain its main property (returning #t if and only if its arguments are equal), I fail to see an obligation to maintain a secondary property like transitivity. MIT Scheme maintains the transitivity property of =, sure, but (of course) not its equality property (so what's the point?):

1 ]=> (= 9007199254740993 9007199254740993.0)

;Value: #f

(the equality property of = cannot be maintained over inexact arguments unless inexact numbers have the same infinite precision as the exact numbers).

I think it all boils down to what it means to use numbers of mixed exactness in a single expression. For programmers of numerical algorithms, it is a shortcut allowing them to, for instance, add an exact (secondary) number to an inexact (main) number, without having to explicitly convert the exact number to an inexact value. For programmers of exact and deterministic algorithms, it is a shortcut allowing them to, for instance, add an inexact (secondary) number to an exact (main) number, without having to explicitly convert the inexact number to an exact value (which doesn't make sense to me, in the context of an exact and deterministic algorithm). I think the only existing users of expressions of mixed exactness are those implementing numerical algorithms, and the main target of their computation is inexact. Therefore, switching to the behaviour of automatically converting their inexact values to exact values before a comparison will only lead to code like this:

(< (+ 9007199254740993 0.01) 9007199254740993)

to be rewritten to code like this:

(< (+ 9007199254740993 0.01) (inexact 9007199254740993))

and no one will have any use for the transitivity of comparisons over inexact numbers. Except perhaps for those who use inexact values as keys to insert or extract elements in associative data structures (which doesn't make sense to me).

FrankHB commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless). The R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

The caveat is about to compare among inexact numbers, not covering the cases of comparing between an inexact number and an exact number here.

I don't see a meaningful difference here.

Comparing inexact numbers with each other is almost always implemented by directly comparing the underlying representations, that is, floating-point numbers, in most implementations. Typically there is no need to introduce any conversion because only one floating-point format is supported. The result is generally unreliable because there is no way to know from a value alone how precise it is. But by the floating-point format itself, any single defined operation is still accurate; if the user may assume the value is accurate, the comparison is reliable. Moreover, comparison of floating-point values (excluding NaNs) is transitive by definition of the underlying operation.

Comparing exact numbers with inexact numbers is radically different. Since exact numbers and inexact numbers are mandated to be distinct by the language specification, and they can't share a single representation, conversions are expected. Redundant conversions are wasteful, so there should be only one (same) conversion for each operand: either inexact to exact, or exact to inexact. Any conversion can be lossy in naive implementations, and the transitivity guarantee is immediately destroyed once a lossy conversion is introduced. Such unreliability is totally out of the users' control.

Note that lossless conversions are still practically implementable (e.g. with MPFR). So it is at least not meaningless for =. (Whether it is appropriate to require this property is a further question.)

The reasons for the unreliability are quite different, so the two cases deserve different treatment in this topic.

If that is not the intent, this seems strange. The requirement in R6RS (using the word "must") should be clear enough, and R7RS has even clearer wording. Even if there are general rules (if any), the specific requirement should override them, or it should simply not exist here (to prevent suspicious logical inconsistencies). But I have found no further explanation, e.g. in the errata. Moreover, notes are not normative rules; they should not determine the criteria of conformance on their own.

The note just highlights a specific case of what is said on page 11 (R6RS) about inexact numbers.

I see nothing close to this on page 11 except "approximate representation".

The question of transitivity is a mathematical property like whether + is addition. The reports say that (+ z1 z2 ...) evaluates to the addition of the arguments. Addition is clearly associative, so the report implies that + is associative.

However, in practice, this is not the case for adding inexact (IEEE) numbers. So claimed mathematical properties in the reports when applied to inexact numbers must have been meant to be taken modulo inexactness limitations. There is no reason to treat transitivity differently than the addition (or associativity) property, etc. Other interpretations lead to inconsistencies.

I agree that, ideally, = should not be handled specially or differently from +. However, the reports do have different descriptions of the requirements on different operations. I don't think the requirement on = is unintentional in all these reports, although it is quite suspicious in R6RS.

And I don't think the interpretation must follow the practice of IEEE arithmetic. On the contrary, R6RS (on page 11) explicitly states:

This report recommends, but does not require, that the IEEE floating-point standards be followed by implementations that use floating-point representations and that implementations using other representations should match or exceed the precision achievable using these floating-point standards.

FrankHB commented 2 years ago

I see the intention of the note as reflecting the precision problems of comparing small floating-point numbers.

Magnitude of numbers affects the ideal, mathematical interpretation of comparison. Implementation practicalities limit significant digits of inexact numbers – that's where they are inexact.

True. But that note is specifically about the problem of accuracy, not magnitude. With a given precision (an implementation parlance for the formal notion of "significant digits"), both accuracy and magnitude are limited.

Unless you're using arbitrary-precision inexact numbers and have a special category of "actually exact numbers that were read using inexact syntax", I don't think you can meaningfully achieve the behavior you seem to desire. While using something like MPFR for inexact numbers is a valid implementation approach, I doubt the reports have that in mind as a requirement.

Actually I'm not quite convinced of the feasibility of this requirement, for exactly this practical reason: introducing arbitrary-precision floating-point arithmetic seems too expensive in general. But there are also some counterarguments:

* The requirement seems to have a long history. No documents I found suggest formally removing the requirement.

* The numerical tower is already a non-trivial and costly feature. So are `call/cc`, `dynamic-wind` and more. Many of them have stricter and mandatory requirements (i.e. there are fewer optional parts). These features distinguish Scheme among many other languages.

mnieper commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless). In R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

The caveat is about comparing inexact numbers with one another; it does not cover the case here of comparing an inexact number with an exact number.

I don't see a meaningful difference here.

Comparing inexact numbers with each other is almost always implemented by directly comparing the underlying representations, that is, floating-point numbers, in most implementations. Typically there is no need to introduce any conversion because only one floating-point format is supported. The result is generally unreliable because there is no way to know from a value alone how precise it is. But by the floating-point format itself, any single defined operation is still accurate; if the user may assume the value is accurate, the comparison is reliable. Moreover, comparison of floating-point values (excluding NaNs) is transitive by definition of the underlying operation.

What you describe is that the Scheme procedures involving inexact numbers are functions in the mathematical sense. Accuracy is a different thing. Even if I know that 1.0 represents the exact value, (sin 1.0) won't, in general, represent the exact value of the sine of its argument.

Comparing exact numbers with inexact numbers is radically different. Since exact numbers and inexact numbers are mandated to be distinct by the language specification, and they can't share a single representation, conversions are expected. Redundant conversions are wasteful, so there should be only one (same) conversion for each operand: either inexact to exact, or exact to inexact. Any conversion can be lossy in naive implementations, and the transitivity guarantee is immediately destroyed once a lossy conversion is introduced. Such unreliability is totally out of the users' control.

This is from an implementation perspective; from the abstract point of view (which is basically taken when exactness and inexactness are introduced), I still see no difference.

Note that lossless conversions are still practically implementable (e.g. with MPFR). So it is at least not meaningless for =. (Whether it is appropriate to require this property is a further question.)

Meaningless in the logical sense; I never meant meaningless in the sense of whether it can be implemented or not. You don't have to go as far as GNU MPFR, by the way. Just think of a Scheme system that only has one number representation (all rationals) and each number comes in two colors (exact and inexact).
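(A sketch of that thought experiment, with hypothetical names:)

(define-record-type <colored-number>
  (make-colored value exactness)   ; value: an exact rational; exactness: #t or #f
  colored?
  (value colored-value)
  (exactness colored-exact?))

;; Equality compares the underlying rational and ignores the color, so it
;; is transitive, and a number is always equal to its recolored version:
(define (color= a b)
  (= (colored-value a) (colored-value b)))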

If that is not the intent, this seems strange. The requirement in R6RS (using the word "must") should be clear enough, and R7RS has even clearer wording. Even if there are general rules (if any), the specific requirement should override them, or it should simply not exist here (to prevent suspicious logical inconsistencies). But I have found no further explanation, e.g. in the errata. Moreover, notes are not normative rules; they should not determine the criteria of conformance on their own.

The note just highlights a specific case of what is said on page 11 (R6RS) about inexact numbers.

I see nothing close to this on page 11 except "approximate representation".

My mistake. I think I meant the last paragraph of 3.2 on page 10.

The question of transitivity is a mathematical property like whether + is addition. The reports say that (+ z1 z2 ...) evaluates to the addition of the arguments. Addition is clearly associative, so the report implies that + is associative. However, in practice, this is not the case for adding inexact (IEEE) numbers. So claimed mathematical properties in the reports when applied to inexact numbers must have been meant to be taken modulo inexactness limitations. There is no reason to treat transitivity differently than the addition (or associativity) property, etc. Other interpretations lead to inconsistencies.

I agree that, ideally, = should not be handled specially or differently from +. However, the reports do have different descriptions of the requirements on different operations. I don't think the requirement on = is unintentional in all these reports, although it is quite suspicious in R6RS.

And I don't think the interpretation must follow the practice of IEEE arithmetic. On the contrary, R6RS (on page 11) explicitly states:

This report recommends, but does not require, that the IEEE floating-point standards be followed by implementations that use floating-point representations and that implementations using other representations should match or exceed the precision achievable using these floating-point standards.

That's true; but the standard certainly does not want to prohibit using the IEEE floating-point standards either.

mnieper commented 2 years ago

Actually I'm not quite convinced of the feasibility of this requirement, for exactly this practical reason: introducing arbitrary-precision floating-point arithmetic seems too expensive in general. But there are also some counterarguments:

* The requirement seems to have a long history. No documents I found suggest formally removing the requirement.

Some parts of the report may have only survived because of copy-and-paste and because no one had the time to rethink a particular aspect. This is especially true for the requirement we are talking about as part of R6RS.

That said: As I wasn't there at the time, this is just a personal guess.

* The numerical tower is already a non-trivial and costly feature. So are `call/cc`, `dynamic-wind` and more. Many of them have stricter and mandatory requirements (i.e. there are fewer optional parts). These features distinguish Scheme among many other languages.

Things like call/cc and dynamic-wind (which don't have to be costly just by being provided, by the way) are meaningful and useful. Forcing transitivity of = is not. Worse, it makes other assumptions (that are also implicit in the reports) like (= z (inexact z)) impossible (unless one has a bijection between exact and inexact numbers).

FrankHB commented 2 years ago

The seeming paradox with (< (+ 9007199254740993 0.01) 9007199254740993) has nothing to do with transitivity or ordering. The problem stems from (+ 9007199254740993 0.01) => 9007199254740992.0. You're adding a small, inexact amount to an exact number, creating an inexact result with a corresponding loss of precision (here small but you can give examples with larger loss). This kind of issue is expected in floating point arithmetic.

My point is that the solution used by MIT Scheme to make = transitive when it is used with inexact arguments makes it harder to work with inexact numbers. Because when we work with both exact and inexact numbers, the arithmetic operations convert everything to inexact, while the arithmetic comparisons convert everything to exact. This solution buys the programmers working with inexact numbers nothing (when would they ever rely on transitivity over inexact numbers?), but it costs them: they have to protect every exact number from being converted back and forth to a floating-point value by encapsulating it inside an inexact form.

As a side note, if a number is inexact, it means that the comparisons applied to it yield inexact (wrong) results. If the procedure = does not maintain its main property (returning #t if and only if its arguments are equal), I fail to see an obligation to maintain a secondary property like transitivity. MIT Scheme maintains the transitivity property of =, sure, but (of course) not its equality property (so what's the point?):

Besides the conformance issues, the real problem here is how to model equality for Scheme numbers.

For many reasons, Scheme numbers cannot be identical to mathematical ones, so the rules can differ. But too many differences will make the interface too confusing to use; staying reasonably close to the mathematical rules is generally good.

Transitivity is surely not a distinguishing property for defining = (it is actually shared by several operations in the reports), but it is at least clear in both Scheme and math (whether or not the operand is a mathematical number; it can actually work for the operands of any pure binary procedure). This is quite rare, because inexact numbers break far more assumptions that hold in math, so you need more patches elsewhere.

The exact semantics of = is unclear in the formal sense mostly because inexact numbers are artificial beasts outside mathematically defined concepts. Copying the rules from IEEE and other normative specifications will not help to fill the gap, because that would leak implementation details and totally defeat the purpose of inexact numbers (as opposed to just having things like "floating-point numbers").

There are too many choices in building the mapping from the remaining things to ones defined by math. Operationally, voiding the transitivity requirement is definitely not a good beginning, because it does not reduce any of the choices; on the contrary, it just provides more to choose from.

1 ]=> (= 9007199254740993 9007199254740993.0)

;Value: #f

(the equality property of = cannot be maintained over inexact arguments unless inexact numbers have the same infinite precision as the exact numbers).

I think it all boils down to what it means to use numbers of mixed exactness in a single expression. For programmers of numerical algorithms, it is a shortcut allowing them to, for instance, add an exact (secondary) number to an inexact (main) number, without having to explicitly convert the exact number to an inexact value. For programmers of exact and deterministic algorithms, it is a shortcut allowing them to, for instance, add an inexact (secondary) number to an exact (main) number, without having to explicitly convert the inexact number to an exact value (which doesn't make sense to me, in the context of an exact and deterministic algorithm). I think the only existing users of expressions of mixed exactness are those implementing numerical algorithms, and the main target of their computation is inexact. Therefore, switching to the behaviour of automatically converting their inexact values to exact values before a comparison will only lead to code like this:

(< (+ 9007199254740993 0.01) 9007199254740993)

to be rewritten to code like this:

(< (+ 9007199254740993 0.01) (inexact 9007199254740993))

and no one will have any use for the transitivity of comparisons over inexact numbers. Except perhaps for those who use inexact values as keys to insert or extract elements in associative data structures (which doesn't make sense to me).

Numerical programming is a plausible area here, but given that most such work is done in languages with explicit manifest typing, it is not as interesting as it sounds.

For a language with latent typing like Scheme, assumptions about operations (like transitivity) make it far easier to statically analyze specific properties for program transformation. Surely this won't make much sense to you (and actually to most users), because such assumptions are mostly not intended to be consumed by programmers, but by program analyzers, esp. the optimizer. With transitivity on certain operands, an optimizer can just drop some of the operands without additional proofs. Such proofs are generally difficult, in particular in a language without a sophisticated manifest type system.
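(For instance — my illustration, not from the reports — with transitivity an optimizer may rewrite

(and (= a b) (= b c) (= a c))   ; the third test is implied by the first two
=> (and (= a b) (= b c))

without proving anything about the dynamic types of a, b and c.)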

mnieper commented 2 years ago

There are too many choices in building the mapping from the remaining things to ones defined by math. Operationally, voiding the transitivity requirement is definitely not a good beginning, because it does not reduce any of the choices; on the contrary, it just provides more to choose from.

In fact, dropping this requirement is crucial to give any guarantees for exact and inexact.

Numerical programming is a plausible area here, but given that most such work is done in languages with explicit manifest typing, it is not as interesting as it sounds.

For a language with latent typing like Scheme, assumptions about operations (like transitivity) make it far easier to statically analyze specific properties for program transformation. Surely this won't make much sense to you (and actually to most users), because such assumptions are mostly not intended to be consumed by programmers, but by program analyzers, esp. the optimizer. With transitivity on certain operands, an optimizer can just drop some of the operands without additional proofs. Such proofs are generally difficult, in particular in a language without a sophisticated manifest type system.

While I generally agree with you here, making use of the transitivity requirement is, even in view of automated program analyzers, quite far-fetched in my opinion.

In fact, along the same lines, one could argue that one should not drop the requirement that (= z (inexact z)) always yields true. But this requirement is actually useful in practice and may be effectively useful for program analyzers.

Just adding arbitrary requirements to make the language more rigid doesn't help.

FrankHB commented 2 years ago

Where is it implied that (= z (inexact z)) be true? That clearly can't work.

In the specification of inexact, R7RS says that "If an exact argument has no reasonably close inexact equivalent (in the sense of =), then a violation of an implementation restriction may be reported."

This is not a requirement, but a permission. It can't be relied on by a portable program.

Now, = is a binary-valued predicate, defining a discrete topology. "Reasonably close" can logically therefore only mean "equal (in the sense of =)" if the above sentence cited from the spec is supposed to have any meaning.

The following is subjective, of course, but I think that maintaining (= z (inexact z)) is more important and practically more useful (and also helps explain inexact) than forcing transitivity of =.

Also subjectively, I fail to see how practically useful maintaining (= z (inexact z)) is.

FrankHB commented 2 years ago

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless). In R7RS, the same general caveat can be found on page 36, so I would say that Chibi's behavior is compliant.

For R6RS, returning #t is correct because of the general caveat noted on page 42 that comparing inexact numbers isn't reliable (so that the question of exact transitivity is meaningless).

The caveat is about comparing inexact numbers with one another; it does not cover the case here of comparing an inexact number with an exact number.

I don't see a meaningful difference here.

Comparing inexact numbers with each other is almost always implemented by directly comparing the underlying representations, that is, floating-point numbers, in most implementations. Typically there is no need to introduce any conversion because only one floating-point format is supported. The result is generally unreliable because there is no way to know from a value alone how precise it is. But by the floating-point format itself, any single defined operation is still accurate; if the user may assume the value is accurate, the comparison is reliable. Moreover, comparison of floating-point values (excluding NaNs) is transitive by definition of the underlying operation.

What you describe is that the Scheme procedures involving inexact numbers are functions in the mathematical sense. Accuracy is a different thing. Even if I know that 1.0 represents the exact value, (sin 1.0) won't, in general, represent the exact value of the sine of its argument.

Well, yes and no. sin is inherently inexact in its results as represented in any positional numeral system, because its values are transcendental (and irrational), but many more computations are inexact merely because of the input format. Even if the gap between Scheme numbers and math numbers were filled, this would not change: (implementable) inexact numbers always expect some tolerance for errors. So users are required to differentiate which cases are really unreliable.

Comparing exact numbers with inexact numbers is radically different. Since exact numbers and inexact numbers are mandated to be distinct by the language specification, and they can't share a single representation, conversions are expected. Redundant conversions are wasteful, so there should be only one (same) conversion for each operand: either inexact to exact, or exact to inexact. Any conversion can be lossy in naive implementations, and the transitivity guarantee is immediately destroyed once a lossy conversion is introduced. Such unreliability is totally out of the users' control.

This is from an implementation perspective; from the abstract point of view (which is basically taken when exactness and inexactness are introduced), I still see no difference.

I do illustrate the details of some implementations (like floating-point formats) where there is no clear definition of the exactly desired things. But that is not the case here. The distinction between exact and inexact numbers is explicitly required by the language rules and exposed in the interface (Scheme procedures). Users must know the differences between them, instead of merely knowing they are all numbers.

Taking care about floating-point errors is a well-established caveat across programming languages. The intention of allowing inexact numbers to be implemented by floating-point numbers in Scheme shares the same point: users should take care of exact vs. inexact numbers when using quite a lot (if not most) of the numeric interface of the language. On the other hand, transitivity here is a notable exception, because it works smoothly over all numbers (actually not only numbers; it just happens to be provided here by several procedures accepting numbers).

Note that lossless conversions are still practically implementable (e.g. with MPFR). So it is at least not meaningless for =. (Whether it is appropriate to require this property is a further question.)

Meaningless in the logical sense; I never meant meaningless in the sense of whether it can be implemented or not. You don't have to go as far as GNU MPFR, by the way. Just think of a Scheme system that only has one number representation (all rationals) and each number comes in two colors (exact and inexact).

We would not have the transitivity problem at all if every Scheme system were that simple...

If that is not the intent, this seems strange. The requirement in R6RS (using the word "must") should be clear enough, and R7RS has even clearer wording. Even if there are general rules (if any), the specific requirement should override them, or it should simply not exist here (to prevent suspicious logical inconsistencies). But I have found no further explanation, e.g. in the errata. Moreover, notes are not normative rules; they should not determine the criteria of conformance on their own.

The note just highlights a specific case of what is said on page 11 (R6RS) about inexact numbers.

I see nothing close to this on page 11 except "approximate representation".

My mistake. I think I meant the last paragraph of 3.2 on page 10.

3.2 explicitly describes the requirements on exact arithmetic and its reliability. But it is not 3.4; it does not state that these are the only required rules.

As for inexact arithmetic, 3.2 requires only a little. This also does not rule out strengthening the requirements elsewhere.

And it is logically legitimate to put additional requirements elsewhere, because 3.4 gives overall rules for all numbers, and rules like transitivity can be consistent with the rules in 3.2 and 3.4.

So, I don't see that the transitivity requirement is invalid. Unless there is material convincingly proving that this is really an unintentional mistake, I don't think it can just be ignored. Even if it is not that useful in practice, it is at least worth some documentation.

FrankHB commented 2 years ago

There are too many choices in building the mapping from the remaining things to ones defined by math. Operationally, voiding the transitivity requirement is definitely not a good beginning, because it does not reduce any of the choices; on the contrary, it just provides more to choose from.

In fact, dropping this requirement is crucial to give any guarantees for exact and inexact.

  1. Some more detailed proof?
  2. I'm curious about the same question as @ashinn: how do the other implementations satisfying this requirement actually do it? Are they buggy on exact and inexact? Or do they simply not conform to R7RS?

Numerical programming is a plausible area here, but given that most such work is done in languages with explicit manifest typing, it is not as interesting as it sounds. For a language with latent typing like Scheme, assumptions about operations (like transitivity) make it far easier to statically analyze specific properties for program transformation. Surely this won't make much sense to you (and actually to most users), because such assumptions are mostly not intended to be consumed by programmers, but by program analyzers, esp. the optimizer. With transitivity on certain operands, an optimizer can just drop some of the operands without additional proofs. Such proofs are generally difficult, in particular in a language without a sophisticated manifest type system.

While I generally agree with you here, making use of the transitivity requirement is, even in view of automated program analyzers, quite far-fetched in my opinion.

In fact, along the same lines, one could argue that one should not drop the requirement that (= z (inexact z)) always yields true. But this requirement is actually useful in practice and may be effectively useful for program analyzers.

I don't see its usefulness, mainly because I can hardly imagine cases in which such patterns would be useful in a program or its intermediate representation:

  1. For human programmers, one may get inexact numbers from computation results of numbers not in the range of the rationals. But there are too few cases where one deliberately produces an inexact value from an exact number and then compares it with other numbers.
  2. For machines, inexact numbers provide too few details (boundary, precision, ...) to aid the reductions. Why use built-in inexact numbers instead of specialized records for reasoning?

Just adding arbitrary requirements to make the language more rigid doesn't help.

True, but it is already there. And removing an existing requirement will certainly reduce the current set of guarantees, without improving portability. (Probably the only lucky thing: it is buggy in more than one implementation; the requirement seems not widely honored.)

mnieper commented 2 years ago

Where is it implied that (= z (inexact z)) be true? That clearly can't work.

In the specification of inexact, R7RS says that "If an exact argument has no reasonably close inexact equivalent (in the sense of =), then a violation of an implementation restriction may be reported."

This is not a requirement, but a permission. It can't be relied on by a portable program.

What is implied is that, when (= z (inexact z)) is not #t, either (inexact z) is an arbitrary value or an implementation restriction is reported. Any practically useful Scheme implementation should not produce arbitrary values for (inexact z).

In fact, the transitivity requirement for = can be fulfilled by returning #f whenever comparing exact and inexact values. But that is likewise not very useful.
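(Spelled out — my sketch, not a serious proposal:

(define (degenerate= x y)
  (if (eq? (exact? x) (exact? y))   ; mixed exactness: always #f
      (= x y)
      #f))

No chain of #t results can ever cross the exact/inexact boundary, so transitivity holds vacuously there.)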

Now, = is a binary-valued predicate, defining a discrete topology. "Reasonably close" can logically therefore only mean "equal (in the sense of =)" if the above sentence cited from the spec is supposed to have any meaning. The following is subjective, of course, but I think that maintaining (= z (inexact z)) is more important and practically more useful (and also helps explain inexact) than forcing transitivity of =.

Also subjectively, I fail to see how practically useful maintaining (= z (inexact z)) is.

@luke21923 has some arguments for it.

Anyway, you seem to approach the whole thing mostly from a language-lawyer's perspective. As this is not what I would like to do, I am refraining from further discussions here.

luke21923 commented 2 years ago

The report is not a formal document; it requires interpretation using common sense. If a change doesn't make sense, or is bad, then it is very likely that the report doesn't require it. So it is a good thing that you provided a rationale for your proposal.

For a language with latent typing like Scheme, assumptions about operations (like transitivity) make it far easier to statically analyze specific properties for program transformation. Surely this won't make much sense to you (and actually to most users), because such assumptions are mostly not intended to be consumed by programmers, but by program analyzers, esp. the optimizer. With transitivity on certain operands, an optimizer can just drop some of the operands without additional proofs. Such proofs are generally difficult, in particular in a language without a sophisticated manifest type system.

This is interesting. I don't know enough about optimising compilers to tell how much more optimised a program can be with this change.

However, as I mentioned earlier, this potential speed gain comes at a sizeable cost in terms of code readability. Reading and writing arithmetical code is easier when mixed exact/inexact arguments are consistently converted to inexact numbers (for both + and <).

In Chibi Scheme, = is transitive if it is always applied to lists of (exclusively) exact arguments and to lists of (exclusively) inexact arguments. As proved below, R7RS-small does not require Chibi Scheme to make = transitive if it is ever applied to a list containing both exact and inexact arguments:

Language-lawyer proof

The rules stated in the definition of = are:

1) = returns #t if its arguments are equal, and #f otherwise
2) = is required to be transitive

Even though the report allows an implementation to use finite-precision floating-point numbers to represent all its inexact real numbers, such an implementation necessarily violates the above definition. For instance, here is a violation of rule number 1 in Chibi Scheme:

> (= 9007199254740992.0 9007199254740993.0)
#t

There are no requirements on the behaviour of such an implementation, concerning =, when the definition of = cannot possibly be respected. In particular, it is not specified whether the implementation must respect the complete definition on the maximum number of argument combinations, or respect the maximum number of rules on all argument combinations, or adopt any other behaviour.

There is a note (in R7RS-small page 36) explaining how an implementation can preserve rule number 2 on all argument combinations (if it so desires), when the implementation necessarily violates the definition of = because it represents inexact numbers as finite-precision floating-point numbers. This note states logical facts that can be deduced from previously stated requirements; it does not state any additional requirements. Also, the fact that rule number 2 contains the keyword "required" does not make it special or privileged compared to rule number 1, because rule number 1 is required as well.

FrankHB commented 2 years ago

The report is not a formal document; it requires interpretation using common sense. If a change doesn't make sense, or is bad, then it is very likely that the report doesn't require it. So it is a good thing that you provided a rationale for your proposal.

This is a document serving as a widely adopted de-facto standard with normative rules. The requirement is actually not a change, as it has been kept over the last several versions (though the wording has changed a bit). Nevertheless, I'm in doubt about whether it is really intended (in R6RS). However, there is still a lack of proof.

I have not effectively made any proposal, besides pointing out the things I've found. I will try to, if it helps.

Reading and writing arithmetical code is easier when mixed exact/inexact arguments are consistently converted to inexact numbers (for both + and <).

Although I agree that ideally + and < should be handled in the same way in this respect, I'm not very convinced by this point, due to the direction of the reasoning.

In my experience, programmers get things right easily once they know the exact rules that determine the outcome, not merely the results.

For +, we don't expect the results to be exact if any one of the operands is inexact, because it makes no sense to guarantee that any concrete result is correct in such cases as per the math rules (pretending exactness would not be correct). Breaking math properties on inexact number operands is a natural side effect. Notice that although it is not required here, some language implementations allow recovering such properties when users deliberately want them, even where this is not allowed by the specification of the number operations (e.g. unsafe math optimizations, including reordering the operands of + in languages using IEEE FP numbers).

However, a unary predicate like zero? should not follow similar rules, because it would be almost useless for inexact numbers if it did. Since inexact numbers are actually allowed, better to avoid such a design. Instead, there are caveats to users about the accuracy issue.

Binary predicates behave in a more complex way: as predicates, they should behave like zero? regarding the possible results; but they are also operations accepting more than one operand, so they allow a mixture of exact and inexact numbers as input.

The disagreement comes from the dilemma: when the behaviors conflict, and no math rule suggests that either is definitely incorrect, which should be chosen?

Without considering the impacts on implementations, I choose the former. Generally, when implementing a pure function (without any side effects in the call), the range of the results of the procedure is a more significant property than the constraints it places on its input.

(The nature of a predicate is often reflected in the procedure's name. In Scheme, = happens not to have a ? suffix, but some other procedures with the transitivity requirement do, e.g. char=?; in some other designs of Scheme-flavored languages, like Kernel, =? is used instead of =.)

So, the fact that = is a predicate is more important here than the fact that it is a binary arithmetic operation.

This interpretation also looks intentional in R7RS: the second note on page 36 mentions = and zero? together to show how inaccuracy makes the results unreliable.

For instance, here is a violation of rule number 1 in Chibi Scheme:

Although the rule of equality is not very clear, it still allows this result, because, from R7RS:

6.2.4 ... It is important to distinguish between mathematical numbers, the Scheme numbers that attempt to model them, the machine representations used to implement the Scheme numbers, and notations used to write numbers.

The rule of the equality relationship is defined mathematically on mathematical numbers, because, by common sense, no other definitions are available that do not expose implementation details (except comparing the written forms in the source, which is absurd). Computations on inexact numbers are explicitly allowed to use "approximate methods" in 6.2.4. So, this is not a violation.

The rule of transitivity is defined directly on "these predicates", i.e. it constrains the operands in a call, irrespective of the domains of the operands (so there is also no need to differentiate which kinds of numbers it constrains for this purpose). By common sense, "transitive" is defined by the transitive relation in math, which may cover any two operands in a call of these procedures, including =. Other procedures with this requirement likewise do not specify the relation on the domains of the operands (e.g. char=?).

Further, the first note on R7RS page 36 strongly supports this interpretation. The note is new in R7RS, so it is very likely that the authors of R7RS clearly knew about the potential ambiguity of the old specification, and so they finally added this note.

Also, the fact that rule number 2 contains the keyword "required" does not make it special or privileged compared to rule number 1, because rule number 1 is required as well.

True.

luke21923 commented 2 years ago

On the benefits of the proposal

I have not effectively made any proposal, besides pointing out the things I've found. I will try to, if it helps.

I assumed that you were proposing to make Chibi Scheme automatically convert the inexact arguments of =, <, >, <= and >= to exact numbers (with exact), when at least one of the arguments is exact. This is the behaviour of MIT Scheme, for instance.

For + [...]. Breaking math properties on inexact number operands is a natural side effect. [...]

However, a unary predicate like zero? should not follow similar rules, because it would be almost useless for inexact numbers if it did. Since inexact numbers are actually allowed, better to avoid such a design. Instead, there are caveats to users about the accuracy issue. [...]

zero? and = are useless for inexact (IEEE 754 binary64) numbers. Exact equality comparison involving an inexact floating-point number does not make sense:

> (= 3 (/ 0.6 0.2))
#f
> (zero? (- 3 (/ 0.6 0.2)))
#f

There is no point in trying to reverse that. Transitive or not, zero? and = (applied to inexact IEEE 754 binary64 arguments) are irrelevant. I really don't see how transitivity would make them useful again (except perhaps for the argument about optimisation that you mentioned previously).

The only comparison procedures that make sense when applied to one or more inexact (IEEE 754 binary64) arguments are <, >, <= and >=.

The disagreement comes from the dilemma: when the behaviors conflict, and no math rule suggests that either is definitely incorrect, which should be chosen?

Without considering the impacts on implementations, I choose the former. Generally, when implementing a pure function (without any side effects in the call), the range of the results of the procedure is a more significant property than the constraints it places on its input.

I am not convinced by that. In fact I don't understand how it is connected to any benefit of making = and < transitive.

I acknowledge that you are not convinced by the benefit of consistent conversion of arguments (for both + and <). I am not sure we will be able to agree on any of that.

This interpretation also looks intentional in R7RS: the second note on page 36 mentions = and zero? together to show how inaccuracy makes the results unreliable.

Let's look at that note:

Note: While it is not an error to compare inexact numbers using these predicates, the results are unreliable because a small inaccuracy can affect the result; this is especially true of = and zero?. When in doubt, consult a numerical analyst.

Yes, that note groups zero? and = together. That note says that comparison predicates applied to inexact numbers (presumably IEEE 754 floating-point numbers) are unreliable, and that this is especially true for = and zero?. It means that = and zero? are especially unreliable when applied to inexact (IEEE 754 binary64) arguments. In other words, as I said before, they are useless. I acknowledge that = and < are predicates.

Refutation of the language-lawyer proof

Now, let's talk about your arguments aimed at refuting my language-lawyer proof.

For instance, here is a violation of rule number 1 in Chibi Scheme:

> (= 9007199254740992.0 9007199254740993.0)
#t

Although the rule of equality is not very clear, it still allows this result, because, from R7RS:

6.2.4 ... It is important to distinguish between mathematical numbers, the Scheme numbers that attempt to model them, the machine representations used to implement the Scheme numbers, and notations used to write numbers.

The rule of the equality relationship is defined mathematically on mathematical numbers, because, by common sense, no other definitions are available that do not expose implementation details (except comparing the written forms in the source, which is absurd).

This invalidates my proof. It is not obvious (at least to me), but the report defines the rules of = as being valid over "Scheme numbers", not over the written form of the numbers in the source code. Two numbers in written form designate the same "Scheme number" if the procedure number->string returns the same result with both numbers as argument.

For instance:

> (number->string 9007199254740992.0)
"9007199254740992.0"
> (number->string 9007199254740993.0)
"9007199254740992.0"

Notwithstanding what the report says, I still consider the evaluation of (= 9007199254740992.0 9007199254740993.0) => #t to be erroneous.

Transitivity of <

= is useless on inexact finite-precision floating-point numbers, and, after all, I couldn't care less about what it returns (when applied to a mix of exact/inexact arguments). For the extremely rare cases (floating-point bit twiddling) where I would want to test the raw equality on inexact finite-precision floating-point numbers, I can afford to encapsulate the arguments inside inexact forms.

What really matters is the behaviour of the other comparison procedures <, >, >= and <=. Is the transitivity of those procedures broken? I am trying to find three (exact and/or inexact) numbers num1, num2 and num3 such that:

(< num1 num2) => #t
(< num2 num3) => #t
(< num1 num3) => #f

or such that:

(<= num1 num2) => #t
(<= num2 num3) => #t
(<= num1 num3) => #f

But I haven't found any example so far.

If the transitivity of < and friends is not broken, then one solution would be:

1) To automatically convert the inexact arguments of = with the procedure exact, into exact numbers (when at least one initial argument is exact), and
2) To leave <, >, >= and <= alone (keep the current behaviour of converting exact arguments to inexact when there is a mix of exact/inexact arguments).
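(For concreteness, a sketch of part 1 — a hypothetical name, and finite real arguments only; NaNs and infinities still need the separate handling discussed earlier in the thread:)

(define (exact-compare= . args)
  ;; Compare everything in exact space; each finite flonum denotes
  ;; exactly one rational, so this conversion never loses information.
  (apply = (map (lambda (x) (if (inexact? x) (exact x) x)) args)))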

This would introduce an unattractive discrepancy in the behaviours of the comparison procedures though.

ashinn commented 2 years ago

I'm enjoying this discussion but having trouble keeping up, and also think the Chibi issue tracker isn't the best place for it. Someone should probably summarize and ask for wider feedback on the WG2 list.

For Chibi I remember I had already partially implemented this logic - bignum/flonum ordering works by converting the flonum to exact. I had overlooked this for fixnums and ratios so have fixed that.
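(For reference, the strategy described above, sketched in Scheme rather than Chibi's actual C — a hypothetical helper, using nan? and infinite? from (scheme inexact):)

(define (bignum<flonum? b f)          ; is exact integer b < flonum f?
  (cond ((nan? f) #f)                 ; nothing is ordered against a NaN
        ((infinite? f) (positive? f)) ; b < +inf.0, never < -inf.0
        (else (< b (exact f)))))      ; exact comparison, no rounding of b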