my view of the nan/inf arithmetic

Submitted by: whatever

I am new to REBOL, and I ask you to forgive me if the question I raise is already settled and not open to discussion.

I have done a little searching, and from what I understand, floating point arithmetic (FPA) in REBOL is intentionally limited so that operands cannot be Not-a-Numbers or +/-Infinities. I am not sure what led the designers of the language to impose such a limitation. Probably the simple fact that these special values are not represented by digits and thus require a special case in REBOL code.

I have also found attempts by REBOL users to implement proper NaN/Inf handling on top of REBOL, which have evidently proven unsuccessful and have been regarded as exploiting a bug in the language or compiler.

This fact alone is a testimony to the importance of having such behavior available in the language. Yet it somehow remains ignored. Look at what came of C/C++! Most features were not designed, but found their way around the design. Now, do you like it?

My current involvement with REBOL is only at the stage of deciding whether it can be useful to me. But the more I learn about it, the more I think that REBOL's approach to programming has its point and that it might one day make our syntactically static languages obsolete. And when I discovered this limitation of floating point arithmetic, I wasn't surprised, but it saddened me.

My point is that not every math operation is performed on USD or Euro. I often do more or less scientific calculations, and when I do, I do not even consider languages without NaN/Inf handling as capable of FPA at all. Why?

An example. I have an algorithm that takes a matrix of values and transforms it into another matrix of values, in dozens of steps. The data provided to it is empirical and might be incompletely defined, or might contain invalid or doubtful values, which may indeed yield doubtful results or may be handled in some of the steps. But no matter what the data is and how it is processed, the algorithm faithfully calculates the result. I tend to use D for such tasks, because of its silent FPA and its complex number support, and this is what I get:

- no special cases at all: the common algorithm works fine in all situations; no checks on every second line and all that stuff
- I don't have to waste my time thinking "what does a zero or inf value at calculation stage N mean and how do I handle it" (not that I could handle such a task anyway, since intermediate values might not have any meaning)
- the probable doubtfulness of the result fully reflects that of the source data, and is expected, even encouraged
- the code is as fast as it should be

Implementing the same with all the checks and handling would require at least 3-4x the time and at least 2x the code. Usually there is a way to reinvent your own floating point class that works silently, but then the problem arises of interfacing this class with the rest of the language, which turns math programming from grace into torture. That's why, again, I do not even consider using languages like, say, Python for FPA.

You think 1/0 is a wrong operation, but any educated person would expect it to yield +inf. You think sqrt(-1) is a bad idea, but it simply gives you the imaginary 'one'. You think NaN+NaN/NaN will break the program, but it doesn't. This is the way our CPUs work. IEEE are not some bunch of mindless zombies who don't even know how to design things. Thing is, I never use any checks on floating point values, finding them redundant and destructive in my applications.

Now, why am I not happy with D? Because I often need reflection: code that is given other (source) code to execute. I also need graphics and a GUI. D lacks both.

I strongly recommend thinking again about NaN/Inf handling, and about how its presence or absence will affect REBOL code.

I hope I have presented a meaningful view on the subject. And don't forget that if you do not design a feature that coders need, the feature will become a crutch, a rusty nail in the coffin of a beautiful language idea.

Imported from: CureCode [ Version: alpha 111 Type: Wish Platform: All Category: Math Reproduce: Always Fixed-in: none ]
Imported from: https://github.com/rebol/rebol-issues/issues/1902

Comments:

Rebolbot commented on Feb 19, 2012:

Submitted by: BrianH

See also #1029 and #1717.

The main problem is that for people doing scientific calculations Inf and NaN are values that are expected, and which they have been trained to deal with and require in some cases.

However, for people not doing scientific calculations (i.e. regular programmers), those are values that they need to avoid. These programmers would benefit from having errors triggered as soon as possible and as close to the source of the error as possible, so they can fix the erroneous code that generated these bad values. Propagating Inf or NaN instead of triggering an error immediately would be the worst thing possible.

You might notice that these are conflicting requirements. This is why scientists often use different programming languages than the ones that regular programmers use, or special versions of the regular languages, or special libraries for the regular languages. And this is why the vast majority of programmers don't use those languages, or versions, or libraries.

So, what would be the best way for REBOL to support those conflicting requirements?

Rebolbot commented on Feb 19, 2012:

Submitted by: Ladislav

"You think 1/0 is a wrong operation, but any educated person would expect it to yield +inf." - I am sorry, but I have to disagree. An educated person should know that there is no convincing reason why the result should be +inf. A NaN would be much more accurate.

Rebolbot commented on Feb 19, 2012:

Submitted by: Ladislav

Your idea that the arithmetic using +inf, -inf and NaNs could be more convenient is interesting. (Although the idea that 1 / 0 = +inf is debatable.)

However, we should not forget that there are also comparisons reflecting the ordering of numbers. I do understand how you would handle arithmetic operations like addition, subtraction, multiplication and division. But your wish/proposal (implementable in REBOL in principle) does not resolve the more complicated issues with the other functions (e.g. comparisons). It is likely that what you find convenient in one case will backfire on someone else in another.

Rebolbot commented on Feb 19, 2012:

Submitted by: Ladislav

"You think sqrt(-1) is a bad idea, but it simply gives you the imaginary 'one'." - well, that is the most convenient result unless you do want to stay in the range of real numbers and would have to check where a non-real slipped in. However, I do not want to argue that the imaginary one is generally more convenient than a real NaN or a triggered error in this case.

Rebolbot commented on Feb 19, 2012:

Submitted by: Ladislav

Andreas thinks that (at the cost of a more complicated interpreter) it would be convenient to make this FP alternative switchable.

Rebolbot commented on Feb 20, 2012:

Submitted by: BrianH

Haven't we already learned from the use of things like system/options/binary-base that modal settings are a bad idea? It's much better to do dynamic scope modes, that get reset as soon as the function that set the mode returns. Something like this (with almost any other name):

sci-mode [code that has inf and nan handling] ; after the block returns, the mode is reset

Dynamic scope, stack-bound, task-local. Doesn't affect other tasks unless they also call this function. If there's some kind of global mode test, make it a function instead of a setting. That way you won't have code that leaves the mode in an unexpected state - if the code returns, the mode is back to what it was when you called it.
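
For illustration only, the scoped mode described above could be sketched in Python roughly as follows. Nothing here is REBOL or an existing API: the names sci_mode, in_sci_mode and divide are hypothetical, and the sketch assumes a context-local flag is an acceptable stand-in for "dynamic scope, task-local".

```python
import contextvars
import math
from contextlib import contextmanager

# Hypothetical context-local flag: True while NaN/Inf propagation is enabled.
_sci_mode = contextvars.ContextVar("sci_mode", default=False)


@contextmanager
def sci_mode():
    """Enable NaN/Inf propagation for the enclosed block only."""
    token = _sci_mode.set(True)
    try:
        yield
    finally:
        # Restore the previous mode even if the block raises.
        _sci_mode.reset(token)


def in_sci_mode():
    """Mode test exposed as a function rather than a mutable setting."""
    return _sci_mode.get()


def divide(a, b):
    """Hypothetical division: raises by default, propagates in sci mode."""
    if b != 0:
        return a / b
    if not in_sci_mode():
        raise ZeroDivisionError("division by zero")
    if a == 0 or math.isnan(a):
        return math.nan
    # The sign of the infinity follows the signs of both operands.
    return math.copysign(math.inf, a) * math.copysign(1.0, b)
```

Inside `with sci_mode():` a call like divide(1.0, 0.0) returns +inf; once the block exits, the same call raises again, so no caller can leave the mode switched on by accident.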

Rebolbot commented on Feb 21, 2012:

Submitted by: Ladislav

However, if the simplicity (no different computation models allowed) is preferred, Andreas explicitly stated that he prefers the current error triggering and he does not want to replace it by the other alternative.

Rebolbot commented on Feb 28, 2012:

Submitted by: whatever

I agree that many (perhaps, a majority of) programmers wouldn't want nan/inf propagation. I also strongly agree that this nan/inf-enabled arithmetic should be bound to a scope; this way one is able to simultaneously use code that relies on one or the other FPA model.

As for comparative and arithmetic operations, they have an obvious meaning:

1) +inf and -inf are meant to be the biggest positive and negative numbers.
2) Comparing +inf to +inf and -inf to -inf is mostly meaningless and is only a matter of convention; I think +inf should be equal to another +inf, as well as -inf to another -inf.
3) NaN means 'undefined'. Comparing anything with nan is meaningless, and any usual comparison operation with at least one nan argument should yield a false result. Why? Because this is natural to expect: 1 > nan? false; 1 < nan? false; nan = nan? false.
4) Note that this behavior of returning false can be negated if the programmer wants his condition to include nans as well: instead of ((a < 0) or (isnan a)) one may simply write (not (a >= 0)). This may seem tricky at first, but it's not once you get used to it. And it is natural: reading (not (a >= 0)) as "a is neither bigger than nor equal to zero", one would expect it to be true when a is nan.
5) Arithmetic operations (+-/*) on nan are undefined, and it makes sense for them to return another nan.
6) +inf - +inf = nan, as in many other cases.
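
As a quick check, the rules listed above can be written down in Python, whose float comparisons follow IEEE 754. This is only a sketch; negative_or_nan is a hypothetical helper name for the (not (a >= 0)) idiom.

```python
import math

nan = math.nan
inf = math.inf

# Rules 1-2: infinities bound every finite number and equal themselves.
assert -inf < -1e308 < 1e308 < inf
assert inf == inf and -inf == -inf

# Rule 3: every ordinary comparison involving NaN is false.
assert not (1 > nan)
assert not (1 < nan)
assert not (nan == nan)


# Rule 4: negating the opposite comparison also catches the NaN case.
def negative_or_nan(a):
    return not (a >= 0)


# Rules 5-6: arithmetic on NaN, and inf - inf, both give NaN.
assert math.isnan(nan + 1.0)
assert math.isnan(inf - inf)
```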

This can be continued based on elementary logic, school arithmetic and common sense. Meaningless cases are always a matter of convention. But! I have a feeling that all this arithmetic is already done by the CPU itself, so there is no point in reinventing the wheel here.

sqrt(-1) leading to a nan or imaginary one is a matter of another discussion, since it involves complex arithmetic.

It is interesting, though, why you, Ladislav, would expect 1/0 to yield NaN. In the classic math taught in schools, +1/0 is +infinity. The rule is simply too common to depart from. One very useful example of having 1/0 = +inf can be shown with a tangent function:

tan(x) = sin(x)/cos(x)
with x = pi/2 + 2n*pi, n is integer, you always get +inf

An example. Let's say you look for the intersection of a horizontal line y=y1 with an arbitrary line y(x)=y2+x*tan, where tan is a cached tangent value, calculated or defined when the line is created. The answer is x0 = (y1-y2)/tan. It will work just fine for vertical lines (tan=+inf or -inf) without any special cases or other programming effort, even though we have an infinite value and even divide by it. It will also work for the case when both lines are the same: y1 = y2, tan=0. The result is NaN, an undefined value, meaning that there is no single intersection point. Quite straightforward, don't you think?
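
A rough Python sketch of this intersection follows. Python itself raises on float division by zero, so the sketch assumes a hypothetical helper ieee_div that emulates IEEE 754 division; the cached value called tan above is renamed slope here to avoid clashing with the function name. In a language with silent FPA the helper would be a plain division.

```python
import math


def ieee_div(a, b):
    """Hypothetical helper emulating IEEE 754 division (no exceptions)."""
    if b != 0:
        return a / b
    if a == 0 or math.isnan(a):
        return math.nan
    # Nonzero divided by a signed zero gives a signed infinity.
    return math.copysign(math.inf, a) * math.copysign(1.0, b)


def intersect_x(y1, y2, slope):
    """x at which the line y = y2 + x*slope meets the horizontal y = y1."""
    return ieee_div(y1 - y2, slope)
```

With an ordinary slope the function returns a finite x, with slope = +inf or -inf it returns 0 (the vertical line through x = 0), and with y1 = y2 and slope = 0 it returns NaN, all through the same single expression.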

Rebolbot commented on Mar 2, 2012:

Submitted by: Ladislav

"NaN means 'undefined'. Comparing anything with nan is meaningless and any usual comparison operation with at least one nan argument should yield false result; why? because this is natural to expect: 1 > nan? false; 1 < nan? false; nan = nan? false" - this looks logical at first sight; unfortunately, it violates a general logical principle of comparisons: reflexivity. You can violate any principle you like by creating an exception to it. However, that is usually too costly to be worth trying.

Rebolbot commented on Mar 2, 2012:

Submitted by: Ladislav

"Note that this behavior of returning false can be negated if the programmer wants his condition to include nans as well: instead of ((a < 0) or (isnan a)) one may simply write (not (a >= 0)). This may seem tricky at first but it's not when get used to it. And it is natural: reading (not (a >= 0)) as "a is not bigger nor equal to zero" one would expect to be true for the case when a is nan." - this only covers numbers, and you are out of luck when trying to apply it generally. For example, let's consider a case when you want to find out whether a given VALUE is in a given SERIES at index I. A comparison like VALUE = PICK SERIES I does not work, simply because it fails for NaNs, so you need to define a much more complicated version of it. (You are right that it can be done, but the legitimate question "Is it worth the complication?" should be asked.)

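The series lookup problem can be reproduced in Python, used here only as a stand-in for REBOL. The function names equal_at and same_at are hypothetical, and indexes are 0-based, unlike PICK.

```python
import math


def equal_at(series, i, value):
    """Plain equality, the analogue of VALUE = PICK SERIES I."""
    return series[i] == value


def same_at(series, i, value):
    """More complicated variant that treats a NaN as matching a NaN."""
    item = series[i]
    if isinstance(item, float) and isinstance(value, float):
        if math.isnan(item) and math.isnan(value):
            return True
    return item == value
```

With series = [1.0, float("nan"), "a"], the plain test finds 1.0 at index 0 but never finds the NaN at index 1; only the longer variant does.
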
Rebolbot commented on Mar 2, 2012:

Submitted by: Ladislav

"In classic math they teach in schools +1/0 is an +infinity." - I do have a degree in math and have never encountered it. That is because the

lim_{x->0} (+1 / x)

is actually a NaN. However, since

lim_{x->0+} (+1 / x) = +inf

and

lim_{x->0-} (+1 / x) = -inf

, it is possible to use a convention that

+1 / +0 = +inf

and

+1 / -0 = -inf

in engineering, which is what the engineers from IEEE picked for their FP arithmetic standard having available two "different" zeros. The trouble starts when you want to find out whether +0 and -0 are equal or not.
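
The two zeros can be inspected in Python as a small sketch. Python raises on float division by zero, so reciprocal below is a hypothetical helper that applies the IEEE 754 convention for 1/x by hand.

```python
import math

pos_zero = 0.0
neg_zero = -0.0

# The two zeros compare equal...
zeros_equal = (pos_zero == neg_zero)

# ...yet they carry different sign bits.
signs = (math.copysign(1.0, pos_zero), math.copysign(1.0, neg_zero))


def reciprocal(x):
    """Hypothetical IEEE-style 1/x: a signed zero gives a signed infinity."""
    if x == 0:
        return math.copysign(math.inf, x)
    return 1.0 / x
```

Here zeros_equal is true, while reciprocal(pos_zero) and reciprocal(neg_zero) are +inf and -inf, so equal inputs produce unequal results.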

Rebolbot commented on Mar 2, 2012:

Submitted by: Ladislav

"One very useful example of having 1/0 = +inf can be shown with a tangent function: tan(x) = sin(x)/cos(x) with x = pi/2 + 2n*pi, n is integer, you always get +inf"

You made an error above, confusing the tangent and arctangent functions. It is logical and correct to have

arctangent(+inf) = pi/2

since

lim_{x->+inf} arctangent(x) = pi / 2

, however, it is not true that

lim_{x->pi/2}tangent(x) = +inf

and therefore it does not make sense to define

tangent(pi/2) = +inf

In IEEE 754 the situation is specific, again, since there is no exact representation of pi/2; the "usual representation" is smaller than pi/2.
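
A short Python check of this point, assuming the platform's double precision math library: the tangent of the closest double to pi/2 is a large finite number, not an infinity, while the arctangent of +inf is exactly that double.

```python
import math

half_pi = math.pi / 2  # closest double to pi/2, slightly below the true value

# Large but finite, because half_pi is not exactly pi/2.
tan_half_pi = math.tan(half_pi)

# The arctangent of +inf rounds to the same double as pi/2.
atan_inf = math.atan(math.inf)
```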

Rebolbot commented on Mar 4, 2012:

Submitted by: whatever

About +1/0 -- obviously I meant +1/+0. Sorry for being inaccurate.

"however, it is not true that
lim_{x->pi/2}tangent(x) = +inf
and therefore it does not make sense to define
tangent(pi/2) = +inf"

Don't be so formal. I only sought to show that NaN-aware code can be simple and effective. The program may not care about pi/2 here at all, since you will probably get the tangent by dividing vector coordinates (cf. the well-known C function "atan2"), and this will automatically address most of the issues. Though a test for the sign of zero may still be required, since we don't want it to be -0.

"The trouble starts when you want to find out whether +0 and -0 are equal or not."

Practical FP comparison is to check whether a value lies inside a given interval. So as long as I care, +0 and -0 can relate in whichever way they want to.

"this looks logical at the first sight; unfortunately, it violates a general logical principle of comparisons: reflexivity"

Well, yes. It's an issue. There's no reflexivity defined for special values. But I can live with it.

"you want to find out whether a given VALUE is in a given SERIES at index I. The comparison like VALUE = PICK SERIES I does not work simply because it fails for NANS"

Okay. Agreed. Let there be simple bit-to-bit equality.
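
One possible reading of "bit-to-bit equality", sketched in Python with a hypothetical helper bits_equal that compares the 8-byte encodings of two doubles:

```python
import math
import struct


def bits_equal(a, b):
    """True when both doubles have the same 64-bit encoding."""
    return struct.pack("<d", a) == struct.pack("<d", b)
```

Under this test a NaN equals itself, but +0 and -0 become different values, and two NaNs with different payload bits would also differ.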

As for "Is it worth the complication?", isn't rebol's motto to simplify the code? It really does so for scientific calculations.

Rebolbot commented on Dec 6, 2012:

Submitted by: Ladislav

'As for "Is it worth the complication?", isn't rebol's motto to simplify the code? It really does so for scientific calculations. '

Funnily enough, I make my living doing scientific calculations. Thus, I think that my opinion on this has some significance. And, as you might have already guessed, I think that the oversimplification is just asking for trouble, and it would surely complicate my life and code.

As to "don't be so formal" - I am sorry, but I have to say that I am expected to know what I am talking about; that includes being sufficiently formal to be able to express exactly what I mean. I am not used to saying "the result is 2 and don't be so formal telling me it is 3".

"So as long as i care, +0 and -0 can relate in whichever way they want to."

But if they are equal (which is what you admit and I usually take for granted), then +1 / +0 cannot differ from +1 / -0. (This is the trouble you are trying to ignore until you find you have shot yourself in the foot.)

Oldes mentioned this pullrequest on Jun 22, 2018: FEAT: added support for loading NaN (1.#NaN) and Infinite (1.#INF) values and using them in decimal computations

Rebolbot added Type.wish and Status.important on Jan 12, 2016
