Nathan-Wall / proto

A programming language derived from JavaScript which emphasizes prototypes, integrity, and syntax.

Direct NaN and computed NaN #74

Open Nathan-Wall opened 10 years ago

Nathan-Wall commented 10 years ago

Working with NaN can be confusing and result in some strange issues. Consider the following JS:

console.log([ 1, 2, 3, NaN ].indexOf(NaN));

This logs -1, not 3 as might be expected, because JS does a strict equality comparison (===) in indexOf and NaN !== NaN.
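
For comparison, here is a small sketch of the same lookup done two ways in plain JS. The variable names are only illustrative; `findIndex` and `Number.isNaN` are standard ES6 built-ins.

```javascript
const values = [1, 2, 3, NaN];

// indexOf compares with ===, and NaN === NaN is false, so nothing matches.
const strictIndex = values.indexOf(NaN);

// Asking "is this element NaN?" directly does find it.
const predicateIndex = values.findIndex(Number.isNaN);

console.log(strictIndex, predicateIndex); // -1 3
```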

From a mathematical perspective

This actually makes sense from a mathematical perspective because NaN is the placeholder for indeterminate or undefined results of calculations. Therefore, two NaN results from separate calculations should not be considered equal.

For instance, it's not true that 0 / 0 === Math.sqrt(-1). They're not even the same class of answer.

0 / 0 is indeterminate, meaning (in short) that there isn't enough information provided in the calculation to arrive at a single result (or a finite number of results). Practically anything could be considered a "correct" answer to the question, "What is zero of this divided into zero parts?" Depending on context, there could be lots of different answers that make a little bit of sense. Let's just ask the question a different way: "Zero multiplied by what is equal to zero?" 3 is a "correct" answer to this question, but it's no more correct than the answer 5 or -21. Therefore, we call the answer to this question indeterminate, meaning we can't determine an answer to the question.

On the other hand, Math.sqrt(-1) is undefined, meaning we just don't have any value available in our system that can answer this question. However, from algebra we remember that the square root of -1 is i. The complex numbers provide a system for answering these kinds of questions, but JavaScript doesn't have complex numbers, so it can't answer the question. 3 isn't a correct answer. Neither is 5 or -21.

But even though 0 / 0 and Math.sqrt(-1) are very different kinds of unanswerable problems, IEEE 754 provides NaN as an answer to both of them. However, since they aren't really the same answer, JS will return false if you do 0 / 0 === Math.sqrt(-1) or even NaN === NaN for that matter.
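
A quick check of the behaviour described above, in plain JS (the variable names are mine):

```javascript
const indeterminate = 0 / 0;            // not enough information for one answer
const undefinedInReals = Math.sqrt(-1); // no real number answers this

// Both calculations produce NaN...
console.log(Number.isNaN(indeterminate), Number.isNaN(undefinedInReals)); // true true

// ...but the two results never compare equal, and neither equals itself.
const equalToEachOther = indeterminate === undefinedInReals;
const equalToItself = indeterminate === indeterminate;
console.log(equalToEachOther, equalToItself); // false false
```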

So from a mathematical perspective, it does make sense that NaN !== NaN, and it helps prevent one kind of bug (where NaN is the result of mathematical operations). But in other contexts it can be confusing and cause a different kind of bug: one where the programmer specifically wants to identify NaN, perhaps inside an abstract operation whose code doesn't know what kind of comparison the caller intends.

Two kinds of questions

There are two kinds of questions a programmer might want to ask a computer about NaN:

  1. "Should the result of this operation be considered equal to the result of that operation?"
  2. "Is the result of this operation NaN?"

By default, ECMAScript 5 answers the first question, while providing the isNaN function to answer the second question. In ECMAScript 6, the SameValueZero internal operation has been added, and is being used in various places to answer the second question. For instance, Set.prototype.has will use SameValueZero, which considers NaN equal to NaN. So set.has(NaN) can return true if the set has a NaN value in it. (An Array.prototype.has or Array.prototype.contains has been discussed that would behave the same way, though it doesn't appear to be in the draft yet.)
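
To make the difference concrete, here is a short sketch of the two answers side by side. It assumes a runtime with ES6 `Set` and with `Array.prototype.includes`, which was later standardized in ES2016 and also uses SameValueZero.

```javascript
const list = [1, 2, 3, 0 / 0];

// Question 1 ("are these two results equal?"): strict equality says no.
const indexOfNaN = list.indexOf(NaN);

// Question 2 ("is there a NaN here?"): SameValueZero treats NaN as equal to NaN.
const setHasNaN = new Set(list).has(NaN);
const includesNaN = list.includes(NaN);

// The ES5 way to ask question 2 about a single value.
const lastIsNaN = isNaN(list[3]);

console.log(indexOfNaN, setHasNaN, includesNaN, lastIsNaN); // -1 true true true
```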

ECMAScript is not actually tuning the way it answers the question for each individual API based on what it thinks the programmer is asking. Rather, TC39 seems to have decided that most people don't ask question 1 but instead ask question 2, and therefore future APIs should answer question 2.

Proposal: Direct and Computed NaN

Instead of trying to choose which question to answer all the time or based on the API used, introduce two different NaN values and answer the question based on which value is used.

  1. Direct NaN would be the value you get by typing NaN.
  2. Computed NaN would be the value you get by performing a computation that results in NaN.

Rule: In a comparison between two values, if either value is direct NaN the second question should be answered. If both values are computed NaN, the first question should be answered.

Therefore, in Proto 0 / 0 == (-1) ^ (1 / 2) would be false but 0 / 0 == NaN would be true and (-1) ^ (1 / 2) == NaN would be true. (That would mean equality wouldn't be transitive when NaN was involved.)

This makes sense because a programmer who types NaN most likely means "any NaN", while a programmer who computes NaN most likely means "this NaN".

// Does the array contain any NaN?
array.indexOf(NaN);

// Does the array contain this value?
array.indexOf(x / y);

In the second example above, if x and y happen to both be 0 then the indexOf would always result in -1, even if the Array contains a NaN (even if it contains a NaN that was the result of 0 / 0). This is because two computed NaNs should not be considered equal to each other.

However, the first example would result in an index if any NaN is in the array because the programmer is clearly asking if there is a NaN in the array.
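
Since JS exposes only one observable NaN, the rule can only be modelled there, not implemented directly. The following is a rough sketch under that assumption: `DIRECT_NAN`, `COMPUTED_NAN`, `protoDivide`, `protoEquals` and `protoIndexOf` are all hypothetical names made up for illustration, not part of Proto or JS.

```javascript
// Hypothetical stand-ins for the two proposed NaN values.
const DIRECT_NAN = Symbol("direct NaN");     // what typing `NaN` would give
const COMPUTED_NAN = Symbol("computed NaN"); // what a NaN-producing computation would give

const isAnyNaN = (value) => value === DIRECT_NAN || value === COMPUTED_NAN;

// Division that reports a NaN result as a computed NaN.
function protoDivide(x, y) {
  if (isAnyNaN(x) || isAnyNaN(y)) return COMPUTED_NAN;
  const result = x / y;
  return Number.isNaN(result) ? COMPUTED_NAN : result;
}

// Equality under the proposed rule.
function protoEquals(a, b) {
  // A direct NaN on either side asks "is the other value NaN?"
  if (a === DIRECT_NAN || b === DIRECT_NAN) return isAnyNaN(a) && isAnyNaN(b);
  // Otherwise a computed NaN equals nothing, not even another computed NaN.
  if (a === COMPUTED_NAN || b === COMPUTED_NAN) return false;
  return a === b;
}

const protoIndexOf = (array, target) =>
  array.findIndex((item) => protoEquals(item, target));

const array = [1, 2, 3, protoDivide(0, 0)];
const anyNaNIndex = protoIndexOf(array, DIRECT_NAN);         // "any NaN"
const thisNaNIndex = protoIndexOf(array, protoDivide(0, 0)); // "this NaN"
console.log(anyNaNIndex, thisNaNIndex); // 3 -1
```

Using a single shared marker for every computed NaN is enough here, because under the rule no computed NaN ever compares equal to another one.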

This should allow a lot of flexibility, and I think in the vast majority of cases would just work, reducing bugs.

In order to make generating a computed NaN easy for use in returning computed NaN values from functions, any arithmetic operation on a direct NaN should result in a computed NaN.

function foo(a, b) {
    // This operation is not defined when a=5 and b=6
    if (a == 5 && b == 6)
        // Computed NaN should be returned, not direct NaN
        return NaN + 0;
    return bar(a * 2, b / 5);
}