tc39 / proposal-record-tuple

ECMAScript proposal for the Record and Tuple value types. | Stage 2: it will change!
https://tc39.es/proposal-record-tuple/

Equality semantics for `-0` and `NaN` #65

Closed bakkot closed 2 years ago

bakkot commented 5 years ago

What should each of the following evaluate to?

#[+0] == #[-0];

#[+0] === #[-0];

Object.is(#[+0], #[-0]);

#[NaN] == #[NaN];

#[NaN] === #[NaN];

Object.is(#[NaN], #[NaN]);

(For context, this is non-obvious because +0 === -0 is true, Object.is(+0, -0) is false, NaN === NaN is false, and Object.is(NaN, NaN) is true.)
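The baseline semantics bakkot describes can be checked directly in any engine today (the record/tuple cases, of course, cannot be run yet):

```javascript
// Current equality semantics for -0 and NaN in JavaScript:
console.log(+0 === -0);           // true:  === treats the two zeros as equal
console.log(Object.is(+0, -0));   // false: Object.is distinguishes them
console.log(NaN === NaN);         // false: === is not reflexive for NaN
console.log(Object.is(NaN, NaN)); // true:  Object.is is reflexive
```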

Personally I lean towards the -0 cases all being false and the NaN cases all being true, so that the unusual equality semantics of -0 and NaN do not propagate to the new kinds of objects being introduced by this proposal.

bakkot commented 4 years ago

they just might be inside the tuples you're comparing

So... there are more than three values, because those checks return true instead of false when x or y are values other than 0, -0, or NaN. Right?

> If it helps, a lot of people don't think of these structures as values unto themselves, but rather "groups of values"

The whole point is that we are reifying "groups of values" into values. A record is a value; that is the whole point of it.

devsnek commented 4 years ago

@bakkot sorry to clarify: if you think of a proxy, it is its own value, but it derives its existence from something else. when you're talking about a proxy you're normally talking about its handlers or its target, not the proxy instance itself. in a similar way tuples, at least in other programming languages, are less about the tuple instance itself and more about what it contains. i think it's unintuitive for a lot of people to even think of a tuple as having behaviour.

papb commented 4 years ago

> I want to be unambiguous about this concept of "the same value": If two things are reliably distinguishable at all (e.g., through Object.is), I don't see them as the same "whole value".

@littledan I am still confused :sweat_smile: we haven't decided how Object.is will behave, so this seems like circular reasoning to me.

papb commented 4 years ago

> If it helps, a lot of people don't think of these structures as values unto themselves, but rather "groups of values"

@devsnek I think this thought is dangerous, and can lead to lots of confusion. A value should be nothing more and nothing less than something that can be put in a variable. Just as an object is a value, a record must be a value too. That thought (apparently held by a lot of people) is unhelpful, I think.

papb commented 4 years ago

> In any case, I think adding an infinite set of values for which === and Object.is disagree, where currently there are exactly three such values, would complicate the operator considerably.

@bakkot What do you mean by "complicate the operator considerably"?

As you pointed out, with my suggestion, indeed x !== x would no longer imply Object.is(x, NaN). Instead, it would imply that x is either NaN or a record/tuple containing some (possibly deeply-nested) NaN.

At first this seemed like a bad idea to me, but after giving it a chance and thinking about it more carefully, I don't think it is bad. In fact, I now think it is the correct behavior. Just as IEEE 754 had good reasons to define NaN !== NaN, if you consider the example of a record representing a coordinate, the same good reasons make a case for #{ x: 10, y: NaN } !== #{ x: 10, y: NaN }.

littledan commented 4 years ago

@papb There's only one plausible definition of Object.is for Records and Tuples--it recursively compares them for Object.is. That's not the topic of this issue.

papb commented 4 years ago

I would also like to rephrase another great point raised by @jridgewell that also favors my reasoning:

If we allow #[NaN] === #[NaN] to be true, then it will be possible to have two variables x and y which satisfy x === y but don't satisfy x[0] === y[0]. To me, this alone is so crazy that I don't even need my own previous arguments :sweat_smile: I am convinced by this one alone.

papb commented 4 years ago

> @papb There's only one plausible definition of Object.is for Records and Tuples--it recursively compares them for Object.is. That's not the topic of this issue.

@littledan Ah, great!! Sorry about that then. I'm happy to hear that. I thought it was also the topic of this issue because Object.is(#[+0], #[-0]) and Object.is(#[NaN], #[NaN]) are mentioned in the first post.

bakkot commented 4 years ago

Object.is(#[+0], #[-0]) is open because it's plausible that we could normalize -0 to 0, as we do for Set and Map. Object.is(#[NaN], #[NaN]) is just there for completeness.
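The Set and Map precedent bakkot mentions is observable today: per the spec, Set.prototype.add and Map.prototype.set normalize a -0 key to +0 before storing it, so the sign is already gone when you read the element back out:

```javascript
const s = new Set([-0]);
console.log(s.has(0));                       // true: SameValueZero lookup
console.log(Object.is([...s][0], 0));        // true: the stored value was normalized to +0

const m = new Map([[-0, "zero"]]);
console.log(m.get(0));                       // "zero"
console.log(Object.is([...m.keys()][0], 0)); // true: the key was normalized too
```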

littledan commented 4 years ago

Ah, right, but I guess the open question there is not the definition of Object.is but rather the definition of #[-0] itself.

devsnek commented 4 years ago

@papb yeah, saying it is not like a value was imprecise and hopefully my clarification helped.

papb commented 4 years ago

@bakkot @littledan Ah, ok. About #[-0], my opinion is that it shouldn't be normalized, based on the issue raised by @Zarel in this comment. Regarding precedent from Map and Set: I think Map and Set are distinct enough from Records and Tuples to make Zarel's argument have more weight.

ljharb commented 4 years ago

@papb

const x = { get '0'() { return Math.random(); } };
const y = x;
console.log(x === y, x[0] === y[0])

that said, non-idempotent getters themselves are pretty strange :-) but it is already possible.

jridgewell commented 4 years ago

> that said, non-idempotent getters themselves are pretty strange :-) but it is already possible.

For values of non-configurable, non-writable data descriptors, this will be unique.

ljharb commented 4 years ago

of course, there's also:

const x = Object.freeze({ '0': NaN });
const y = x;
console.log(x === y, x[0] === y[0]) // true false, since NaN !== NaN

:-p

bakkot commented 4 years ago

@papb As @ljharb points out,

> it will be possible to have two variables x and y which satisfy x === y but don't satisfy x[0] === y[0].

is already true - just let x be an array whose first element is NaN (and y be x). That this would still be true when x is a tuple rather than an array doesn't (I would think) add any new surprises to the language, whereas adding an infinite set of new values for which === and Object.is differed would, in my estimation.

devsnek commented 4 years ago

Maybe we can add Object.strictEquals so we can still compare the structures correctly even if === is broken.

littledan commented 4 years ago

@devsnek Leaving aside how we should define ===, if this were standard library functionality, I don't really understand why element-wise === is any more common/important of an operation than element-wise < or + (which certainly have their own use cases).

devsnek commented 4 years ago

@littledan I'm confused, are you saying we shouldn't let === work on these structures at all?

littledan commented 4 years ago

@devsnek Sorry for being unclear; I'm still pushing for === having Object.is semantics on Records and Tuples. I mean, I don't see why element-wise IEEE754-style equality comparison on Records and Tuples is more common/important than element-wise < or +.

devsnek commented 4 years ago

@littledan because the === operator already does that kind of comparison on numbers. if we had an operator for Object.is semantics (maybe adding that could solve this dispute) it would be fine for it to use Object.is semantics on the elements.

devsnek commented 4 years ago

As for all the other various operators, maybe they're worth discussing, but this issue is about what === should do so I don't see much point in debating the other operators.

bakkot commented 4 years ago

A couple of data points:

In Java's value ("inline") types proposal (Project Valhalla), two value types containing NaN are considered to be == equal (as described here, search for "equality" or "the legacy behavior of NaN"). And in their Records proposal, I believe .equals will use Double.compare for double fields, which means two records with fields holding NaN are considered to be .equals.

In Python the built-in data structures are allowed to assume their elements are reflexive, which means

>>> n = float("nan")
>>> (0, n) == (0, n)
True

(though (0, float("nan")) == (0, float("nan")) may be False).

devsnek commented 4 years ago

why does factoring nan into a variable change how == works

bakkot commented 4 years ago

> why does factoring nan into a variable change how == works

It doesn't; n == n is False in my above example.

It is specifically when comparing tuples containing NaN that you can get True, exactly as is being discussed here.

devsnek commented 4 years ago

you said that both n = float('nan'); (0, n) == (0, n) and (0, float('nan')) != (0, float('nan')).

bakkot commented 4 years ago

float('nan') (typically) gives you a "fresh" NaN, such that float("nan") is float("nan") is (typically) False. But n = float('nan'); n is n is always True.

devsnek commented 4 years ago

got it. it seems that python also treats -0.0 and 0.0 as equal, but can tell them apart using the is operator.

erights commented 4 years ago

> it is against IEEE 754 to do so, and some people will argue that preventing that equality can halt forward progress as NaN intends

If NaN === NaN produced a NaB (not a boolean) or threw an error, or even went into an infinite loop, that would prevent progress. Instead it returns false. If NaN conceptually means "we don't/can't know what number this is supposed to be" then returning false is just as bad as returning true. In fact it is still worse because even for such an unknown, we'd know the reflexive case x === x would be true even if we don't/can't know what x is.

Obviously we're not going to fix IEEE. But I do not accept that breaking reflexive equality was a good idea, even just within the domain of arithmetic.

devsnek commented 4 years ago

@erights yeah i don't care about nan as much, as it doesn't really have bearing on the correctness of successful numeric operations. my argument there was more about staying consistent with the operator being used.

erights commented 4 years ago

At https://github.com/tc39/proposal-record-tuple/issues/65#issuecomment-633751364 @Zarel offered this example:

const coord = #{x: 1, y: -3};
const coord2 = #{x: coord.x / 1e350, y: coord.y / 1e350};

const isBelowOrigin = coord2.y < 0 || Object.is(coord2.y, -0);

I don't understand what this is supposed to be an example of. It resorts to Object.is to reveal whether coord2.y is -0. But Object.is is not an arithmetic operator. It does not exist, for example, in IEEE. Is there an arithmetic, numerical example of the utility of -0? I'm sure there must be. But the only within-IEEE observable consequence I know of the difference is that, for example, 1/0 === Infinity and 1/-0 === -Infinity. If you've already fallen off the precision limits so far as to reach an infinity, especially if it is because you divided by some zero, you've probably already lost anyway and need to rewrite your algorithm.

What within-IEEE-arithmetic examples are there of useful calculations without infinities that go awry if a -0 were normalized to a 0?

erights commented 4 years ago

> What within-IEEE-arithmetic examples are there of useful calculations without infinities that go awry if a -0 were normalized to a 0?

I mean any -0. The example need not have anything to do with records and tuples. I am trying to understand what is it about numeric computations that motivates the two zeros, and what is lost if the distinction were collapsed.

Again, obviously, we are not going to fix IEEE. But I don't understand this and would like to before continuing the records and tuples debate.

Zarel commented 4 years ago

> Is there an arithmetic, numerical example of the utility of -0?

As far as I'm aware, -0 and 0 only differ in two ways:

  1. Object.is can tell them apart.
  2. 1/0 is Infinity, while 1/-0 is -Infinity (and so on for other numerators).

The latter is arithmetic and numerical. You might think about it in the context of:

const otherFrame = #{
  time: -1 / 1e350,
  distance: -1,
};
const movedForwards = otherFrame.distance / otherFrame.time > 0;

movedForwards should be true, but if you normalized -0 to 0, movedForwards would be false.
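Zarel's record example can be reproduced with plain numbers today; the only assumption is that the literal 1e350 overflows to Infinity, so the division underflows to -0:

```javascript
const time = -1 / 1e350;          // 1e350 overflows to Infinity, so this is -0
console.log(Object.is(time, -0)); // true
console.log(-1 / time > 0);       // true: -1 / -0 is Infinity, i.e. "moved forwards"
console.log(-1 / 0 > 0);          // false: with a normalized +0 the sign flips
```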

devsnek commented 4 years ago

> I am trying to understand what is it about numeric computations that motivates the two zeros, and what is lost if the distinction were collapsed.

Very small negative numbers underflow to -0 because if they underflowed to +0 they would have the wrong sign when you tried to extrapolate further calculations or decisions from them. Additionally, multiplying 0 by a negative number results in -0. I'm not sure why that is, but it is common enough (see above examples) that it's definitely a footgun if people have to explicitly watch out for it.
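Both ways of producing -0 that devsnek describes are easy to hit with ordinary-looking arithmetic:

```javascript
// Underflow: a product too small in magnitude for the smallest subnormal
// rounds to zero, keeping its sign.
console.log(Object.is(-1e-200 * 1e-200, -0)); // true

// Multiplying zero by a negative number also yields -0 (IEEE 754 sign rules).
console.log(Object.is(0 * -5, -0));           // true
console.log(Object.is(-0 * 5, -0));           // true
```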

erights commented 4 years ago

@devsnek do you agree with @Zarel 's assessment that it can only make an arithmetic difference if things explode to infinity? That, infinities aside, there cannot be any arithmetic difference? If not, could you show an arithmetic example not involving infinities?

devsnek commented 4 years ago

@erights yes, that sounds right. it's worth noting, though, that you might not be explicitly using infinity or dividing by zero, but something could have overflowed.

erights commented 4 years ago

I gotta say, if the only cost on the arithmetic side is on the sign of infinities, I'm leaning ever farther towards normalizing -0 to 0 in records and tuples.

How important are the signs of infinities to actual useful numeric algorithms in practice?

devsnek commented 4 years ago

I can't imagine it comes up that often but I think js is enough of a general purpose language that we can assume someone will run into that.

littledan commented 4 years ago

I'm not yet sold on normalizing -0 to 0. I would prefer if we could let all primitives be represented in Records and Tuples unscathed. Otherwise I worry this becomes yet another case for people to think about and consider whether it affects them. Users can always choose to normalize -0 to 0 themselves when constructing a Record or Tuple if that is what they want.
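littledan's "users can normalize themselves" option needs only a one-line helper; the name here is hypothetical:

```javascript
// Hypothetical helper: collapse -0 to +0 before placing a value in a
// record or tuple, leaving every other value untouched.
const normalizeZero = (v) => Object.is(v, -0) ? 0 : v;

console.log(Object.is(normalizeZero(-0), 0));  // true: -0 is collapsed to +0
console.log(Object.is(normalizeZero(0), 0));   // true: +0 passes through
console.log(Number.isNaN(normalizeZero(NaN))); // true: other values untouched
```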

kfdf commented 4 years ago

As a novice javascripter who has stumbled upon this thread while being on a lookout for "how do they do this thing in javascript?" my initial thought was that any case where #[x] === #[x] is inconsistent with x === x would be a massive gotcha. But thinking of it, you can look at it from another perspective: it is not that Records and Tuples introduce more inconsistencies to the language, it's just that -0 and NaN have another place to cause mischief, like they do wherever they show up... equal here, not equal there, and even flipping signs at times... One problem I see if === works as Object.is is that even if you don't do any "fancy" calculations (and I think the absolute majority of people don't) -0 can still easily show up and break things. The principle of least surprise for the absolute majority suggests normalizing -0 to 0. And those few who do expect things like -Infinity to show up in their code have to know what they are doing anyway.

littledan commented 4 years ago

Why should -0 be normalized to 0 when placed in Records and Tuples, but not elsewhere in the language? I can understand how IEEE754 is confusing and not the best choice for JS's long-time only numeric type (that's part of why I worked on BigInt and now Decimal), but I don't understand why "putting a value in a Record or Tuple" is the place we should intervene and "fix" things, any more than we could've decided something like, if you assign a Number to a let or const-bound variable, then -0 is normalized to 0.

kfdf commented 4 years ago

It was "fixed" in Maps and Sets. Everywhere else it "just works" thanks to how === works. So it's either recursive === or, if that precludes some important optimizations, normalizing would be the second-best choice. That would be consistent with the expectations of a layman coder; otherwise, it is possible to get something like this:

function flipIfNegative(p) {
    return #{ 
        row: p.row > 0 ? p.row : -p.row, 
        col: p.col > 0 ? p.col : -p.col 
    };
}
flipIfNegative(#{ row: 5, col: 0 }) === #{ row: 5, col: 0 } // false, what the...?

The code is sloppy, but doesn't deserve to become bugged. If -0 appeared in "exotic" scenarios only, it wouldn't be much of a problem, but it can appear even if you treat numbers as integers.
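The bug kfdf describes does not need records to reproduce; the negation branch of that ternary turns col: 0 into -0 in plain JS today:

```javascript
const p = { row: 5, col: 0 };
const flipped = {
  row: p.row > 0 ? p.row : -p.row,
  col: p.col > 0 ? p.col : -p.col, // 0 is not > 0, so this evaluates -0
};
console.log(Object.is(flipped.col, -0)); // true: the "sloppy" branch produced -0
```

Under === semantics nobody notices, because -0 === 0; under Object.is semantics for record equality, the two records would compare unequal.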

papb commented 4 years ago

I don't think Map and Set deserve to be considered precedent for any decision here. The keys of Maps and elements of Sets have immense usage differences from a direct value in a structure. If you have foo = #{ x: -0 }, then you can directly ask yourself about foo.x. It should be -0. That's what it is!! I really don't like this normalizing idea. In Map and Set it is completely different, one does not simply access one key / element directly. They all become part of a larger abstract representation. It's very different.

papb commented 4 years ago

Thank you @Zarel for creating yet another example of how normalizing is bad!! @icefoxen also did a good job explaining/defending some of these reasons.

I would like to reiterate: IEEE754 has good reasons to do what they did. And even if you happen to disagree with that, """"fixing"""" it in only a specific part of the language will create a huge mess.

Imagine people refactoring all their objects into Records: if -0 is normalized to 0, then they won't, because they simply can't do so without changing behavior.

papb commented 4 years ago

> The code is sloppy, but doesn't deserve to become bugged.

@kfdf I disagree. Although this code not being bugged would be good on its own, that would go against many other issues I and others have raised. And since sloppy code is much easier to fix and to understand than surprising equality behavior, I believe your example code being bugged is a completely acceptable "price to pay".

papb commented 4 years ago

@erights Please take a look at this question on Software Engineering Stack Exchange: https://softwareengineering.stackexchange.com/questions/280648/why-is-negative-zero-important (in particular, the accepted answer)

kfdf commented 4 years ago

> I don't think Map and Set deserve to be considered precedent for any decision here

It's not just them: what would be the most straightforward way to imitate Records and Tuples right now? String concatenation, or even JSON.stringify. But -0 doesn't survive either: Object.is(0, Number(String(-0))) and Object.is(0, JSON.parse(JSON.stringify(-0))) are both true. So -0 normalization is nothing unusual, just perhaps not very explicit.
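kfdf's round-trip observation is directly checkable: both string conversion and JSON serialization drop the sign of -0:

```javascript
console.log(String(-0));                                    // "0": the sign is dropped
console.log(Object.is(Number(String(-0)), 0));              // true
console.log(Object.is(JSON.parse(JSON.stringify(-0)), 0));  // true
console.log(Object.is(JSON.parse(JSON.stringify(-0)), -0)); // false: -0 did not survive
```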

rickbutton commented 4 years ago

> I gotta say, if the only cost on the arithmetic side is on the sign of infinities, I'm leaning ever farther towards normalizing -0 to 0 in records and tuples.

@erights I don't think I'm in favor of normalizing -0 to 0. If the choice is between "Record and Tuple can store all primitives except for -0" and "Records and Tuples containing -0 and 0 are not equal", I would choose the latter.

papb commented 4 years ago

> It's not just them, what would be the most straightforward way to imitate Records and Tuples right now? String concatenation, or even JSON.stringify. But -0 doesn't survive it: Object.is(0, Number(String(-0))); Object.is(0, JSON.parse(JSON.stringify(-0))); So -0 normalization is nothing unusual, perhaps not very explicit.

@kfdf I do not follow your reasoning.

> [...] what would be the most straightforward way to imitate Records and Tuples right now?

Objects and arrays?

papb commented 4 years ago

By the way, my opinion so far has been built on the grounds of what seems less surprising, more consistent, and more useful. However, I have not considered the optimization aspect that is mentioned in the readme overview:

> Additionally, a core goal of this proposal is to give JavaScript engines the capability to implement the same kinds of optimizations for this feature as libraries for functional data structures.

Does the decision on equality semantics for -0 and NaN impact this aspect of the proposal? If yes, how? Where can I learn more about what optimization aspects could be affected?