tc39 / proposal-bigint

Arbitrary precision integers in JavaScript
https://tc39.github.io/proposal-bigint

Regarding "Don't break user intuition" #30

Closed gweax closed 7 years ago

gweax commented 7 years ago

The proposal states that "When a messy situation comes up, this proposal errs on the side of throwing an exception rather than silently giving a bad answer."

To me this is exactly the opposite of what my intuition says. The forgiving nature of JavaScript is to me an integral part of the language. Throwing an error seems so Java-ish. Especially when using the + operator. Although I understand the rationale behind it, I think the plus operator should never throw an error, because that's the way it currently is. Doing otherwise would break user intuition.

Axel Rauschmayer's proposal is in my opinion more consistent with regard to user intuition.

bakkot commented 7 years ago

The problem is, what type should e.g. 9007199254740993n + 0.1 be? (9007199254740993 is 2^53 + 1, for reference, which is not representable as a JS number.) Either choice loses precision.
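To make the dilemma concrete, here is a minimal sketch of the two explicit conversions. It assumes an engine with BigInt as it was later standardized, where the mixed expression itself throws a TypeError; the variable names are illustrative only.

```javascript
// 2^53 + 1: exactly representable as a BigInt, but not as a Number.
const big = 9007199254740993n;

// Option A: convert the BigInt to a Number. The trailing "+ 1" is rounded
// away before the addition even happens, and 0.1 is lost as well.
const viaNumber = Number(big) + 0.1;

// Option B: convert the Number to a BigInt. BigInt(0.1) would throw a
// RangeError, so the fraction has to be discarded explicitly first.
const viaBigInt = big + BigInt(Math.trunc(0.1));

console.log(viaNumber); // 9007199254740992
console.log(viaBigInt); // 9007199254740993n
```

Neither result equals the mathematical value 9007199254740993.1, which is the point being made.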

That said, I'd maybe support implicit conversion to Integer when one operand is an Integer and the other is a number ≤ 2**53 with no decimal part (throwing when there is a decimal part). I'm not sure if this sort of conditional throwing is a good idea from an avoiding-bugs or implementation perspective, though.

(@rauschma's proposal looks like it changes the semantics of existing JS math, e.g. 2**53 + 1 === 2**53 would no longer hold, which I'm almost certain we can't do.)

gweax commented 7 years ago

I do understand the dilemma here. However, I'd rather lose precision than throw an error. Losing precision is something we already deal with under IEEE 754, so it doesn't seem bad to me.

@rauschma's proposal is worth discussing, but this is not the place :-)

bakkot commented 7 years ago

However, I'd rather lose precision than throw an error.

In which direction, though? If there were a single obvious choice I'd agree, but I'm not convinced there is.

I think in cases where there's two highly conflicting intuitions for the semantics of some code, and it's possible to just make that code an error and instead require more explicit expressions of intent, it's better to make it an error. (For example, this is why -1**2 is a syntax error.)
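The -1**2 case can be checked directly. The sketch below uses a hypothetical helper, `parses`, that compiles a function body without running it; it assumes an ES2016-or-later engine.

```javascript
// Returns true if `source` parses as a function body, false on SyntaxError.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

console.log(parses("return -1 ** 2;"));   // false: rejected by the grammar
console.log(parses("return (-1) ** 2;")); // true: intent is explicit
console.log(parses("return -(1 ** 2);")); // true: intent is explicit
```

Both parenthesized forms are accepted, so the author has to state which of the two conflicting readings is meant.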

littledan commented 7 years ago

@gweax If users are OK with losing precision, then they can stick with Number, and maybe pursuing this topic is a bad idea. But the idea here is that many users want to get greater precision, and this is a proposal towards that use case. Implicitly losing precision seems like the worst thing to do for these users, doesn't it?

gweax commented 7 years ago

Maybe this is a general issue of where JavaScript is heading to. It seems to me that currently steps are taken to change (or reduce) the nature of a loosely typed language. A part of that is implicitly coercing one type to another if the expected types don't match the actual ones.

Of course this leads to silly and sometimes inconsistent rules like "12" + 3 and "12" - 3, which is the source of much mockery of JavaScript. On the other hand, this fulfills the robustness principle (Postel's law) of "be liberal in what you accept from others".

Introducing a new data type that leads to runtime errors (in contrast to the syntax error of -1**2, which can be caught by linters) when used with an operator which until now did not throw an error (under no circumstances) feels like a bad idea. If this is a must towards a more sensible handling of loose types, then so be it. I'd much prefer a way that does not change the fundamental approach of JavaScript as I see it.

My reasoning lacks an alternative to how to cope with Integers and other types. This is because I'm not too much into the matter. If for the expected use case there is no other solution than to throw errors, well, then we have to go with it.

bakkot commented 7 years ago

used with an operator which until now did not throw an error (under no circumstances)

Pedantically:

Symbol() + 1; // throws
({ valueOf() { throw new Error('unsupported'); } }) + 1; // throws

So it's not a totally radical proposition, especially in es2015-land. ES2015 also tightens the type system in other ways, like making classes non-callable and generators non-constructible.

But it's true that this would be a step further in that direction, especially since Integers, unlike Symbols, do have a sensible interpretation when added to floats. (In fact they have two sensible interpretations, which conflict.) I think this is worth it: I think it will prevent far more bugs than it causes, without making it that much harder to write code.

ljharb commented 7 years ago

This would only prevent bugs when mathing numbers and "integers larger than 2**53" - how common do we think this is among users who know the difference and care about it?

bakkot commented 7 years ago

@ljharb, I can't tell what your "this" is referring to.

In any case, I suspect that most users of Integers will want accurate math when operating on Integers larger than 2**53 - otherwise they wouldn't use them.

ljharb commented 7 years ago

Right - I'm saying that very few users will be using integers that large and performing math ops with those integers and floats, and those rare users will be quite aware of the precision issues involved.

In other words, I'd be fine with only allowing integer conversion when it didn't drop precision; but lacking that, I think dropping precision is much better than throwing, since the extremely common case will not drop precision.

gweax commented 7 years ago

In fact they have two sensible interpretations, which conflict.

@bakkot This is also the case with other types and the plus operator. "12" + 3 has the sensible interpretation of "123" and 15. However, instead of throwing an error, it was chosen to prefer one interpretation over the other.

The more I think about it, the more I tend to say that Integer × <any other type> should convert the other type to an Integer.

Given, for example, an implementation of the Fibonacci numbers, a function written for "traditional" JavaScript could be something like this (with awful performance):

function fib(n) {
  if (n === 0 || n === 1) return n;
  return fib(n - 1) + fib(n - 2);
}

With Integer + Number converting the Number to an Integer, this function would work fine without any adaptation. Not so with Integer + Number throwing an error (as at the end the Number 0 and 1 get added).

Existing code could benefit from implicit coercion, and this is a strong argument against throwing.
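For comparison, here is a sketch of how the function behaves under throwing semantics, assuming an engine with BigInt as it was later standardized. `fibBig` is a hypothetical BigInt-only variant written for this illustration, not something taken from the proposal.

```javascript
// The function from above, unchanged.
function fib(n) {
  if (n === 0 || n === 1) return n;
  return fib(n - 1) + fib(n - 2);
}

// Hypothetical variant where every literal is a BigInt, so operands never mix.
function fibBig(n) {
  if (n === 0n || n === 1n) return n;
  return fibBig(n - 1n) + fibBig(n - 2n);
}

let mixedError = null;
try {
  fib(10n); // `10n - 1` mixes a BigInt with a Number
} catch (err) {
  mixedError = err;
}

console.log(fib(10));                         // 55
console.log(fibBig(10n));                     // 55n
console.log(mixedError instanceof TypeError); // true
```

In this sketch the unmodified function fails on its first subtraction rather than at the final addition, so the adaptation cost is every numeric literal in the function.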

kgryte commented 7 years ago

The more I think about it, the more I tend to say that Integer × <any other type> should convert the other type to an Integer.

@gweax What you propose is the exact opposite of what other environments do. For instance, in Python,

>>> x1 = int( 1 )
>>> type( x1 )
<type 'int'>
>>> x2 = 1.0
>>> type( x2 )
<type 'float'>
>>> x3 = x1 + x2
>>> x3
2.0
>>> type( x3 )
<type 'float'>

In Julia,

julia> x1 = 1
1

julia> typeof( x1 )
Int64

julia> x2 = 1.0
1.0

julia> typeof( x2 )
Float64

julia> x3 = x1 + x2
2.0

julia> typeof( x3 )
Float64

The same type promotion rules apply to Java, C, and Perl.

Were JavaScript to pursue the path you proposed, porting numeric computing code would require considerably more work to ensure type conversions were handled in accordance with existing expectations.

The above, however, is not an argument against the possibility of implicit conversion, but were any conversion to occur the path would be from integer to double, not the other way around. For some operations, losing precision can be acceptable (e.g., + and -), but, for other operators, certain situations require throwing (e.g., domain errors of the variety integer(2)**(-integer(2))).
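The domain-error case can be sketched in JavaScript as follows, assuming BigInt as it was later standardized. `tryPow` is a hypothetical helper that returns either the result or the name of the error thrown.

```javascript
// Returns base ** exponent, or the error name if the operation throws.
function tryPow(base, exponent) {
  try {
    return base ** exponent;
  } catch (err) {
    return err.name;
  }
}

console.log(tryPow(2n, 10n));  // 1024n
console.log(tryPow(2n, -2n));  // RangeError: the result 0.25 is not an integer
console.log(tryPow(2, -2));    // 0.25
console.log(Number(2n) ** -2); // 0.25 after an explicit conversion to double
```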

FWIW, in Julia,

julia> x = 9007199254740993
9007199254740993

julia> x + 1.0
9.007199254740992e15

gweax commented 7 years ago

This is very interesting for two reasons.

1) Porting implementations from other languages to JavaScript will likely be a major use case, when Integers are available. Pursuing the way of other languages therefore seems very reasonable.

2) As far as I understand this from your examples, these other languages don't throw, at least not for + and -. If JavaScript behaved differently, it would harm porting numeric code.

littledan commented 7 years ago

In C/C++, there are all these implicit casts in numeric operators and argument passing. However, they have been so error-prone in the past that many projects run with a compiler mode to reject programs which don't do explicit casts rather than depending on the implicit ones. I think many programs that people will be porting will already be uniform in this sense, and many of the ones that are not have bugs due to losing precision.

mihailik commented 7 years ago

Porting of C++/Java/Python code is cool, but it's 1000x more LOC of existing JavaScript that should be the priority.

Existing code should react to Integer inputs in predictable, smooth manner.

calculateCompoundReturn( 5n )

^^^^ If this works for 5, it should not crash for 5n. Otherwise you effectively designate Integer data type as a killer bomb to sink any existing complex code.

We have business logic, complicated rendering calculation, extensive well-tuned libraries like jQuery, d3, three.js and more. If integer throws, if it is a poisoned apple, maintainers of all those libraries will have a big trouble on their hands.
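A sketch of the scenario, assuming BigInt as it was later standardized. The thread only names `calculateCompoundReturn`; the body, the constants and the `describe` helper below are made up for illustration.

```javascript
// Hypothetical body: existing Number-based code with no knowledge of BigInt.
function calculateCompoundReturn(years) {
  const principal = 1000;
  const rate = 0.05;
  return principal * (1 + rate) ** years;
}

// Returns the result of fn(arg), or the error name if the call throws.
function describe(fn, arg) {
  try {
    return fn(arg);
  } catch (err) {
    return err.name;
  }
}

console.log(typeof describe(calculateCompoundReturn, 5)); // number
console.log(describe(calculateCompoundReturn, 5n));       // TypeError

// An explicit conversion at the call site restores the old behavior.
console.log(
  calculateCompoundReturn(Number(5n)) === calculateCompoundReturn(5)
); // true
```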

littledan commented 7 years ago

@mihailik How does this case differ from passing a Symbol into such libraries?

mihailik commented 7 years ago

Excellent point, and there is a good reason too.

It's highly improbable for Symbol to be misplaced into a string/number position. It's highly probable for Integer to be misplaced into a number position.

ljharb commented 7 years ago

It differs because conceptually, a Symbol isn't the same as a number, but conceptually, an integer is very much the same.

littledan commented 7 years ago

OK, I can see how users would want Integers to behave this way. I can also see how users would want Numbers to automatically overflow into Integers. The problem, though, is how and whether we can do it while preserving other important properties. I am having trouble thinking of a way, but maybe you two have more ideas. At some point, limitations of language design and backwards compatibility seep in and make the union of all user intuitions untenable. I tried to document the issues with mixed operands in the explainer, but maybe the story I told there was missing something.

ljharb commented 7 years ago

I think you documented it well, but I think the conclusion - that throwing when mathing between two number types is acceptable - is where we differ; I think that lacking a way, integers are untenable.

ConorOBrien-Foxx commented 7 years ago

Kind of an odd idea, but like we have "use strict";, we could have something like "use integer"; for the problems related to automatic promotion and backwards-compatibility.

MMeent commented 7 years ago

As JSON allows for 3 different notations (/-?\d+/, /-?\d+\.\d+/, and /-?\d+\.\d+e-?\d+/), can't we distinguish between those notations and use Number/Integer in the corresponding locations?
As in, use /-?\d+/ for integer (it is a strict integer), and the rest for Number as they are using a point notation?

I'd see that as a valid solution, and reversible when Numbers (even when isInteger() is true) use dot notation when serializing (e.g. JSON.stringify(1) === '1.0').

bakkot commented 7 years ago

@mihailik , @ljharb

If integer throws, if it is a poisoned apple, maintainers of all those libraries will have a big trouble on their hands.

Strongly disagree: Throwing in these cases is much preferable to silently having the wrong semantics.

Remember, for the error to be triggered, someone has to write the code which passes in the BigInteger. Even if the conceptual similarity between numbers and BigIntegers leads them to expect that to work, they will immediately find out that it doesn't. I don't understand why you expect it to be such an issue. (Though I guess it would be nice to hear thoughts from library maintainers too.)

This doesn't seem to be that much of an issue in other languages: In Go, for example, it's a compile error to multiply a big.Int and an int. I don't think throwing a type error as soon as possible is that much less user friendly.

Really I expect that BigIntegers will see two kinds of users: a.) people who actually have a use for them, who I do not expect to be surprised or confused by the throwiness, and b.) people who are just trying them out, who may be surprised by the throwiness but who I expect to then conclude that BigIntegers are not something they want to use. Which is the desired effect.

mihailik commented 7 years ago

If Integers are explicitly intended for niche usage, they should live in a library, and not surface as 1st class syntax. Compare with typed arrays.

rwaldron commented 7 years ago

@littledan @bakkot

Throwing in these cases is much preferable to silently having the wrong semantics.

Possibly uncommon first hand experience: when implementing hardware drivers in JS, implicit coercion (strictly in the Integer/Number operation sense) can have dangerous (as in physical harm) consequences. I'd prefer the program to throw over the alternative.

@mihailik

If Integers are explicitly intended for niche usage, they should live in a library, and not surface as 1st class syntax. Compare with typed arrays.

Are you inferring that from @bakkot's message here:

Really I expect that BigIntegers will see two kinds of users: a.) people who actually have a use for them, who I do not expect to be surprised or confused by the throwiness

I don't think the intention was to make it appear that BigInteger was for "niche usage", simply that they have a concrete role to fill.

mihailik commented 7 years ago

@rwaldron storing guids, storing stat bit flags (see the proposal) is quite a generic liberal usage.

Specifically, bit flags are very prone to be brittle with regard to pass/throw. They are used with a wide variety of operations: ==, ===, if, ?, >, &, |.
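As a rough sketch of how those operations split between passing and throwing, assuming BigInt as it was later standardized (`ERROR_FLAG` and `status` are illustrative names):

```javascript
const ERROR_FLAG = 4; // existing code: a Number constant
const status = 5n;    // new code: a BigInt value arrives

console.log(status == 5);             // true: loose equality compares values
console.log(status === 5);            // false: the types differ
console.log(status > 4);              // true: relational comparison may mix types
console.log(0n ? "truthy" : "falsy"); // falsy: 0n behaves like 0 in conditions

let bitwiseError = null;
try {
  status & ERROR_FLAG; // bitwise operators may not mix BigInt and Number
} catch (err) {
  bitwiseError = err;
}
console.log(bitwiseError instanceof TypeError); // true

// Converting the flag explicitly makes the test work again.
console.log((status & BigInt(ERROR_FLAG)) !== 0n); // true
```

So comparisons and conditions keep working across the two types, while the bitwise operators are where mixed flag code would throw.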

Producing a syntax that encourages wide generic usage but leads to brittleness would both impose heavy costs of broken compatibility and lead to the feature lying dormant next to 0-octals.

littledan commented 7 years ago

@ConorOBrien-Foxx About introducing a language mode: TC39 considered, in the ES6 cycle, restricting features to certain modes. However, @dherman made the case that we should restrict reference to modes, instead promoting a rallying cry of "1JS!", a single language which may be more complicated for implementers and spec writers, but a simpler mental model for users.

littledan commented 7 years ago

@mihailik I think 0-octals are a bit different--the really problematic thing about them is that if you include an 8 or a 9, it silently falls back to non-octal. I'd bucket this as a big "oops, that really never helped anyone in the first place, and it seems like an accident of history", together with divergence in line endings among platforms. By contrast, this is an explicit design decision, and it's in the direction of being more strict rather than more loosey-goosey as 0-octals are. By the way, I wish 0-octals were out of use--then we could remove the feature.

@MMeent I think we could do something like that if we were starting over with a new language. asm.js makes such a distinction, actually, but in a way such that it will only reject programs with the wrong/missing decimal, not actually change runtime semantics. Unfortunately, it's likely to change semantics of existing programs to decide whether or not to make something an Integer or Number. For example, in Python 2, 1/2 is 0 (rounding down, as it's an integer calculation), and 1.0/2 is 0.5. I'd expect plenty of websites to break if we made 1/2 start evaluating to 0.
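The difference is easy to see side by side. The sketch below assumes BigInt as it was later standardized; the constant names are illustrative.

```javascript
// Number division keeps the fraction.
const numberHalf = 1 / 2;

// BigInt division is integer division and discards the fraction.
const bigHalf = 1n / 2n;
const bigPositive = 7n / 2n;
const bigNegative = -7n / 2n; // truncates toward zero rather than flooring

console.log(numberHalf);  // 0.5
console.log(bigHalf);     // 0n
console.log(bigPositive); // 3n
console.log(bigNegative); // -3n
```

If an existing literal such as 1 silently became an integer type, expressions like 1/2 would move from the first behavior to the second.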


I just want to mention, all of these issues come up in exactly the same way for Int64 (whose use cases this proposal attempts to meet, but which has previously been proposed separately). If we think Integer is untenable due to them, that means we are shutting all users of JavaScript out of what's been a frequent user request from Node, embedded use cases, or authors of several libraries which need to do complex calculations.

rwaldron commented 7 years ago

@littledan I don't think integer is untenable at all. The proposed semantics are sound and may still be relaxed in the future if there is sufficient evidence that throwing is undesirable. (Although, I predict that will not be the case.)

mihailik commented 7 years ago

@littledan the current TC39 proposal suggests wide common usages of Integer: for storing large unique identifiers, and for storing bit flags.

These use cases are at the top of the README.

The idea that this proposal can only apply to code dealing with 2^53-sized numbers quite simply contradicts the spec's stated goals. It's not the browsers' maintainers we should worry about — it's about the massive amount of JS code out there.

Just as with zero-octals, you can easily slip on a single character == versus ===. And just as with zero-octals, it's a 1st class language syntax which is quite hard to roll back.

Again, it's not about how hard it is to implement the new feature, nor how easy it is to write the new fancy code. It's how bad it is for the existing code. And remember, you're not in writer/consumer world. TypeErrors are fired in the face of the end user, due to one value somewhere falling out of the previously tested range. Off by one and you're out (think bit flags issues I mentioned earlier).

littledan commented 7 years ago

Just for some context, I'm not making arguments here for implementation complexity. The implementation of Integer will be complex, no bones about it. All of the arguments I've been making here are about language consistency, users getting the right answers, and being open to future evolution.

Generally, as a design principle, I think when a user slips on a character and writes the wrong thing, the best case scenario would be for the language to throw an exception as often as possible (rather than intuit what might be the right answer) so that the error can be caught earlier in the development cycle. If we can throw an early error, that's the best. If we can throw a TypeError each time that code is reached, well, better than silently getting the wrong answer. I don't know how to solve the problem of users writing code that doesn't work--maybe systems like TypeScript can help detect these cases.

gweax commented 7 years ago

Generally, as a design principle, [...] the best case scenario would be for the language to throw an exception as often as possible [...]. If we can throw a TypeError each time that code is reached, well, better than silently getting the wrong answer.

I do agree. But that's not the way JavaScript was designed in the beginning. There seems to be a consensus that this was a design mistake, thus we have Douglas Crockford's "Good Parts", which can be condensed into one statement "Don't use implicit type coercion".

Throwing an exception is another way to deal with it - maybe following a reasonable design principle, but breaking the design principle of JavaScript's old days. Is there a general consensus that this is the way JavaScript should go?

littledan commented 7 years ago

@gweax I don't know if there's general consensus about anything in this area. Certainly the "don't use implicit type coercion" thesis has its detractors. But ES2015 has definitely been moving in this direction, as @bakkot pointed out, and throwing in these contexts was well-received at TC39.

Jamesernator commented 7 years ago

Another option would be to make it so that Numbers in the safe integer range get coerced to Integer when being added to an Integer and all other Number values throw an error so that loss of precision can never occur but operations still make sense:

e.g.

1n + 1 == 2n
1n * 2 == 2n
1n - 1000 == -999n
1n + 1.5 // Throws
1n + 1e16 // Throws
1n + NaN // Throws
1n + Infinity // Throws
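The rule suggested here can be approximated in user code. The following is only a sketch: `toBigIntStrict` and `addStrict` are hypothetical helpers standing in for the proposed behavior of the + operator, and BigInt as later standardized is assumed.

```javascript
// Hypothetical helper: converts a Number to a BigInt only when that is
// lossless (a safe integer), and throws otherwise.
function toBigIntStrict(value) {
  if (typeof value === "bigint") return value;
  if (typeof value === "number" && Number.isSafeInteger(value)) {
    return BigInt(value);
  }
  throw new TypeError(`Cannot convert ${String(value)} to a BigInt without loss`);
}

// Hypothetical stand-in for the proposed behavior of `+`.
function addStrict(a, b) {
  return toBigIntStrict(a) + toBigIntStrict(b);
}

console.log(addStrict(1n, 1));     // 2n
console.log(addStrict(1n, -1000)); // -999n

// Fractions, unsafe integers, NaN and Infinity are all rejected.
for (const bad of [1.5, 1e16, NaN, Infinity]) {
  try {
    addStrict(1n, bad);
  } catch (err) {
    console.log(`${bad}: ${err.name}`); // each iteration reports a TypeError
  }
}
```

Whether an operation throws then depends on the runtime value of the Number operand, not only on its type.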

ljharb commented 7 years ago

@Jamesernator that has been suggested here and some counterarguments are here and here

medikoo commented 7 years ago

Maybe this is a general issue of where JavaScript is heading to. It seems to me that currently steps are taken to change (or reduce) the nature of a loosely typed language.

And we can already spot some side effects of that: https://github.com/nodejs/node/issues/11637

If integers are handled the same way, it's likely we'll end up with tons of similar bugs (as someone already noted, it's way more probable to collide number and integer than symbol and string).

ljharb commented 7 years ago

@medikoo that's a side effect of the core code making the forever-false assumption that you can safely stringify a value - { toString: function() { throw 'nope' } } would have thrown for 20 years now. Making new builtins more constrained doesn't mean that code relying on broken assumptions is suddenly brittle - it's in fact always been brittle.

medikoo commented 7 years ago

@medikoo that's a side effect of the core code making the forever-false assumption that you can safely stringify a value

I agree it's not the best example, as indeed at such level (and in ES5+ env) we should never assume that value is coercible to string.

bakkot commented 7 years ago

@ljharb and in fact I suspect node is still broken for that case.

mihailik commented 7 years ago

It's nothing to do with WHERE js is going, it's all about how it chooses to travel.

Every household has knives aplenty. If you set your mind to it, you can injure yourself badly. The question is how to make it predictably very hard to cut yourself.

Removing the knife's handle is bad. You can argue the blade pre-existed. But the point stands: you KNOW people will open the drawer, pick the knife and bleed. Normal people, not nerdy compiler writers.

ljharb commented 7 years ago

Perhaps a better analogy than removing the handle would be, covering the blade in a piece of tissue paper (so that it's easy to not realize that picking it up by the wrapped blade could still cut you later), versus removing the tissue paper so that the only safe way to hold it is obviously the handle. Adding these exceptions means you're more likely to get cut, but only when you're doing the wrong thing - which is the time you're supposed to get cut, so you learn to stop picking up knives by the blade.

mihailik commented 7 years ago

People have been using numerical values in JS for years. No, decades. You know they will cut themselves with integers.

The INTENDED USE CASES for integer and number types overlap significantly. Not by mistake, BY DESIGN they're going to mix and inevitably blow up. Read up the proposal, it's a general purpose feature.

This proposal creates bugs, it definitely fixes no pre-existing bugs.

littledan commented 7 years ago

@mihailik What you're saying isn't unreasonable--maybe this proposal is fatally un-ergonomic, and there's just no good way to do it, and we therefore should never expose other numerical types in JavaScript.

I think this hazard is something to watch out for as this proposal advances. We need more data. What about this--we'll get this implemented in Babel, and then it will be possible to write real programs containing BigInts, with these possibly-unergonomic exception semantics. That way, we can see how bad things really are.

Would that be a good way to resolve the issue for you?

mihailik commented 7 years ago

@littledan two caveats here.

  1. Can it be handled by Babel? Especially the typeof and the throwing aspects?
  2. To make it representative, it would need to be tried on real apps/sites with a mixture of integer-aware/unaware dependencies.

If (1) can be done, maybe (2) can be handled with HTTP proxying and on-the-fly translation of existing websites? We can list possible bug-prone patterns, and detect/inject them alongside Babel/integer conversion into the JS code when it's proxied.

Besides, that approach and proxy/translation infrastructure can be used to assess risks of other syntactical breaking changes considered for JS.

littledan commented 7 years ago

  1. I thought this might be possible with a Babel transform, but I don't know much about how Babel works. I have to look into this more.
  2. What do you mean by this on-the-fly translation? Presumably existing websites will not be ever creating an Integer.

mihailik commented 7 years ago

  1. New integer-enhanced and old code will mix a lot in the wild, but rarely in an artificial test environment. The suggestion is to produce mixture scenarios on the fly off the real-life websites.

For example, we can replace whole libraries with integer-enhanced versions -- and see whether websites cope.

littledan commented 7 years ago

@mihailik Clearly it's possible to do such an on-the-fly translation into broken code. The idea of this proposal is to discourage people from using BigInts in places where they use Numbers, not to silently "integer-enhance" existing code. It's clear that what you're suggesting will lead to exceptions being thrown, but that doesn't necessarily mean that this proposal won't work. I think we need a slightly smarter method to test how bad this will be for users.

mihailik commented 7 years ago

@littledan actually, the proposal does not discourage that at all.

Consider the first code sample on the page:

function nthPrime(nth) {
  function isPrime(p) {

It's a convincing code sample for using BigInt in general-purpose integer maths. Sizeable subset of what today's JS deals with is integer maths, especially offset/length logic.

Or look at use cases explicitly encouraging overlapped use of BigInt/Number:

  • Reading certain machine registers, wire protocols
  • Protobufs or JSON documents that have GUIDs in them
  • Highly optimized cryptography calculations
  • stat may give some data as 64-bit integers
  • Accurate timestamps

Apart from grey areas of machine registers, GUIDs etc. -- the use-case of stat is a distinct clear overlap. Bit flags are used everywhere, consider things like if (msg.status&ERROR_FLAG). Migrating such code to BigInts, you're bound to leave a couple of odd rarely-visited if/else/switch branches silently "broken".

littledan commented 7 years ago

@mihailik Untested code in JavaScript is problematic for all sorts of reasons. For example, you may read from a missing variable, and you'll only get the ReferenceError at runtime, not compile-time. For this, there are solutions like TypeScript, which presumably would be capable of finding errors in code using BigInts as well.

littledan commented 7 years ago

#36 seemed like our best shot at meeting the user intuitions expressed here, but as explained in both of these threads, it doesn't seem like a tenable path.