tc39 / proposal-bigint

Arbitrary precision integers in JavaScript
https://tc39.github.io/proposal-bigint
561 stars 57 forks source link

Editorial: Rename T::remainder() T::modulo() #37

Closed msaboff closed 5 years ago

msaboff commented 7 years ago

Given that the operation is used for the % (modulo) operator, it seems like a good name.

Unfortunately, the current spec for % says that the operator yields the "remainder", but goes on to say that it isn't the same as IEEE 754 remainder but more Java remainder and C fmod.

Yaffle commented 7 years ago

@msaboff The spec uses modulo name internally for https://tc39.github.io/ecma262/#eqn-modulo - the remainder of the floored division. And I cannot find the name for "%" in the spec, except "mod-operator" in the URL.

ljharb commented 7 years ago

JS doesn't really have a modulus operator - % is "remainder". http://www.sitecrafting.com/blog/modulus-remainder/

I'd argue that the name in the actual spec should be changed to "remainder".

leobalter commented 7 years ago

I'm fine with remainder, this name is not exposed and we can rediscuss it in case we adopt a IEEE754 remainder as well in the future.

littledan commented 7 years ago

@ljharb As @msaboff mentions, there is specific spec text which explains why the word remainder is inappropriate. Does anyone have any other suggestions for the name of this operator besides modulo and remainder?

ljharb commented 7 years ago

I think "remainder" is a much closer name than "modulo", and the difference between rounding or truncating division is pretty obscure - actual JS devs tend to get more confused by calling it "mod", and I've never seen anyone misunderstand when it's called "remainder".

That said, I'm fine coming up with another name - I'm just not sure what it would be.

littledan commented 7 years ago

OK, the spec text is currently 'remainder' now. I'll leave it as is for the moment until we come up with the proper name. Note that the name is not observable from JavaScript, but if we add user-level operator overloading, we will want to expose some sort of name. (I kinda wish we could call it something like operator%).

verdy-p commented 7 years ago

I agree that the name "remainder" is intended to reproduce the behavior found in C, i.e. the remainder after a truncation of the division towards 0. But there are alot of applications that will prefer a mathematical interpretation in Modular arithmetic, so that any "modulo n" operation can only result in a NaN or a number in the range [0 n), where n is also not necessarily an integer for any arbitrary positive period, or in the range (n 0] if n is negative. When applied to defined and finite integer values only, a modular arithmetic "modulo n" can only return n distinct values, matching the expectation of mathematical modular artithmetic classes. The applications of modular arithmetics are many, including for scientific applications base on periodic signals (e.g. Fourier transforms and related computations that are often CPU intensive and should be implemented very efficiently without unexpected loss of precision or approximation errors with unpredictable bounds), or for other related operations on time periods (i.e. calendar calculation on arbitrary dates over a vast range, where it is important to not return a false guess such as a computed date falling on Wednesday instead of Tuesday, or incorrect determination of leap years bringing dates in March instead of February) So yes we also need a modular arithmetic behavior also for BigInt's, not just for Numbers. Given the curent specs the % operator will not return the expected remainder after a floored division, but the remainder of a truncated division, meaning that the result of % will be the same sign as the first parameter used as dividend, resulting in "unexpected" (2n+1) possible results for the same "BigInt % n" operation.

Note: it is not clear what is the typoe of operation performed with (BigInt % n) where n is not also a BigInt but is a Number (possibly floatting). This should probably throw a type exception if both operands are not BigInt's.

But in my opinion (BigInt%Number) should be possible with a promotion of the BigInt to Number, without loss of precision, given that the second Number operand fully determines the precision of the result: (1n % Number) should always return the same value as (1n % BigInt) but not the same type, the former returning 1 (if Number>=1) and the later returning 1n (if BigInt>=1n), or both returning 0 or 0n respectively (if the second Number or BigInt is 0 or 0n).

The result of (254+1) % (2n) is then predictable: it will be (0n) indepedantly of the limited precision of the computed (254+1) Number whose value is the same as (254). However the result of (2n54n+1n) % (2n) will be (1n).

Finally the specs do not clearly state what is the supported range for BigInt: is it just limited to 64 bits (when CPU's arythmetnic units or math coprocessors, or some GPUs frequently have support for 80-bit or 128-bit integers). If there's no limit set by the actual internal CPU register size,, as it will internally use an array and will possibly use parallelized operations to speed up some computations, there will also be some bounds on the allowed range due to memory allocation to support this datatype. This allocation, may occur in a fast pool of fixed sized buffers (even if a small part at end of buffers is left unused), for small sizes, or could use arbitrary sizes with dynamic buffers that will be garbage collected and reallocated more slowly.

But modular arithmetic is there exactly to produce bounded ranges in reasonnable limits, most of the time. And with limited sizes it should perform fast calculations. Then all kinds of precise mathematical calculation can be created with predictable results and reasonnable speed, and optimized for parallism if needed: we need modular arithmetics more than the initial spec which defines an operation that is complex to manage with changing signs of results and too many possible return values (nearly twice) for the same module, forcing applications to write sign tests everywhere or use a second modulo operation after adding the module to the first result, such as ((x%n)+n)%n when n is positive, most often a known constant value, or ((x%n)-n)%n when n is negative. This tends to be very errorprone in many operations.

If we don't have an operator for it (such as %%), the only alternative using fmod()-like functions will not work due to loss of precision with Numbers (IEEE 64-bit doubles, possibly only optimized internally for speed using integers in a vartype used by the JS engine as long as conversion to IEEE is not needed, jsut like the engine may use faster operations by not forcing IEEE roundings after each operation that does not need it).

As well, could it be possible to use a directive like "use strict", but for specifying the behavior of maths operations in a code block: IEEE rounding modes enforced or not, and modulo vs. remainder for %, or allowing/disallowing implicit promotion from BitInts to Numbers instead of TypeError beig thrown, or allowing conversion of Number to BigInt when there's no loss of precision and a suitable value range (this could improve speed without sacrificing the precision). This "use" clause would haev the effect of overriding/selecting the implementation of operators. Ideally we should also be able to write our own overrides in Javascript using "operator%(a,b){...}" functions or "operator% = ((a,b)=>{...})" assignments.

littledan commented 7 years ago

@verdy-p Let's leave the discussion here limited to the non-normative name of the internal operation, rather than a new operator which may have better semantics.

simonbuchan commented 5 years ago

@littledan the current spec text linked in your earlier comment seems to pretty clearly say that ecmascript % is definitely a remainder operator, even if potentially not the expected one. The functional difference seems to be that if x/y is negative then x mod y should be in the range [0, |y|), while x rem y should be (-|y|, 0]. This implies leaving the bigint terminology as "remainder" is good enough, and this issue could be closed?

verdy-p commented 5 years ago

if x/y is negative then x mod y should be in the range [0, y) Cannot be true if y is also negative.

The arithmetic modulo of x divided by y should always be in the range between 0 inclusive and y exclusive, and the same sign as the divisor y;

The C/C++/Java/Javascript/C#/J#/Lua/PHP... modulo ("%" operator) is different because the quotient is rounded up or down towards zero, and the associated remainder "%" is antisymetric, causing lots of caveats when performiung cyclic computations (this requires constant readjustment of the origin to avoid the barrier of zero which "breaks" the cycles). (this applies to the integer division with "/" when you round it using simple conversion of floats to integer: (int)(x/y) which rounds down or up towards zero depending on the sign of the quotient, i.e. the product of signs of its two operands). These are legacy operators implemented in languages based on the legacy type conversions in C (then reused in C++, Java, Javascript, and various programming languages, but causes many problems to implement modular artihmetic in Z/nZ: it seriously complicates the implementations as we always need to test the signs in formulas, or fix the results by first adding or substracting the period length depending on the sign of the remainder when it does not match the sign of the quotient, and then using a second "%" operation on the result). Many implemented algorithms have bugs when the dividend is negative or lower than some threshold. These operations are not enough general, but use a "simplification" which turns to generate these bugs (e.g. in many packages computing dates, or working with astronomy, or within various codecs, digital signal processing, or FFT implementations causing some unexpected out of range values)...

The two sets of operations whould be both available in all programming languages, not just the second legacy one.

I tend to prefer the first couple of operators which is consistant and gives consistant results and require much less tests in code, the second one producing bugs that are frequently harder to detect because the internal antimetric thresholds on constants changes when we work on formulas, notably when we want to use distributivity, associativity or commutativity, or want to distribute a large computation using "divide-and-conquer" strategies, which work only on more limited ranges of input values).

Le mar. 19 févr. 2019 à 02:56, Simon Buchan notifications@github.com a écrit :

@littledan the current spec text linked in your earlier comment seems to pretty clearly say that ecmascript % is definitely a remainder operator, even if potentially not the expected one. The functional difference seems to be that if x/y is negative then x mod y should be in the range [0, y), while x rem y should be (-|y|, 0]. This implies leaving the bigint terminology as "remainder" is good enough, and this issue could be closed?

simonbuchan commented 5 years ago

Cannot be true if y is also negative

Which is why I quickly edited it after posting to be absolute y. Can you please avoid repetitive, off topic rants? I agree that having both is useful, I don't think anyone really hates the idea, but this issue is for the spec text used to describe the existing semantics of %, and those semantics sure as heck aren't changing.

verdy-p commented 5 years ago

There was no such "off topic" content in what I posted, and no "rant" at all against any one or anything. You just consider that a reply to a post you made has something personal but it was not targeted at you personally.

I just describe a situation where already in this topic several unrelated things were mixed and an assumption (divisor y is positive) or missing precision in what you posted. You gave a formula using the absolute value of y, which meant that you already allowed the possibility of y being negative, meaning that the other part of your response was forgetting this case.

That's a frequent error found in much code which do no work as expected and return unexpected results.

That's why the terminology used in C specs is not clear and continues to cause confusion in most other languages that have used the legacy C definition to implement euclidian divisions. Frequently the behavior of negative values is unspecified, but many programmers forget that the integer remainder % from C is not the modulo operator used in maths literature except if both operands are positive integers and that it does not work the same if operands are other positive numbers (non integers). In general the validity ranges and supportés precision is forgotten and either the specs are incomplete or misread with incorrect and unchecked assumptions.

Things would be much less problematic if the eucliduan division and remainder behaved by default using the true arithmetic definition without the antisymetric behavior that breaks the cyclic requirement.

Anyway there are similar assumptions made by the conversion of floating points to integers, and misunderstanding of rounding modes, or simply a lack of predefinition for the mathematic operation. Or some programmers that don't want to use the floor() operation and use unchecked implicit conversions notably in C and C++, and then also expect other languages to do the same as what C is computing (including with a cyclic truncation of high bits for large values, in a value range frequently unspecified with native binary "int" types).

Le mer. 20 févr. 2019 à 20:11, Simon Buchan notifications@github.com a écrit :

Cannot be true if y is also negative

Which is why I quickly edited it after posting to be absolute y. Can you please avoid repetitive, off topic rants? I agree that having both is useful, I don't think anyone really hates the idea, but this issue is for the spec text used to describe the existing semantics of %, and those semantics sure as heck aren't changing.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tc39/proposal-bigint/issues/37#issuecomment-465714119, or mute the thread https://github.com/notifications/unsubscribe-auth/ABUqG6RHOSKupueZ7vrfpDZX9eOeA3Mpks5vPZ3GgaJpZM4Nf5AO .

simonbuchan commented 5 years ago

So you're asking to change the behavior of % in an issue about the spec text used to describe the current behaviour, but you're not off topic, and you are "[writing] at length in an angry, impassioned way" about how the current behaviour is wrong, which is not a rant. Got it.

Before you reply again, could you please read the original post and earlier replies again, and consider that you could just open a new issue for adding the behavior you are asking for, instead of derailing this one?

verdy-p commented 5 years ago

The original was exactly about the different two operations on modulo/remainders. What I replied was fully on topic. It was not a rant, and was also containing a clarification in your own message because you were confused by the behavior with negative operands. In fact only you were ranting (against me again), If it's too long for you, don't read it, but don't be surprised if someone said that you were confused, and don't rant if someone corrects things for you because your post was incoherent in its formulas, using two distinct definitions as if they were the same thing (they are NOT).

Unfortunately these are the sort of incoherences that are widely spreaded in lot of applications using incorrect assumptions, and not seeing clearly that the "%" operator is not cyclic (unsuitable for modular arithmetic which has LOTS of uses) but antisymetric, and causes unspecified bugs in applications not clearly checking the domain of validity. You may think that this is a "rant" for these softwares, but in fact this is not at all against the "T:remainder()" and "T::modulo()" proposal.

I still maintain that the true modular operator is needed, must have a fully linear behavior, independant of arbitrary choice of offsets, should not need fixing for negative operands. It is more general and the "%" operator behavior, inherited from C, is a mess that has caused too many bugs (including security bugs, buffer overruns, crashes, incoherent results, incorrect date computings, unreusable libraries, and lot of complexity trying to override the limitation which is often hidden deeply in libraries/APIs using this "%" operator.

Le jeu. 21 févr. 2019 à 21:04, Simon Buchan notifications@github.com a écrit :

So you're asking to change the behavior of % in an issue about the spec text used to describe the current behaviour, but you're not off topic, and you are "[writing] at length in an angry, impassioned way" about how the current behaviour is wrong, which is not a rant. Got it.

Before you reply again, could you please read the original post and earlier replies again, and consider that you could just open a new issue for adding the behavior you are asking for, instead of derailing this one?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tc39/proposal-bigint/issues/37#issuecomment-466146028, or mute the thread https://github.com/notifications/unsubscribe-auth/ABUqG4vPR3lf179cmS7e21YQR3pLj6wpks5vPvvGgaJpZM4Nf5AO .