utopiabound / grpn

GTK+ reverse polish notation calculator
GNU General Public License v2.0

modulo of a negative number is expected to be positive #3

Open getreu opened 10 years ago

getreu commented 10 years ago

clib computes: -12 % 5 = -2

mathematically right: -12 % 5 = 3

This rare special case is not really a mistake, but mathematically the result is expected to lie in [0..4], so the result should be 3.

Not sure whether this should be fixed upstream in clib, since the behavior has been there for decades and a change could break existing applications. Should it be fixed in grpn?
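For illustration, a minimal sketch of how the result could be mapped into [0..4] on the application side, leaving the C library alone. The helper name `mod_nonneg` and the `long` operand type are assumptions made for this example; they are not taken from the grpn source. The sketch also assumes a C99 or later compiler, where `%` follows division that truncates toward zero.

```c
#include <assert.h>

/* Hypothetical helper (not part of grpn or the C library):
 * returns the value in [0, b) that is congruent to a modulo b.
 * Precondition: b > 0. */
long mod_nonneg(long a, long b)
{
    assert(b > 0);
    long r = a % b;   /* C99 and later: r is zero or has the sign of a */
    if (r < 0)
        r += b;       /* shift a negative remainder into [0, b) */
    return r;
}
```

With this helper, `mod_nonneg(-12, 5)` yields 3, while the plain expression `-12 % 5` yields -2. Adding `b` only when the remainder is negative avoids the second `%` and the possible overflow of the common `((a % b) + b) % b` idiom.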

getreu commented 10 years ago

http://mathforum.org/library/drmath/view/52343.html

Date: 07/05/2001 at 16:17:04

From: Doctor Peterson

Subject: Re: Mod Function and Negative Numbers

Hi, Andre.

You're right that this relation is relevant to the question; below I'm going to include a more detailed answer.

But it only proves what MOD should do if we know how DIV is defined; that is, it is a statement of consistency between the mod function and the direction of integer truncation. You're assuming truncation toward zero, so that -340/60 gives -5. But in Excel, we see that the relation looks like this:

 A = ( -340 DIV 60 ) * 60 + ( -340 MOD 60 )

-340 = -6 * 60 + 20

This is perfectly consistent if their integer division truncates toward -infinity rather than toward zero, so that -340/60 is taken to be -6. And that's just what I said, using words rather than the formula:

The mod function is defined as the amount by which a number
exceeds the largest integer multiple of the divisor that is
not greater than that number. In this case, -340 lies between
-360 and -300, so -360 is the greatest multiple LESS than -340;
we subtract 60 * -6 = -360 from -340 and get 20.

So in fact I did refer to your rule.
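The verbal definition above can also be written as a formula. The floor notation is introduced here for illustration and does not appear in the original message; m is assumed to be positive.

```latex
a \bmod m = a - m \left\lfloor \frac{a}{m} \right\rfloor,
\qquad
-340 \bmod 60 = -340 - 60 \left\lfloor \frac{-340}{60} \right\rfloor
              = -340 - 60 \cdot (-6) = 20
```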

Here's a more complete answer:

Computer languages and libraries are notoriously inconsistent, or at least unmathematical, in their implementation of "mod" for negative numbers. There are several ways it can be interpreted in such cases, and the choice generally made is not what a mathematician would probably have made. The issue is what range of values the function should return. Mathematically, we define "modulo" not as a function, but as a relation: any two numbers a and b are congruent modulo m if (a - b) is a multiple of m. If we want to make a function of this, we have to choose which number b, of all those that are congruent to a, should be returned.

Properly, the modulus operator a mod b should be mathematically defined as the number in the range [0,b) that is congruent to a, as stated here:

http://mathworld.wolfram.com/ModulusCongruence.html   

In many computer languages (such as FORTRAN or Mathematica), the 
common residue of b (mod m) is written mod(b,m) (FORTRAN) or 
Mod[b,m] (Mathematica).

http://mathworld.wolfram.com/CommonResidue.html   

The value of b, where a=b (mod m), taken to be nonnegative and 
smaller than m.

Unfortunately, this statement about FORTRAN, and implicitly about the many languages that have inherited their mathematical libraries from FORTRAN, including C++, is not quite true where negative numbers are concerned.

The problem is that people tend to think of modulus as the same as remainder, and they expect the remainder of, say, -5 divided by 3 to be the same as the remainder of 5 divided by 3, namely 2, but negated, giving -2. We naturally tend to remove the sign, do the work, and put the sign back on, because that's how we divide. In other words, we expect to truncate toward zero for both positive and negative numbers, and have the remainder be what's left "on the outside," away from zero. More particularly, computers at least since the origin of FORTRAN have done integer division by "truncating toward zero," so that 5/2 = 2 and -5/2 = -2, and they keep their definition of "mod" or "%" consistent with this by requiring that

(a/b)*b + a%b = a

so that "%" is really defined as the remainder of integer division as defined in the language.
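The consistency requirement can be checked directly. The following is a small sketch, assuming a C99 or later compiler (where integer division truncates toward zero); the function name `div_rem_identity_holds` is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when (a/b)*b + a%b == a for the given operands.
 * Precondition: b != 0. Callers should also avoid the pair
 * (INT_MIN, -1), for which a/b overflows. */
bool div_rem_identity_holds(int a, int b)
{
    assert(b != 0);
    int q = a / b;   /* quotient, truncated toward zero */
    int r = a % b;   /* remainder, zero or with the sign of a */
    return q * b + r == a;
}
```

For example, with a = -5 and b = 2 the quotient is -2 and the remainder is -1, and (-2)*2 + (-1) gives back -5. With a = -340 and b = 60 the same rule gives a quotient of -5 and a remainder of -40, in contrast to the Excel result of -6 and 20 discussed earlier.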

Because FORTRAN defined division and MOD this way, computers have tended to follow this rule internally (in order to implement FORTRAN efficiently), and so other languages have perpetuated it as well. Some languages have been modified more recently to include the more mathematical model as an alternative; in fact, FORTRAN 90 added a new MODULO function that is defined so that the sign of MODULO(a,b) is that of b, whereas the sign of MOD(a,b) is that of a. This makes it match the mathematical usage, at least when b is positive.

Similarly, in Ada there are two different operators, "mod" (modulus) and "rem" (remainder). Here's an explanation with plenty of detail:

Ada '83 Language Reference Manual - U.S. Government
http://archive.adaic.com/standards/83lrm/html/lrm-04-05.html   

Integer division and remainder are defined by the relation 

    A = (A/B)*B + (A rem B) 

where (A rem B) has the sign of A and an absolute value less than 
the absolute value of B. Integer division satisfies the identity 

    (-A)/B = -(A/B) = A/(-B) 

The result of the modulus operation is such that (A mod B) has the 
sign of B and an absolute value less than the absolute value of B; 
in addition, for some integer value N, this result must satisfy 
the relation 

    A = B*N + (A mod B) 

...
For positive A and B, A/B is the quotient and A rem B is the 
remainder when A is divided by B. The following relations are 
satisfied by the rem operator: 

    A    rem (-B) =   A rem B
    (-A) rem   B  = -(A rem B) 

For any integer K, the following identity holds: 

    A  mod   B  =  (A + K*B) mod B 

The relations between integer division, remainder, and modulus are
illustrated by the following table: 

      A   B  A/B  A rem B  A mod B       A   B  A/B  A rem B  A mod B

     10   5    2        0        0     -10   5   -2        0        0
     11   5    2        1        1     -11   5   -2       -1        4
     12   5    2        2        2     -12   5   -2       -2        3
     13   5    2        3        3     -13   5   -2       -3        2
     14   5    2        4        4     -14   5   -2       -4        1

     10  -5   -2        0        0     -10  -5    2        0        0
     11  -5   -2        1       -4     -11  -5    2       -1       -1
     12  -5   -2        2       -3     -12  -5    2       -2       -2
     13  -5   -2        3       -2     -13  -5    2       -3       -3
     14  -5   -2        4       -1     -14  -5    2       -4       -4
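The two Ada operators in the table can be imitated in C. This is only a sketch, assuming a C99 or later compiler where `%` takes the sign of the dividend; the names `ada_rem` and `ada_mod` are hypothetical and chosen for this example.

```c
#include <assert.h>

/* Hypothetical counterpart of Ada's "rem": the result is zero or has
 * the sign of a. Under truncating division this is C's % operator.
 * Precondition: b != 0. */
int ada_rem(int a, int b)
{
    assert(b != 0);
    return a % b;
}

/* Hypothetical counterpart of Ada's "mod": the result is zero or has
 * the sign of b. Precondition: b != 0. */
int ada_mod(int a, int b)
{
    assert(b != 0);
    int r = a % b;
    /* If the remainder is nonzero and its sign differs from the sign
     * of b, move it by one multiple of b to the other side of zero. */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

Checked against the table: `ada_rem(-12, 5)` gives -2 and `ada_mod(-12, 5)` gives 3, while `ada_rem(12, -5)` gives 2 and `ada_mod(12, -5)` gives -3.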

So what's the conclusion? There are basically two models, reasonably distinguished in Ada terms as Remainder and Mod; the C++ "%" operator is really Remainder, not Mod, despite what it's often called. Actually, its behavior for negative numbers is not even defined officially; like many things in C, it's left to be processor-dependent because C does not define how a processor should handle integer division. Just by chance, all compilers I know truncate integers toward zero, and therefore treat "%" as remainder, following the precedent of FORTRAN. As C: A Reference Manual, by Harbison and Steele, says, "For maximum portability, programs should therefore avoid depending on the behavior of the remainder operator when applied to negative integral operands."