vishapoberon / compiler

vishap oberon compiler
http://oberon.vishap.am
GNU General Public License v3.0
186 stars 25 forks source link

MIN(LONGINT) written as number doesn't work on 64 bit #33

Open Oleg-N-Cher opened 8 years ago

Oleg-N-Cher commented 8 years ago
VAR l: LONGINT; BEGIN l := -9223372036854775808;

for LONGINT 64 bit gives "error 203 - number too large."

Problem is in OPS.Number. A number calculates as positive, and, after it, the sign changes. l := 9223372036854775807 works ok

WHILE i < n DO d := Ord(dig[i], FALSE); INC(i);
    IF intval <= (MAX(LONGINT) - d) DIV 10 THEN intval := intval*10 + d (* <=  positive *)
    ELSE err(203)
    END
END
Oleg-N-Cher commented 8 years ago

OPS.Number doesn't calculate negative integer numbers, only positive.

The only thing that came to mind - let OPS.Number always return a negative intval, and test it on the limit value when the sign changes.

dcwbrown commented 8 years ago

This issue exists for INTEGER and SHORTINT as well. It is also present in C/C++ compilers.

It arises because program source of the form '-9223372036854775808' is defined by the language spec not as single literal integer symbol, but rather as a monadic minus followed by a literal integer.

I came across this interesting statement in the Oakwood guidelines just last night:

5.10 Monadic ‘-’ It should be made clear in documentation supplied with compilers that monadic negation is an addition operator and has a lower precedence then the multiplication operator. For example the expression -5 MOD 3 is equivalent to -(5 MOD 3).

Which would have taken me by surprise if I was writing this code, though it does make sense.

Fortunately the workaround is both easy and one that developers in other languages are used to - either type 'MIN(LONGINT)', or type '-9223372036854775807-1'. Indeed the latter is what voc OPM.WriteInt does to represent constant values in C source code:

  PROCEDURE WriteInt* (i: LONGINT);
    VAR s: ARRAY 20 OF CHAR; i1, k: LONGINT;
  BEGIN
    IF (i = MinInt) OR (i = MinLInt) THEN
      (* abs(minint) is one more than maxint, causing problems representing the value as a minus sign
         followed by absoute value. Therefore represent as -maxint - 1. For INTEGER this avoids a
         compiler warning 'this decimal constant is unsigned only in ISO C90', for LONGINT it is the
         only way to represent MinLInt. *)
      Write("("); WriteInt(i+1); WriteString("-1)")
    ELSE i1 := ABS(i);
      s[0] := CHR(i1 MOD 10 + ORD("0")); i1 := i1 DIV 10; k := 1;
      WHILE i1 > 0 DO s[k] := CHR(i1 MOD 10 + ORD("0")); i1 := i1 DIV 10; INC(k) END ;
      IF i < 0 THEN s[k] := "-"; INC(k) END ;
      WHILE k > 0 DO  DEC(k); Write(s[k]) END
    END ;
  END WriteInt;
Oleg-N-Cher commented 8 years ago

OK, and why a modifier "L" not used for long numbers? I have some warning on MinGW gcc version 3.4.5

int main(int argc, char **argv)
{
    long long l;
    l = -9223372036854775807-1;
    l = 9223372036854775807;
    return 0;
}

TestLong.c: In function `main': TestLong.c:4: warning: integer constant is too large for "long" type TestLong.c:5: warning: integer constant is too large for "long" type

int main(int argc, char **argv)
{
    long long l;
    l = -9223372036854775807L-1;
    l = 9223372036854775807L;
    return 0;
}

The warnings too! No warnings only for numbers with LL:

int main(int argc, char **argv)
{
    long long l;
    l = -9223372036854775807LL-1;
    l = 9223372036854775807LL;
    return 0;
}

I know, GCC 3.4.5 is an older version, in newer this warning is not present. But must be standards for writing long numbers in C, we should follow.

To prevent warning, I made such modification ( by adding output "LL" ):

PROCEDURE WriteInt* (i: LONGINT);
    VAR s: ARRAY 20 OF CHAR; i1: LONGINT; k: INTEGER;
BEGIN
    IF (i = MinInt) OR (i = MinLInt) THEN Write("("); WriteInt(i+1); WriteString("-1)") (* requires special bootstrap for 64 bit *)
    ELSE i1 := ABS(i);
        s[0] := CHR(i1 MOD 10 + ORD("0")); i1 := i1 DIV 10; k := 1;
        WHILE i1 > 0 DO s[k] := CHR(i1 MOD 10 + ORD("0")); i1 := i1 DIV 10; INC(k) END ;
        IF i < 0 THEN s[k] := "-"; INC(k) END ;
        WHILE k > 0 DO  DEC(k); Write(SHORT(s[k])) END;
        IF (i < -2147483648) OR (i > 2147483647) THEN WriteString("LL") END
    END ;
END WriteInt;

But I'm not sure until the end, that should output "LL". Maybe just need to output "L", as "LL" is not supported by some C compilers? But the modifier "L" does not solve my problem to clear warning. I also confused by text of the warning. It says about type "long", but we have declared "long long".

dcwbrown commented 8 years ago

Because long is only 32 bits on LLP64 compilers. (I guess there are other scenarios, but the one that definitely affects us is building using the 64 bit windows conventions.)

I believe 'long long' is 64 bits on all modern compilers, so 'LL' may work, but I took a different approach where I have hit this issue - typecast to LONGINT:

l = (LONGINT)(9223372036854775807);

This has the advantage that it does the right thing whatever LONGINT is defined as.

Re data mdels, see also: http://www.unix.org/version2/whatsnew/lp64_wp.html

Oleg-N-Cher commented 8 years ago
long long l;
l = (long long)(9223372036854775807);

rises the same warning on GCC 3.4.5 (MinGW).

The modifier "LL" works, but it's true not for old compilers (as Turbo C). And not for some 8 bit compilers (as SDCC - Small Device C Compiler). ( I use Ofront for these too )

Well, it's time for me to update GCC and keep OPM.WriteInt as it was.

You can close this issue. Thank you.

Oleg-N-Cher commented 8 years ago

I made a mistake. suffix "LL" works on SDCC. And doesn't work in Turbo C 2.0

Oleg-N-Cher commented 8 years ago

the same with hex:

VAR l: LONGINT; BEGIN l := 0FFFFFFFFFFFFFFFFH; (* <= number too large *)

l := -9223372036854775807-1; (* <= difference too large *)

P.S. BlackBox compiler accepts these literals (a LONGINT hex writes with suffix "L" instead of "H"). And -9223372036854775808 too.

Oleg-N-Cher commented 8 years ago

Excuse me, I made a mistake again. BlackBox doesn't compile l := -9223372036854775808;

Oleg-N-Cher commented 8 years ago

David, max. number of hexadecimal digits is OPM.MaxHDig. Now this value = 8. 16 will be enough.

dcwbrown commented 8 years ago

Thanks. Since LONGINT is still 32 bits on 32 bit builds, I'll remove the constant altogether and make OPS.Number handle both 32 and 64 bit scenarios correctly.

Oleg-N-Cher commented 7 years ago

Dear David,

On MAX(LONGINT) value in OPS.Mod:

IF MAX(LONGINT) > 2147483647 THEN maxHdig := 16 ELSE maxHdig := 8 END;
IF n <= maxHdig THEN
    IF (n = maxHdig) & (dig[0] > "7") THEN (* prevent overflow *) intval := -1 END;

I find the bad idea that LONGINT size can't be 64 bits for 32-bit programs too. It's ugly.

But I understand what is hard work to implement LONGINT of such size that will be not equal to a pointer size. So in Ofront+ I try to type a LONGINT size can be arbitrarily set from 4 to 8 for each 64-bit system as well, as for 32 bit. But it is not yet implemented.

I believe strongly that you can simplify your task if maps all types to the size of SHORTINT 16 => INTEGER 32 => LONGINT 64. This is the very hard work, after which it will be possible to reap the benefits of ease in the future. And then it will be possible to translate modules, without wondering what version of Ofront may be used - for 32 or for 64 bits.

Let LONGINT be 32 bits for a user's module, but not for Ofront itself!

And if I right, you don't need to calculate maxHdig in runtime - Ofront has another good methods for checking overflows.

Now look. I haven't your modification of OPS, and I left OPM.MaxHDig = 16, as it was before. I translate this module:

MODULE TestLONGINT;
CONST Max = 0FFFFFFFFFFFFH;
VAR v: LONGINT;
BEGIN
  v := 0FFFFFFFFFFFFH;
END TestLONGINT.

When LONGINT 8 8, it translates OK. When LONGINT 4 4, we have two errors found: "number too large".

With your modification of OPS, you have the same.

What do you think?