zaskar9 / oberon-lang

An LLVM frontend for the Oberon programming language
MIT License
17 stars 2 forks source link

Crash triggered by accessing exported variable #44

Closed tenko closed 4 months ago

tenko commented 4 months ago

While porting some code and unittests from XDS I triggered a crash in the compiler and created a test module:

MODULE M;

CONST
    CCHAR* = 041X;
    CINT* = 100;
    CLINT* = 0FFFFFFFFFFFFFFFFH;
    CREAL* = 1.1;
    CREALMAX* = 1.7976931348623158E308;
    CBOOL* = TRUE;
    CSET* = {0, 31};
    CSTR* = "testing123";

VAR
    C* : CHAR;
    (* X* : BYTE; Not supported *)
    I* : INTEGER;
    LI* : LONGINT;
    R* : REAL;
    LR* : LONGREAL;
    B* : BOOLEAN;
    S* : SET;
    STR* : ARRAY 25 OF CHAR;

BEGIN
    C := CCHAR;
    I := CINT;
    LI := CLINT;
    R := CREAL;
    LR := CREALMAX;
    B := CBOOL;
    S := CSET;
    STR := CSTR
END M.

And a main module:

MODULE TestImport;
IMPORT M, Out;

VAR
    c : CHAR;
    i : INTEGER;
    l : LONGINT;
    r : REAL;
    lr : LONGREAL;
    b : BOOLEAN;
    s : SET;
    str : ARRAY 25 OF CHAR;
BEGIN
    (* Test CONST *)
    c := M.CCHAR;
    Out.Char(c); Out.Ln;
    i := M.CINT;
    Out.Int(i, 0); Out.Ln;
    i := M.CLINT;
    Out.Int(i, 0); Out.Ln;
    l := M.CLINT;
    Out.Long(l, 0); Out.Ln;
    r := M.CREAL;
    Out.Real(r, 9); Out.Ln;
    lr := M.CREAL;
    Out.LongReal(lr, 17); Out.Ln;
    r := SHORT(M.CREALMAX); (* SHORT needed here *)
    Out.Real(r, 9); Out.Ln;
    lr := M.CREALMAX;
    Out.LongReal(lr, 17); Out.Ln;
    b := M.CBOOL;
    Out.Int(ORD(b), 0); Out.Ln;
    s := M.CSET;
    Out.Set(s); Out.Ln;
    str := M.CSTR;
    Out.String(str); Out.Ln;
    (* Test VAR *)
    (* c := M.C *) (* This gives segmentation fault *)
END TestImport.

This creates the following output:

A
100
-1
-1
 1.10E+00
 1.100000024E+000
      INF
 1.797693135E+308
1
{ 0 31 }
testing123
  1. I guess integer constants is always LONGINT, so that the problem with sign extension where the integer value is treated as an unsigned value is not a problem here.
  2. The decimal float value 1.1 is inexact represented in base 2 floating point representation and the constant will have different value depending on whether the type is REAL or LONGREAL. The Oberon-02 report solves this with creating an additional float format were 1.1 and 1.1E0 is REAL, while 1.1D0 is LONGREAL.
  3. The last line commented out triggers the crash here.
  4. Oberon-07 limit exported constant to integer type. e.g. REAL not allowed to be exported. This is probably done to keep the compiler as simple as possible and I do not see any logical reason for that.
  5. Exported variables in Oberon-07 is read only. This makes sense as it forces the module state to be encapsulated.

EDIT : Added notes on Oberon-07.

zaskar9 commented 4 months ago
  1. I guess integer constants is always LONGINT, so that the problem with sign extension where the integer value is treated as an unsigned value is not a problem here.

The constants are recognized as follows (output from debug build).

CONST CCHAR*(*Scope:1*) = "A"(*C*)(*Type:CHAR*);
   CINT*(*Scope:1*) = 100(*S*)(*Type:SHORTINT*);
   CLINT*(*Scope:1*) = -1(*L*)(*Type:LONGINT*);
   CREAL*(*Scope:1*) = 1.1(*F*)(*Type:REAL*);
   CREALMAX*(*Scope:1*) = 1.79769e+308(*D*)(*Type:LONGREAL*);
   CBOOL*(*Scope:1*) = TRUE(*Type:BOOLEAN*);
   CSET*(*Scope:1*) = { 0, 31 } (* 2147483649: 10000000000000000000000000000001 *)(*Type:SET*);
   CSTR*(*Scope:1*) = "testing123"(*Type:STRING*);

-1 begin a recognized as a LONGINT might not be ideal.

  1. The decimal float value 1.1 is inexact represented in base 2 floating point representation and the constant will have different value depending on whether the type is REAL or LONGREAL. The Oberon-02 report solves this with creating an additional float format w[h]ere 1.1 and 1.1E0 is REAL, while 1.1D0 is LONGREAL.

This is actually already in Oberon-90. I can extend the Scanner accordingly. But I wonder whether it would make sense to lift the decision whether a literal is treated as REAL or LONGREAL to Sema as the Scanner has no information on whether the compiler is in Oberon-90, Oberon-2 or Oberon-07 mode.

  1. The last line commented out triggers the crash here.

Not surprising as I've never put any effort into exported variables, nor have I tested that. I'm sure this can be supported relatively easy by setting the correct linkage type in CodeGen.

  1. Oberon-07 limit exported constant to integer type. e.g. REAL not allowed to be exported. This is probably done to keep the compiler as simple as possible and I do not see any logical reason for that.

Wirth moves in mysterious ways?

  1. Exported variables in Oberon-07 is read only. This makes sense as it forces the module state to be encapsulated.

This can be supported by adding corresponding checks to Sema.

tenko commented 4 months ago

Excellent.

  1. 0FFFFFFFFFFFFFFFFH is in signed int64 equal to -1. Out.LongHex(l) prints FFFFFFFFFFFFFFFF. We would have the same problem with sign extension of "hex" values if the constant is inferred as a smaller type.
  2. No show stopper.
  3. FI can work around the missing exported variable by using procedures.
  4. I suspect there are several corner cases where Project Oberon have simplified things to keep is simple.
  5. I think Oberon-2 allow for read/write variables and have the extra - export marker for read-only variables.

EDIT : Updated 1.

zaskar9 commented 4 months ago

This should be fixed in the latest pull request. The following works for me now (macOS, arm64) as expected with the test modules that you provided, @tenko!

oberon-lang -I.:./include -L./lib -loberon M.Mod
oberon-lang -I.:./include -L./lib -loberon -fenable-main TestImport.Mod
clang -o TestImport TestImport.o M.o -L./lib -loberon -rpath @executable_path/lib
./TestImport

Let me know if it fixes the issue for you too! Thanks!

tenko commented 4 months ago

Excellent. This fixes the issue for me.