lboasso / oberonc

An Oberon-07 compiler for the JVM
MIT License
148 stars 17 forks source link

Name Conflict in Inner Procedures with Different Parent Procedures #28

Closed geekstakulus closed 1 year ago

geekstakulus commented 1 year ago

Here's an enhanced version of the report:

Introduction

In Oberon-07, it's possible to declare inner procedures, which are accessible within the scope of the outer procedure only. This allows the use of inner procedures with the same name, similar to how variables and constants can have the same name in different scopes.

Current Behavior

When declaring two procedures, A and C, each with its own inner procedure named B, the compiler throws an error indicating that procedure names must be unique.

Expected Behavior

It's expected that declaring identifiers (procedures, variables, or any other type of identifiers) within different scopes should not be problematic. The current issue may arise due to the way the code is translated into JVM class files. However, it should be possible to have multiple inner procedures with the same name as long as they are within different outer procedures.

Repro Steps

  1. Write a valid Oberon-7 code that involves the use of inner procedures with the same name in distinct outer procedures.
MODULE Scopes;
  IMPORT Out;

  PROCEDURE A;
    PROCEDURE B;
    BEGIN
      Out.String("Hello from A/B"); Out.Ln
    END B;
  BEGIN
    B
  END A;

  PROCEDURE C;
    PROCEDURE B;
    BEGIN
      Out.String("Hello from C/B"); Out.Ln
    END B;
  BEGIN
    B
  END C;
BEGIN
  A;
  C
END Scopes.
  1. Compile the code using the Oberonc compiler.
  2. Observe the name conflict error being raised:
Scopes.Mod:15:0: procedure names must be unique
Scopes.Mod:24:11: compilation FAILED

Modified Code (Working Solution)

To work around this issue, one needs to rename the procedures, and in the example that follows the inner procedure name is prefixed with the name of the outer procedure in lower case. This approach eliminates the error.

MODULE ScopesSolution;
  IMPORT Out;

  PROCEDURE A;
    PROCEDURE aB;
    BEGIN
      Out.String("Hello from A/B"); Out.Ln
    END aB;
  BEGIN
    aB
  END A;

  PROCEDURE C;
    PROCEDURE cB;
    BEGIN
      Out.String("Hello from C/B"); Out.Ln
    END cB;
  BEGIN
    cB
  END C;
BEGIN
  A;
  C
END ScopesSolution.
  1. Compile and run the modified code using the Oberonc compiler.
  2. Observe that the modified code works correctly without raising naming conflicts.

Other Information

This issue seems to be specific to the Oberonc compiler and the way it generates Java class files, as indicated by the following decompiled code using javap:

public final class ScopesSolution {
  public static java.lang.String[] args;

  public static final void aB();
    Code:
       0: ldc           #7                  // String Hello from A/B\u0000
       2: invokevirtual #13                 // Method java/lang/String.toCharArray:()[C
       5: invokestatic  #19                 // Method Out.String:([C)V
       8: invokestatic  #23                 // Method Out.Ln:()V
      11: return

  public static final void A();
    Code:
       0: invokestatic  #28                 // Method aB:()V
       3: return

  public static final void cB();
    Code:
       0: ldc           #31                 // String Hello from C/B\u0000
       2: invokevirtual #13                 // Method java/lang/String.toCharArray:()[C
       5: invokestatic  #19                 // Method Out.String:([C)V
       8: invokestatic  #23                 // Method Out.Ln:()V
      11: return

  public static final void C();
    Code:
       0: invokestatic  #34                 // Method cB:()V
       3: return

  public static void main(java.lang.String[]);
    Code:
       0: aload_0
       1: putstatic     #39                 // Field args:[Ljava/lang/String;
       4: return

  static {};
    Code:
       0: iconst_0
       1: anewarray     #9                  // class java/lang/String
       4: putstatic     #39                 // Field args:[Ljava/lang/String;
       7: invokestatic  #43                 // Method A:()V
      10: invokestatic  #45                 // Method C:()V
      13: return
}

It's possible that a solution could involve generating procedure names that are prefixed with the parent procedure's name to avoid naming conflicts.

Please note that this bug report is intended to highlight the observed behavior and suggest a possible solution for consideration.

lboasso commented 1 year ago

Thanks for reporting this bug! I'll push a fix soon.

lboasso commented 1 year ago

Fixed here. With this commit I use a global counter to keep track of all nested procedures and types, so that I can make short unique names .

geekstakulus commented 1 year ago

Fixed here. With this commit I use a global counter to keep track of all nested procedures and types, so that I can make short unique names .

This seems like a sensible solution. Just tested it and it is now working. Thanks!

I found this bug because he implemented the exact same inner procedure twice. He could have made it global, but he decided to duplicate the work and used the exact same procedure nested in two distinct procedures. It seems like he violates his own rules. LoL

Thanks for the fix!

lboasso commented 1 year ago

Great! Could you share the original Modula-2 compiler code? Was it taken from https://www.astrobe.com/Modula2/ ?

geekstakulus commented 1 year ago

Great! Could you share the original Modula-2 compiler code? Was it taken from https://www.astrobe.com/Modula2/ ?

Yes, those are the ones I initially used as a basis, which are effectively found here as well https://github.com/GunterMueller/ETHZ-Modula-2_Compilers. The one I am porting to Oberon is this one https://github.com/GunterMueller/ETHZ-Modula-2_Compilers/tree/master/M2SPCompilerSource

geekstakulus commented 1 year ago

I was reading the Project Oberon book and found the descriptions of the restrictions around nested procedures not being permitted to access to variables on a level up.

Furthermore, the original book also mentions that they did not take advantage of Oberon's type extensions, because the compiler was written in Modula-2, and Modula-2 did not support type extensions. The conversion is quite straightforward to be honest, because both languages are so similar to one another. Where you run into a hiccup is when you have to translate untagged variant records. For instance, they have the following procedure that uses them:

  PROCEDURE ErrorBlock(errPos: LONGINT; errCod: CARDINAL);
    VAR i: CARDINAL;
        conv: RECORD
                 CASE :CARDINAL OF
                  0: long: LONGINT
                | 1: sys:  ARRAY [0..3] OF CHAR;
                END;
              END;
  BEGIN
    IF errPos >= 2D THEN errPos := errPos - 2D END;
    WriteChar(errDat, ERR);
    WITH conv DO
      long := errPos; FOR i := 3 TO 0 BY -1 DO WriteChar(errDat, sys[i]) END;
    END;
    WriteChar(errDat, CHR(errCod MOD 256)); WriteChar(errDat, CHR(errCod DIV 256));
  END ErrorBlock;

Here, you can see that the whole purpose of the conv structure is to encode the error position using a sequence of four bytes in big endian order through the array of chars, which has the same size as the LONGINT type (4 bytes). This could effectively be done through bitwise manipulation, but in Modula-2 this is how they do it "sort of" automatically. Anyway, the following chunk:

    WITH conv DO
      long := errPos; FOR i := 3 TO 0 BY -1 DO WriteChar(errDat, sys[i]) END;
    END;

I translated to the following in Oberon:

    Files.WriteChar(errDat, CHR(ASR(errPos, 24)));
    Files.WriteChar(errDat, CHR(ASR(errPos, 16)));
    Files.WriteChar(errDat, CHR(ASR(errPos, 8)));
    Files.WriteChar(errDat, CHR(errPos));

The code of this compiler when compared to the Oberon and all derivatives is so similar that you can see that he just converted from Modula-2 to Oberon, and his reasoning was that you could also port to other languages with ease. I found the answer I was looking for his not using OOP in the implementation of the compiler.

If you look at any of his compilers, they all follow the same structure, and that includes Pascal. Look at the source code of any of his compilers and you know how to get around the code if you know one of his compilers. Pretty easy to follow, actually.

lboasso commented 1 year ago

If you look at any of his compilers, they all follow the same structure, and that includes Pascal. Look at the source code of any of his compilers and you know how to get around the code if you know one of his compilers. Pretty easy to follow, actually.

Yes I agree, but sometime a comment or two would have helped :)

geekstakulus commented 1 year ago

If you look at any of his compilers, they all follow the same structure, and that includes Pascal. Look at the source code of any of his compilers and you know how to get around the code if you know one of his compilers. Pretty easy to follow, actually.

Yes I agree, but sometime a comment or two would have helped :)

You are dead right on that! I agree with you 100%.