softdevteam / yksom

Other
8 stars 6 forks source link

Multiple classes with the same name are allowed #227

Open Hirevo opened 1 year ago

Hirevo commented 1 year ago

yksom, as opposed to most other SOMs, currently doesn't require class names to match their file names, which can lead to weird behaviours where loading new classes overwrites existing globals which makes previously loaded classes unreachable.

Consider the two following classes:

" Defined in `Foo.som` "
Foo = (
  sayHello = (
    'I am the real Foo' println.
  )
)
" Defined in `Bar.som` "
Foo = (
  sayHello = (
    'I am the imposter Foo' println.
  )
)

Now, consider the following Main class:

" Defined in `Main.som` "
Main = (
  run: args = (
    Foo new sayHello.
    Bar new sayHello.
    Foo new sayHello.
    Bar new sayHello.
  )
)

Executing Main results in the following surprising output:

I am the real Foo
I am the imposter Foo
I am the imposter Foo
I am the imposter Foo

Due to the conflicting naming, it seems that loading the class from Bar.som did overwrite the #Bar global but also the already existing #Foo global, which seems undesirable.

ltratt commented 1 year ago

What do other SOMs do?

Hirevo commented 1 year ago

som-rs currently panics, like so:

I am the real Foo
thread 'main' panicked at ''System>>#load:': ./Bar.som: class name is different from file name.', som-interpreter-bc/src/primitives/system.rs:128:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

som-java also panics with a runtime exception:

I am the real Foo
Runtime Error: ProgramDefinitionError: File name ./Bar.som does not match class name (Foo) in it.
ltratt commented 1 year ago

I'd happily take a PR to make yksom behave as som-rs!

smarr commented 1 year ago

There's an implicit contract between System>>#load: and System>>resolve: and Object>>#unknownGlobal: that loading a class registers the loaded class as a global.

I believe the SOM grammar is defined to allow only for a single classdef per file:

smarr commented 1 year ago

Hm, I didn't correctly read the initial issue:

I don't believe SOM enforces a match between file and class name. There's such a check possibly in the SOM language server, but I don't think SOM's do that check.

@Hirevo the result you get seems sensible to me based on the definition of load, resolve, and unknownGlobal.

Hirevo commented 1 year ago

@smarr What made the output surprising to me is that simply loading the Bar.som class once did write to both #Bar and #Foo globals at once.

I agree that setting the global's value after loading the class is expected, but I expected only either #Foo or #Bar to get assigned that newly loaded class, not both of them at the same time.

Before running the code, My expectation was that either:

So, with this output, I initially thought that that second theory was correct but adding a println! statement in VM::load_class shows that the Bar.som class is loaded only once from file, while still overwriting both #Foo and #Bar globals:

compiling Main from /home/nicolaspolomack/Repositories/research/yksom/Main.som
compiling Foo from /home/nicolaspolomack/Repositories/research/yksom/Foo.som
I am the real Foo
compiling Bar from /home/nicolaspolomack/Repositories/research/yksom/Bar.som
I am the imposter Foo
I am the imposter Foo
I am the imposter Foo

To verify this further, rewriting Main to the following:

Main = (
  run: args = (
    ('Foo: ' + (system global: #Foo)) println.
    ('Bar: ' + (system global: #Bar)) println.
    Bar new sayHello.
    ('Foo: ' + (system global: #Foo)) println.
    ('Bar: ' + (system global: #Bar)) println.
  )
)

results in the following output:

Foo: nil
Bar: nil
I am the imposter Foo
Foo: Foo
Bar: Foo

So, now I knew that the VM itself was writing at multiple globals at once, and I found after a quick look through the code that there is indeed two calls to VM::set_global in the code path of the implementation of the System>>#load: primitive.

The first one to be called is in VM::compile (itself called by VM::load_class), where the global name used is the name of the class as found in its declaration in the AST:

https://github.com/softdevteam/yksom/blob/d262fb5f2e3f570c20f5b828d2f26ee515cfb2e1/src/lib/vm/core.rs#L279-L287

The second one is in VM::load_class itself, right after the call to VM::compile, where the global name used is the name of the symbol used for the original lookup:

https://github.com/softdevteam/yksom/blob/d262fb5f2e3f570c20f5b828d2f26ee515cfb2e1/src/lib/vm/core.rs#L269-L273

So, when loading Bar.som, the global name used in VM::compile ends up being Foo and the one used in VM::load_class ends up being Bar, hence the behaviour I observed of two globals being set at once.

(I'm sorry for the long winded response :/)

Hirevo commented 1 year ago

There's such a check possibly in the SOM language server, but I don't think SOM's do that check.

I looked back in som-java, and it seems to explicitely do this check, which is the one that triggered the error message I showed above.
But I don't know if doing this check is a required part of the SOM language.

smarr commented 1 year ago

Ok, so, since SOM-java does the name check, I guess that's the way to go.

Though, this also means there's an assumption of only a single class, and the loadClass() in SOM-java's Universe class sets the global similar to what you found for VM::load_class I think: https://github.com/SOM-st/som-java/blob/b0c8a2ac5607ef750b5bacd747da2a48f9c30f2c/src/som/vm/Universe.java#L644-L660

I also convinced myself that it's not me who introduced the check: https://github.com/SOM-st/som-java/blame/7a7c94d1a9e08c80ed01b65e55aeacc5ddf60559/src/som/compiler/SourcecodeCompiler.java#L59-L60

I somehow thought I would have. It's very odd though that the check uses != and relies on string interning. Looks like a bug...