eclipse-langium / langium

Next-gen language engineering / DSL framework
https://langium.org/
MIT License

Base types should not pollute sub-types with mandatory attributes #562

Open Lotes opened 2 years ago

Lotes commented 2 years ago

The setup

I have an example grammar:

grammar HelloWorld

entry Rule: {infer SubRule}name=ID|age=NUM;

hidden terminal WS: /\s+/;
terminal NUM: /[0-9]+/;
terminal ID: /[_a-zA-Z][\w_]*/;

Which generates the following types:

export interface Rule extends AstNode {
    age: string
}

export interface SubRule extends Rule {
    name: string
}

The problem

This means that every SubRule must now have an age, even though the grammar only assigns age in the other alternative. Isn't this wrong? If not, it is at least counter-intuitive. The user of a language should be able to know all of its intrinsic rules after working with it for a while.
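To make the concern concrete, here is a minimal sketch of how the inferred types can mislead consuming code. The `AstNode` stand-in, the `shoutAge` helper, and the hand-built `parsed` object are all hypothetical; the sketch assumes that parsing an input such as `foo` yields a node where only `name` is assigned and `age` is left unset.

```typescript
// Simplified stand-in for Langium's AstNode (assumption: the real interface has more members).
interface AstNode {
    $type: string;
}

// The types as shown above.
interface Rule extends AstNode {
    age: string;
}

interface SubRule extends Rule {
    name: string;
}

// Hypothetical consumer: compiles cleanly because `age` is declared mandatory.
function shoutAge(node: SubRule): string {
    return node.age.toUpperCase();
}

// Hypothetical parse result for the input "foo": only the first alternative
// matched, so `age` was never assigned. The cast mimics what a parser would hand back.
const parsed = { $type: 'SubRule', name: 'foo' } as unknown as SubRule;

let failure = '';
try {
    shoutAge(parsed);
} catch (e) {
    // `age` is undefined at runtime, so calling a method on it throws.
    failure = e instanceof TypeError ? 'TypeError' : 'other';
}
console.log(failure); // prints "TypeError"
```

The compiler accepts `shoutAge` without complaint, so the mismatch only surfaces at runtime.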

Side-quest

This also reminds me of the alternative labels in ANTLR4: if even one alternative has a label (a hash and a name at the end of the line), ALL other alternatives should have one too.

A solution

Force the user to add a code action for the other alternatives as well?

spoenemann commented 2 years ago

Currently we don't consider alternatives at all in the type inference. This is a simplification that of course makes the result less "correct" / "safe", but it's intentional as it reduces the complexity a lot.

Example:

MyRule: 'a' a=ID | 'b' b=ID;

The most correct representation of the resulting type would be

type MyRule = { a: string } | { b: string }

However, these type declarations would become pretty unreadable for more complex rules. The currently generated interface is less correct, but easier to read:

interface MyRule {
    a: string
    b: string
}
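The trade-off between the two representations can be sketched in plain TypeScript. The names `MyRuleUnion`, `MyRuleFlat`, `labelUnion`, and `labelFlat` are made up for this illustration and are not part of Langium's generated output.

```typescript
// Precise representation: a node has exactly one of the two shapes.
type MyRuleUnion = { a: string } | { b: string };

// Simplified representation: both properties are declared mandatory.
interface MyRuleFlat {
    a: string;
    b: string;
}

function labelUnion(node: MyRuleUnion): string {
    // The `in` check narrows the union to the alternative that actually matched.
    return 'a' in node ? `a=${node.a}` : `b=${node.b}`;
}

function labelFlat(node: MyRuleFlat): string {
    // No narrowing is required, but the type hides that one property may be unset.
    return `a=${node.a}, b=${node.b}`;
}

console.log(labelUnion({ a: 'x' })); // prints "a=x"
console.log(labelUnion({ b: 'y' })); // prints "b=y"

// A node produced by the first alternative, viewed through the flat interface.
const flatNode = { a: 'x' } as MyRuleFlat;
console.log(labelFlat(flatNode)); // prints "a=x, b=undefined"
```

With the union, the compiler forces a check before either property is read; with the flat interface, the code is shorter but an unset property slips through as `undefined`.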

Provided that we have a data structure that is able to capture all this information (#554), we could consider adding a configuration option to generate the more complex types instead of the simplified ones.