eclipse-langium / langium

Next-gen language engineering / DSL framework
https://langium.org/
MIT License
754 stars 68 forks source link

Completion Problem due to Parser Error #1302

Closed csalve closed 1 year ago

csalve commented 1 year ago

Hi,

I have a problem with the auto completion for my DSL. I have following grammar:

entry Model:
    (structs+=Struct | attributes+=Attribute)*;

Attribute:
    '@' id=INT 'attribute' name=ID;

Struct:
    'struct' name=ID '{'
        (members+=Member)*
    '}';

Member:
    StringMember | IntegerMember;
fragment MemberDetails:
    '@' id=INT
    name=ID ':';

IntegerMember:
    MemberDetails
    (type='uint' | type='int');

StringMember:
    MemberDetails
    (type='string' | type='nanoid');

When I define a member inside a struct in this language, I expect to get a proposal of one of the specific types (int, uint, string, nanoid) but this is not working:

image

Instead, I get a proposal for the attribute where it is not expected, see here: image

The struct member is parsed correcly only if its definition is complete. Obviously, then it is to late for auto completion.

I had a look if it is maybe a problem of the completion provider, but it seems that already the parser is causing the problem. The parser somehow parses the member as attribute.

I tested with langium version: 2.1.0

msujew commented 1 year ago

Hey @csalve,

yeah, looking at the AST that is being produced by the given input, I can definitely see that it tries to generate a Attribute node instead of continuing with the Struct rule.

Given that we have limited influence on the error recovery in Langium, I would instead recommend to restructure your grammar a bit. The main issue here is that you have an alternative with a fairly large prefix:

image

By removing this prefix, we can get a way better completion result:

image

See the improved grammar below:

``` entry Model: (structs+=Struct | attributes+=Attribute)*; Attribute: '@' id=INT 'attribute' name=ID; Struct: 'struct' name=ID '{' (members+=Member)* '}'; Member: '@' id=INT name=ID ':' ({infer IntegerMember} (type='uint' | type='int') | {infer StringMember} (type='string' | type='nanoid')); ```
csalve commented 1 year ago

@msujew Thanks for your help.

The improved grammar solves the issue with the parser but opens two new problems. I tried to implement your suggestion in two different ways and both of them lead to problems/errors with the ast.

Using Interfaces

The documentation tells me, that it is good practice to define interface types. So I did:

interface IntegerMember {
    type:string
}
interface StringMember {
    type:string
}
Member:
    '@' id=INT
    name=ID ':'
    (
        {IntegerMember} (type='uint' | type='int') |
        {StringMember} (type='string' | type='nanoid')
    );

But the generated AST does not give me the possibility to access the 'type' variable. There is no relation between IntegerMember and Member:

export interface IntegerMember extends AstNode {
    readonly $type: 'IntegerMember';
    type: string
}

export interface Member extends AstNode {
    readonly $container: Struct;
    readonly $type: 'Member';
    id: number
    name: string
}

Infer types without interfaces

Using the improved grammar as above creates a good AST:

export interface Member extends AstNode {
    readonly $type: 'IntegerMember' | 'Member' | 'StringMember';
    id: number
    name: string
}

export interface IntegerMember extends Member {
    readonly $type: 'IntegerMember';
    type: 'int' | 'uint'
}

export interface StringMember extends Member {
    readonly $type: 'StringMember';
    type: 'nanoid' | 'string'
}

Problem here: The 'name' and 'id' of the Member struct is never set:

image

In both described cases, langium violates the type system.

msujew commented 1 year ago

@csalve Right, I forgot that using the actions syntax removes all previous properties. We kind of inherited that behavior from Xtext and never bothered to change it. You should be good without using it:

Member:
    '@' id=INT
    name=ID ':'
    (
        type='uint' | type='int' |
        type='string' | type='nanoid'
    );

You can write your own isStringMember function if you need to that checks the value of the type property on a Member type.

msujew commented 1 year ago

Since this is a known (but unfixable on our side) issue, and it can be worked around by improving the grammar design, I'll close this.