Open drhagen opened 10 months ago
Ok, fascinating. This is a hard-to-catch edge case in very special grammars for completions within the first token of a file. I'm honestly surprised someone was able to create reproduction steps for this. Kudos, I guess. We basically run into this branch, which then later assumes that no tokens have been parsed. As a consequence, it doesn't even attempt to fuzzy match the previous code to override it. This logic was added to Langium fairly recently, and the playground lags behind by a minor version, which is why it doesn't exhibit the behavior.
I'm not sure whether we can actually change this part of the logic though. The fuzzy matcher isn't allowed to look too far back in the token stream to find the text to replace. It should only look for the current token, which is exactly what's happening right now. In some cases, the current token just cannot be lexed, which leads to the behavior you're experiencing.
"within the first token of a file"
I minimized this down, but failed autocompletion can trigger further than the first token, unless we have different definitions of "token".
For example, using this grammar:
grammar ReactionModel
entry ReactionModel:
EOL? '%%' 'ReactionModel@2' EOL
'initialization' '=' initialization=Initialization EOL
'%' 'components' EOL
;
Initialization:
InitialValue | SteadyState;
InitialValue:
{infer InitialValue} 'initial_value' '(' ')';
SteadyState:
'steady_state' '(' 'time_scale' '=' time_scale=FLOAT (',' 'max_scale' '=' FLOAT )? ')';
hidden terminal WS: /[ \t]+/;
terminal EOL: /((#.*)?\n[ \t]*)*(#.*)?((\n[ \t]*)|\Z)/;
terminal FLOAT returns number: /[+-]?\d+(\.\d+)?([Ee][+-]?\d+)?/;
with this valid file
%% ReactionModel@2
initialization = steady_state(time_scale = 1.0, max_scale=1.0)
% components
not a single keyword autocompletes correctly while typing it in or when going back to edit it. It knows what can be autocompleted there (e.g. after "initialization =" then "steady_state" or "initial_value" are valid autocompletes), but it types in the whole word instead of completing the word.
@drhagen Let me rephrase: For example, initial, in your language, isn't actually a token (even though initial_value is), since there's neither a keyword nor something like an ID terminal that could lex it. Instead, the lexer simply ignores the characters. Since we can only know where a token starts and ends if the lexer recognizes it as a token, the completion provider assumes that the characters before the cursor position are invalid characters and ignores them as well. This is actually independent of the issue that we don't lex any tokens at all. The real issue is that we have no idea "how much" of a token already exists at a given point.
In order to successfully perform completion, even "broken" keywords need to be recognized as tokens by the lexer. Most languages (i.e. all that I've encountered so far) have an ID terminal that can be expressed as /\w+/, which automatically fixes this issue.
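To illustrate the point about partial keywords, here is a minimal sketch of a toy longest-match lexer. It is not Langium's or Chevrotain's implementation; the `Rule`, `Token`, and `lex` names are hypothetical and exist only for this example. It shows that the partial keyword `initial` produces no token when only keywords are known, but becomes a token once a catch-all `/\w+/` terminal is present.

```typescript
// Toy lexer for illustration only; real Langium lexing is done by Chevrotain.
interface Rule { name: string; pattern: RegExp; }
interface Token { type: string; text: string; offset: number; }

function lex(input: string, rules: Rule[]): Token[] {
    const tokens: Token[] = [];
    let offset = 0;
    while (offset < input.length) {
        let best: Token | undefined;
        for (const rule of rules) {
            // The sticky flag anchors the match at the current offset.
            const re = new RegExp(rule.pattern.source, 'y');
            re.lastIndex = offset;
            const m = re.exec(input);
            // Keep the longest match; on a tie, the earlier rule wins.
            if (m && m[0].length > 0 && (!best || m[0].length > best.text.length)) {
                best = { type: rule.name, text: m[0], offset };
            }
        }
        if (best) {
            if (best.type !== 'WS') tokens.push(best);
            offset += best.text.length;
        } else {
            offset++; // Unrecognized character: silently skipped.
        }
    }
    return tokens;
}

const keywordsOnly: Rule[] = [
    { name: 'initialization', pattern: /initialization/ },
    { name: 'initial_value', pattern: /initial_value/ },
    { name: 'steady_state', pattern: /steady_state/ },
    { name: '=', pattern: /=/ },
    { name: 'WS', pattern: /[ \t]+/ },
];
// Assumption: a catch-all ID terminal is added after the keywords.
const withId: Rule[] = [...keywordsOnly, { name: 'ID', pattern: /\w+/ }];

const partial = 'initialization = initial';
const tokensWithoutId = lex(partial, keywordsOnly);
const tokensWithId = lex(partial, withId);

console.log(tokensWithoutId.map(t => t.type)); // no token covers "initial"
console.log(tokensWithId.map(t => `${t.type}@${t.offset}`));
```

With the ID rule in place, the last token carries a start offset, which is the information a completion provider needs in order to know how much text to replace.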
I don't think we can fix this as part of our framework. You are free to override how the completion provider attempts its fuzzy matching, so you should be able to fix this behavior for your language yourself.
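As a rough sketch of what such a language-specific workaround could compute, the snippet below scans backwards from the cursor over contiguous word characters to find the text a completion should replace. This is plain TypeScript, not Langium's completion provider API; `partialWordRange` and `complete` are hypothetical helpers, and wiring the result into an actual provider override is left out because it depends on the Langium version in use.

```typescript
interface OffsetRange { start: number; end: number; }

// Hypothetical helper: the run of word characters directly before the cursor.
// It never looks past the first non-word character, so it stays within
// what would be the "current token" if the lexer had recognized one.
function partialWordRange(text: string, cursor: number): OffsetRange {
    let start = cursor;
    while (start > 0 && /\w/.test(text.charAt(start - 1))) {
        start--;
    }
    return { start, end: cursor };
}

// Hypothetical helper: replace the typed prefix with the keyword,
// but only if the keyword actually starts with what was typed.
function complete(text: string, cursor: number, keyword: string): string {
    const { start, end } = partialWordRange(text, cursor);
    const typed = text.slice(start, end);
    if (!keyword.startsWith(typed)) {
        return text; // Not a match: leave the document unchanged.
    }
    return text.slice(0, start) + keyword + text.slice(end);
}

console.log(complete('he', 2, 'header'));
console.log(complete('initialization = st', 19, 'steady_state'));
```

A strict prefix check is used here for simplicity; a real fuzzy matcher may accept more than prefixes.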
A grammar of a keyword followed by /[A-Z]+/ will not correctly autocomplete the keyword, but the same keyword followed by /[a-z]+/ will autocomplete just fine. This might be a bug on the VS Code side because the same grammar in the Langium Playground autocompletes fine.
Langium version: 2.1.3
Package name: hello-world
Steps To Reproduce
npm install -g yo generator-langium
yo langium
Replace the contents of hello-world.langium with:
entry Model: 'header' value=ID;
hidden terminal WS: /\s+/;
terminal ID: /[A-Z]+/;
npm run langium:generate
npm run build
In a file test.hello, type:
he<tab>
The current behavior
When starting to type the keyword, the correct completion appears. But when pressing Tab or Enter to accept the autocomplete, it types in the whole keyword again instead of the remainder of the word.
Now switch ID from /[A-Z]+/ to /[a-z]+/. Rebuild and restart the extension. With this grammar autocomplete works as expected.
The expected behavior
Autocomplete completes the keyword instead of typing the whole keyword in again regardless of the token that follows.
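The difference between the current and the expected behavior can be sketched as two text edits applied to the same document. This is an illustration only: `applyEdit` and the `Edit` shape are hypothetical and merely mimic how an editor applies a completion's replacement range.

```typescript
// Hypothetical edit shape: replace text between two offsets with newText.
interface Edit { start: number; end: number; newText: string; }

function applyEdit(text: string, edit: Edit): string {
    return text.slice(0, edit.start) + edit.newText + text.slice(edit.end);
}

const typedText = 'he';
const cursor = typedText.length;

// Current behavior: the edit range is empty at the cursor,
// so the whole keyword is inserted after the typed prefix.
const current = applyEdit(typedText, { start: cursor, end: cursor, newText: 'header' });

// Expected behavior: the edit range covers the typed prefix,
// so the prefix is replaced by the keyword.
const expected = applyEdit(typedText, { start: 0, end: cursor, newText: 'header' });

console.log(current);  // heheader
console.log(expected); // header
```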