usethesource / rascal-language-servers

An LSP server for Rascal which includes an easy-to-use LSP generator for languages implemented in Rascal, and an interactive terminal REPL.
BSD 2-Clause "Simplified" License
13 stars 7 forks source link

Rascal LSP uses textmate token scopes where it should use SemanticTokenTypes for the syntax highlighting. #366

Closed DavyLandman closed 6 months ago

DavyLandman commented 6 months ago

Describe the bug

Rascal LSP is reporting the wrong kind of semantic token types in the legend. This makes it a hit and miss which ones are properly highlighted.

To Reproduce

Annotate a production with @Category{constant.numeric} and it doesn't get the colors of numeric contants of the theme, while string as category does work.

Analysis

I think the documentation on this feature has been improved, it used to be less clear what LSP expected, I've read some old GH issues discussing the closed aspect of these TokenTypes. The set of valid token type is limited to the enum mentioned in the LSP spec, which is a subset of the API in VS Code

Now why did it even work? We were passing sublime scopes like entity.name and variable.function. But VS Code is ignoring after the . (as the syntax is actually <tokentype>.<modifier> in VS Code, no LSP). So if the first name happened to be in the list (or we used something like string) it accidentally worked. But constant.numeric (or constant) doesn't exist as a TokenType, so it gets no highlighting applied. A user cannot just do @Category{number} instead, as rascal-lsp has a hardcoded list of token types, and number is not in there.

Solution

  1. replace the list with the enum form the LSP spec.
  2. consider adding a mapping of the old text-mate styles to the token types, so that existing languages keep working a reasonable manner.
jurgenvinju commented 6 months ago

Ouch! Good analysis. How about this:

This way we fix the bug and also keep in line with the rest of the LanguageServer design, and we stay backward compatible for a while.

DavyLandman commented 6 months ago

That is cool, but it does mean you have to import the util::LanguageServer module into your Syntax file, which is a dependency that we normally don't want (since we like to also be able to run our stuff without LSP dependencies).

I was currently working on this:

Pros:

  1. keep in the same style as before, without introducing extra dependencies
  2. Add support for modifiers
  3. Be a relative small patch to the existing code.

Cons:

  1. the list of strings is external, and even if we would make a copy, it would still introduce less than desired dependency edges in or grammar modules
  2. it's a string. So users can still mess up the token & modifiers, and we have no type-checking time support for these errors.
jurgenvinju commented 6 months ago

Watch out... next to the standard types and modifiers, there is more:

Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

DavyLandman commented 6 months ago

Watch out... next to the standard types and modifiers, there is more:

Along with the standard types and modifiers, VS Code defines a mapping of types and modifiers to similar TextMate scopes. That's covered in the section Semantic Token Scope Map.

Yes, but that is not LSP.

That's VS Code that is helping out map semantic token types to textmate scopes for the themes that do not have extensive semantic token support.

jurgenvinju commented 6 months ago

Also Custom TextMate scope mappings can be added.

We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

DavyLandman commented 6 months ago

Also Custom TextMate scope mappings can be added.

We could use these to support the old Rascal-eclipse tokenType names, such as "ambiguity", etc. They can be added to the json configuration file and do not have to be directly associated with any language.

Indeed, but again, this is a VS Code specific extension. LSP does not want us to do this. VS Code API of semantic tokens is not the same as LSP api of semantic tokens. By design they've limited the categories.

But I was also thinking, that yes, for the ambiguity nodes, we could try and sneak one in, that just won't work for other LSP clients.

jurgenvinju commented 6 months ago

I see your points, but I want to work towards a more ideal situation.

Syntax highlighting is a core feature of the Rascal game, in Eclipse, in VScode but also elsewhere in other LSP instances or even on the commandline. I don't like the strings since they lead to the kind of bugs that we are now fixing here. Things happily seem to work, but then they don't. Our users have to find this out.

So I guess I am proposing an extension to ParseTree.rsc where we add highlighting to the core of the parse tree representation:

jurgenvinju commented 6 months ago

Note that tags are not checked by the type checker (yet). Right @PaulKlint ?

PaulKlint commented 6 months ago

If you mean typechecker (instead of prettyprinter): I actually fixed that yesterday!

jurgenvinju commented 6 months ago

type checker! duh.. where did that come from? Oh that's cool. That will help us find a few bugs!

jurgenvinju commented 6 months ago

@DavyLandman but that means that @category needs a new name because we don't have union types in Rascal. So something like @scope for the new ones? tag SyntaxScope = syntax@scope and data Tree(SyntaxScope scope=default())

DavyLandman commented 6 months ago

Ok, so I like your idea @jurgenvinju. But since we're running into this issue at our customers and rascal is in the middle of a lot of shakeup, how about we split the fix in 2 phase.

  1. Add support for the LSP native ones to the list of supported token types, so at least a user can say @Category=number and get the proper syntax highlighting (this is a much smaller change than what I proposed in #367 )
  2. In parallel properly implement your extension to Parsetree.

So apart from this scheduling proposal, I have a small comment on the design proposed: that the LSP categories are quite limited, they designed it to force people into a small set of options that can reliably be mapped to a theme, but things we would consider normal is missing. Like for example there are only classes, interface & structures. And a lot of things will have to map to keyword.

Also, I'm not sure if scope is the best name? It alludes to both scope in language semantics, which it's not, and to text-mate scopes, which they also are not (i.e no nesting etc).

DavyLandman commented 6 months ago

Or alternative point 1: do fix the name mapping part, but drop the modifier support. So in essence: fix the bug of how category was interpreted, without adding new functionality.

I think that would be my preference. I've updated #367 with these changes.

DavyLandman commented 6 months ago

After some offline discussion with Jurgen, we came to the conclusion that #367 fixes the current bug, while usethesource/rascal#1928 will give the proper support for this. Therefore we'll keep #367 "small"