Improve semantic tokens support

ulugbekna commented 2 years ago

Bug fixes

[ ] named arguments - occurs after renaming ~line to lines, ie F2 and type in lines on vscode

Already Fixed

- [x] handle when there's whitespace (fixed in 1.15.0, I think) This works as expected:

But not this:

Reproduction: can reproduce in _Dark+_ theme in vscode.

Feature-wise improvements:

[ ] Highlight polymorphic variants (it's complicated because variants are represented sometimes simply as strings (ie without locations information that usually should come for a symbol)

It's also unclear how to highlight them, given if they have the same color as ordinary variants (value constructors), then it's more difficult to differentiate between them at first glance, but since the color palette that the standard lsp token types give is limited, other option is to color them as modules/types, which is also far from ideal.

[ ] Adding semantic information

Currently, we simply send syntax-based semantic tokens, which doesn't really have much language semantics. By real semantic information, I mean that we could have different highlighting for ref values (rust-analyzer underlines mutable values, for example) OR highlight differently functions and values (currently, we highlight a function differently from a value only if we see syntactically that it's a function -- it comes as the function in function application or defined as a function in let binding, ie. has arguments).

In order to do so, we need information from the typed tree, ie. see the type of a value and, thus, get more semantic info about it, e.g., it's a function or a ref value. However, semantic tokens shouldn't work solely based on typed tree because there're often a lot of type errors during development in OCaml.

In my understanding, we can produce semantic tokens for symbols based on the parsetree by folding over it, but also fold over the typed tree and get information for symbols of interest and update the tokens initially produced. I'm concerned that it almost double amount of work and may increase latency.

[ ] ref values have special highlighting
[ ] functions have special highlighting
[ ] partially applied functions have special highlighting

Implementation-wise improvements:

[ ] test delta implementation
[ ] write some code to measure performance of semantic tokens computation (especially before rewriting it using visitors - see below)
[ ] rewrite the folding over parsetree using visitor design pattern?

We could use ppxlib's already implemented visitors (but we can't have ppxlib as ocamllsp dependency, so we could copy paste required files?) or use visitors library to generate visitor for the parsetree (adding it as a dep or not? if not, we just copy the ppx-generated code) We could write visitors by hand but that's too much boilerplate code.

Using visitor pattern, we could cut down on boilerplate & have much cleaner implementation.

rgrinberg commented 2 years ago

Where did you get Token_type.module_? I looked over the lsp spec, and I see no mention of this token type.

ulugbekna commented 2 years ago

The name of the variable doesn’t reflect the underlying lsp token type we use. Its value (which is 0) is the index in Token_type.list, which should be “namespace.” Does this help?

ulugbekna commented 2 years ago

The response we produce for textDocument/semanticTokens/full/delta seems bugged (there seem to be overlapping tokens), but I can't come up with a reproducible case. There's a vscode tool that can help debug: https://code.visualstudio.com/api/language-extensions/syntax-highlight-guide#scope-inspector.

rgrinberg commented 2 years ago

The name of the variable doesn’t reflect the underlying lsp token type we use. Its value (which is 0) is the index in Token_type.list, which should be “namespace.” Does this help?

It does. However, I think this level of indirection is unnecessary.

dlesbre commented 8 months ago

In addition to polymorphic variants, I found a few other example of code that doesn't have semantic token support:

module names in module signature constraint (with module Foo = ...) constraint aren't highlighted as regular module (red in the picture) name, they default to constant.language.capital-identifier.ocaml (yellow). Note that with type constraints don't have this issue.
calling functions in record fields doesn't color their arguments correctly (including named arguments)
building records with the short syntax for field=field also fails to color.

dlesbre commented 8 months ago

Another spot lacking semantic color is open declarations in .mli files (they work fine in .ml files though).

Also, related to #1139.

ocaml / ocaml-lsp