Open ghost opened 4 years ago
The semantic highlight protocol is built in a way that different clients can have different tokens and modifiers even some which might not be standardize in the protocol since it will be very hard to cover all tokens and all modifiers across all languages.
@aeschli any comments on the specific request for import/include/require
. Would make sense to me. The other modifier
is IMO covered by the fact that a client can announce its own set of modifiers.
Yes, an import modifier makes sense. If the client doesn't know of a token type/modifier or shows no interest in it, then the server should not report it.
If the client doesn't know of a token type/modifier or shows no interest in it, then the server should not report it.
... which means it won't be colored at all, right? That means any more slightly unusual languages will always have a notable amount of modifiers completely uncolored (like, not even clearly set apart as someting modifier-like) without a vendor-specific editor plugin which IMHO goes entirely against the point of LSP and semantic coloring, the point that was to not require an editor-specific plugin to make things work right for the most part. I understand VS Code usually has an editor-specific plugin, but that should not be the expected model since it leads to the entire old O(n^2)
plugins problem which LSP was designed to avoid.
Is this really such an unusual view of mine? Please, add a generic modifier name. Forcing LSP servers to not report unknown modifiers at all, why would that be what the user wants?
Edit: I mean, unless the expectation is most themes won't be expected to want to color modifiers differently. But given how prominent they are in the spec the assumption appears to be most would want to do that
Oh, I see, you are talking about a token type for modifier keywords? For example the public
in this snippet:
*public* void foo() {
}
We currently just have a keyword
token type for all keywords.
The modifiers in the list are intended to be annotations ('modifiers') to declarations and references.
class A {
static void foo() {
return A.foo();
}
}
So the first foo
has classification member.declaration.static
and the second one member.static
.
So, yes, having a modifier
tokenType makes sense.
Agree, that having a modifier token type makes sense. Will add it.
Done in the library. Missing in the spec.
@aeschli sorry, I suppose my suggestion was unclear. This is what I had in mind:
export enum SemanticTokenTypes {
comment = 'comment',
keyword = 'keyword',
string = 'string',
number = 'number',
regexp = 'regexp',
operator = 'operator',
namespace = 'namespace',
type = 'type',
struct = 'struct',
class = 'class',
interface = 'interface',
enum = 'enum',
typeParameter = 'typeParameter',
function = 'function',
member = 'member',
property = 'property',
macro = 'macro',
variable = 'variable',
parameter = 'parameter',
label = 'label',
import = 'import' /* NEW !! */
}
and:
export enum SemanticTokenModifiers {
documentation = 'documentation',
declaration = 'declaration',
definition = 'definition',
static = 'static',
abstract = 'abstract',
deprecated = 'deprecated',
readonly = 'readonly',
other = 'other' /* NEW */
}
Edit: although I'm thinking now, I assume modifier can be applied to any pre-existing token on top? In that case maybe it doesn't make too much sense, and @dbaeumer 's idea of a "modifier" SemanticTokenTypes
entry for tokens that are only there to store modifiers makes more sense. Well, not sure, actually, I guess both would work...?
Yes, there's the misunderstanding. Each semantic token is annotated by a SemanticTokenTypes and any number of SemanticTokenModifiers.
Not to be confused by the keyword syntax token that describes a modifier in the source code.
For keywords we have a single keyword
SemanticTokenTypes.
Now, as discussed, we could add a new token type modifierKeyword
. Before doing that, we should see if that's a popular request. You can pioneer this by just defining such token types on your side.
In VSCode we have a mechanism where extension can declare new token types based on a super type. They can also define mappings for theming. See https://github.com/microsoft/vscode/issues/96712#issuecomment-623454115 for an example.
What this means for LSP it would be part of a language server documentation to describe the token types it uses (standard and non-standard).
Now, as discussed, we could add a new token type modifierKeyword.
That sounds reasonable, so if LSP servers can't pick a modifier the editor knows at least the base type will reflect for the themeing it's a modifier. I like that solution.
You can pioneer this by just defining such token types on your side.
I don't see the point unless a theme actually implements that on the editor side, there is no useful super type to derive from that would actually cause a theme change. (I'm also currently simply not far enough in my LSP implementation yet)
Also, I'm also in part saying this after seeing your response in https://github.com/microsoft/vscode/issues/96712 but I don't think you should delay obviously vital base types like generic modifiers or doc comments too much through having people "pioneer" it to see if it's a "popular request", because that might lead to fragmentation. I think it's pretty obvious these two types will be widely needed, so if you just let people play around with it in local experiments for too long while everyone else forgets to adopt them it'll take forever for the ecosystem to catch up. That doesn't sound like a good approach to me. But that's just my personal opinion, obviously.
See microsoft/vscode#97063.
Not sure if this is the appropriate place for this - but I've just recently been adding CompletionItemTags and DiagnosticTags to our LSP server and noticed that things like deprecated
are also supported on semantic tokens (which is good - we had a question about indicating deprecated code to editors when the lints were turned off).
However, the implementation looks quite different. Is there a reason not to model the modifiers in the same way as those tags for consistency?
I am not sure what you are asking for. You mean having the same value set?
I somehow missed the comment above - and I have no idea what I was asking for :-)
However, I came here with another question... Specifically about the class
keyword. After switching to semantic tokens I noticed many of my keywords changed colour. I tried to use TypeScript as a reference, and noticed that it doesn't seem to produce a semantic token for the class
keyword. I'm currently using the keyword
type. This is different to what we had before (it was blue, now it's pink - using the Dark+ theme) and different to TypeScript (which still has a blue "class" keyword even with semantic tokens).
So my question is - what type/modifiers should a "class" keyword have, and does there need to be something new? Excluding them from semantic tokens feels weird, but it's also weird to have a mis-match in colours between the textMate grammar and the semantic tokens because it looks quite jarring. I'd like the semantic tokens to just improve the colouring for things that are hard to do in regex, not change all the colours.
@DanTup IMO the only reason to emit keyword token types is when you have a language that so context sensitive and needs resolving to know what a modifier or keyword is. For the languages for which we have implemented a semantic highlight provider (TypeScript, HTML, Java), this wasn't the case and keywords were left to the syntax highlighter (TextMate grammar, in our case).
So admittedly, keyword
is too coarse, and if you depend on it, you want to define subtypes such as modifierKeyword
.
It's ok to add custom token types.
Once you have good insight please let us know on https://github.com/microsoft/vscode/issues/97063/ and we can look into adding them to our standard token types.
@aeschli I think this ignores the interesting use case where an IDE may not have a basic syntax highlighter, and instead just plugged in some random LSP server with no additional knowledge. (Which I know VS Code doesn't really offer to do, but other IDEs like e.g. KDE's kate do. And I wonder if maybe VS Code should offer this since the whole writing-a-plugin-for-every-IDE-again approach just seems a bit dated to me but that's for another matter.) At least for that use case it would be useful if this was changed such that useful keyword info is always returned for the common LSP backends and those subtypes were specced out in the main specs, unless that would be too performance intensive. But I assume if it is already resolved to a keyword at all, it usually isn't much extra work to also return what keyword type it is. Just a nudge in case anyone else sees this use case as useful.
For the languages for which we have implemented a semantic highlight provider (TypeScript, HTML, Java), this wasn't the case and keywords were left to the syntax highlighter (TextMate grammar, in our case).
Perhaps I misunderstood the goal of the semantic tokens in LSP. I'm also laying this on top of the textmate grammar in VS Code, however I assumed that it should be reasonable to use only semantic tokens (for a non-VS Code, LSP client) and still get a good highlight experience (this means not skipping tokens).
Now I'm not sure if that's right - are all LSP clients also expected to use something like a textmate grammar, or should semantic tokens provide a reasonable experience on their own? (I appreciate there are some other issues using only semantic tokens, like a delay in opening a file while the server warms up, but those seem like decisions for the client).
So admittedly,
keyword
is too coarse, and if you depend on it, you want to define subtypes such as modifierKeyword. It's ok to add custom token types.
I do already have some custom types (annotation
and boolean
) so adding new ones is fine, but it's not clear to me how clients would know how to colour these. For example if I add a "classKeyword" type (although I'm not sure if that's the better way or it would be better as a modifier, for example textmate seems to use things like "keyword" and "keyword.control"?), how would clients (that aren't hard-coded with my custom token type) know what it is? It might end up completely uncoloured.
It seems like the semantic tokens might not have enough options to do everything the textmate grammar can do, but I'd rather hoped that it would be superior to the grammar and that would just be a fallback.
Looking more, I think the issue here may that VS Code is mapping the "keyword" semantic token onto "keyword.control" theming. The underlying theme (dark_plus) has a "keyword" (blue) and "keyword.control" (pink). Semantic Tokens only has "keyword" and is mapping that to "keyword.control" (so everything is pink).
So maybe the best fix is that LSP has a control
modifier that can be used on keyword
and VS Code should then map those onto the respective theme scopes?
Edit: Seems like this was raised before at https://github.com/microsoft/vscode/issues/94367, but I don't entirely understand the reasoning. Seems like there may be a workaround for me there though.
I've come up with a workaround for my issue, though I don't know if helps the original request here.
I've updated my server to add a control
modifier to the keyword
tokens that are for control. I've then overridden the semanticTokenScopes
in the VS Code extension to point keyword
back to keyword
(VS Code is defaulting to keyword.control
) and then keyword.control
to keyword.control
:
// package.json
"semanticTokenScopes": [
{
"language": "dart",
"scopes": {
"keyword": [
"keyword"
],
"keyword.control": [
"keyword.control"
]
}
}
],
This seems to work as required. I can colour my keywords and control keywords separate, like the textmate grammar does (removing the jarring change), and the semantic tokens are all good semantically and can be used/styled by other editors without any quirks (and without needing their own grammar).
If control
becomes an official modifier (or type), I'll move over to it.
@aeschli could you please follow up.
@aeschli friendly ping.
It's on the backlog...
https://github.com/microsoft/vscode-languageserver-node/blob/e5d7ad881b1a51b486e3f0e4aa0fbc25dad2be58/protocol/src/protocol.semanticTokens.proposed.ts#L18 < this appears to be the basis for the latest semantic highlighting draft, and I was asked I make a separate ticket for this:
I think some
import
/include
/require
token is super common (and while it might be a function in some languages it's a special dedicated keyword in many others) and therefore I suggest an "import" keyword is added to the default built-in list of the spec. Whether themes use it I don't really mind, but I just suggest it should be available for sending through the LSP serverthe modifier list IMHO should include an "other modifier" type. For any remotely unusual language there's bound to be modifiers not covered by this list, and it would be still very useful info for a language client's themeing to know it's a modifier (rather than whatever generic keyword)