dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.92k stars 4.02k forks source link

C# semanticTokens LSP message sometimes overlap #50384

Open ryanbrandenburg opened 3 years ago

ryanbrandenburg commented 3 years ago

Steps to Reproduce: On latest VS

  1. In a C# file paste the following
    var x = $"function(e) {{ recordPopup.menuItemClick(e, {Model.Id}, '{Model.DisplayMode.ToString()}'); }}""
  2. The Semantic Tokens for the escape chars ({{) overlap with the longer token for the complete string. VS, VSCode and the Razor LSP server do not support overlapping tokens and so might run into errors or unexpected behavior.

Expected Behavior: No overlapping tokens, express strings which are broken up by other tokens as multiple string tokens which do not contain things like {{.

Actual Behavior: Overlapping tokens provided regardless of Client settings.

Question: Are there other scenarios where this might happen?

CC @allisonchou who I know worked in this area before.

Fixes https://github.com/dotnet/aspnetcore/issues/29203.

P.S. The full Razor scenario we discovered this on is in the above issue, but I figured the string above would be easier to reproduce on your end and should also express the problem.

CyrusNajmabadi commented 3 years ago

Expected Behavior: No overlapping tokens, express strings which are broken up by other tokens as multiple string tokens which do not contain things like {{.

This feels like an unnecessarily burdensome onus on all languages. Is there a reason teh platform side of this can't just support this? It feels like the desirable thing given that many things in a language are 'layered' in this fashion (esp. comments/strings). For example, say the language says // look at http://whatever.com. But a link tagger classifies the http://whatever.com as a hyperlink. That should be supported, without all the languages and individual classifications having to coordinate to chunk the string up themselves.

NTaylorMullen commented 3 years ago

If you all feel this is important enough it may be a good candidate to discuss in the weekly LSP syncs /cc @dibarbet

allisonchou commented 3 years ago

Related: https://github.com/microsoft/language-server-protocol/issues/1077

allisonchou commented 3 years ago

cc @gundermanc, I remember discussing this issue with you previously. It looks like overlappingTokenSupport has been added to the spec since we last talked. Do you know if VS has defined this client capability or if not when it may be added?

gundermanc commented 3 years ago

Do you know if VS has defined this client capability or if not when it may be added?

Currently unsupported. If this is desired, can you work with @olegtk to get this scheduled?

olegtk commented 3 years ago

good timing, we just started planning. Please create a task and start internal mail thread to schedule it. Thanks!

allisonchou commented 3 years ago

Filed https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1265495 and sent out an email for scheduling.

sharwell commented 3 years ago

No overlapping tokens

This is fundamentally not how taggers work. Layering is a feature of tags which is intentionally used by our syntax and semantic tagging features. If a downstream feature needs a non-overlapping set of tags, that feature needs to aggregate the tags from all applicable sources, and perform it's own merging logic to produce non-overlapping tag spans.

gundermanc commented 3 years ago

This is fundamentally not how taggers work.

There's an important distinction to be made here between 1) LSP and 2) in-proc taggers. While taggers support graceful merging of tags from various sources, the the LSP spec for semantic tokens did not originally and was implemented as such. Correct handling of overlapping tokens in our implementation of LSP would be accidental at best.

@sharwell if you need this capability, please work with @olegtk and I to make sure it gets scheduled.

sharwell commented 3 years ago

I read this issue as applying to the semantic tagger. Reopening since it's specific to the semanticTokens LSP message.

olegtk commented 3 years ago

Well, support for overlapping and multiline tokens is on the list of potential 16.10 features as https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1265495 (we expect to finalize planning after 1/27), but my concern is potential conflict with LSP Client refactoring that is expected to block any LSP development until 16.10 P2 at least.