microsoft / language-server-protocol

Defines a common protocol for language servers.
https://microsoft.github.io/language-server-protocol/
Creative Commons Attribution 4.0 International
11.1k stars 777 forks source link

Clarify requirements for PartialResults #1336

Open ryanbrandenburg opened 3 years ago

ryanbrandenburg commented 3 years ago

It is unclear to me what the requirements are for the data coming in from PartialResults, particularly in SemanticTokens requests.

For example, imagine that we call textDocument/semanticTokens/full against Roslyn on a very large file/project right after startup. Would it be "legal" for Roslyn to reply with a "first-pass" attempt at Tokens which provides them from the start to the end of the file, but provides a partialResultToken which could be checked for a second-pass with more accurate results, covering the same area? If yes that implies that the array of semanticToken int's would be relative to the start of the document for each time the partialResult is queried, and that partialResults may invalidate results from previous partials. If that is NOT allowed would workspace/semanticTokens/refresh be the only way to get this "second-pass"? If it is that has significant implications for imbedded scenarios such as Razor and what they can and cannot cache.

dbaeumer commented 3 years ago

Partial results are not meant to override previous partial results in the same request. Partial results are conceptually chunked data.

Currently the only way to refresh is to call workspace/semanticTokens/refresh.

What a server can do is to cancel the request and ask the client to resend it if it is not ready to compute the result yet. So a server could postpone the computation until it is ready.

If necessary we can think about a model where a server can provide a result and in additional ask for a second request.

ryanbrandenburg commented 3 years ago

Understood, and fair that partialResult is meant only for chunking. This does get a bit difficult for embedded language scenarios though. I'll describe the scenarios we're dealing with and hopefully that will stir up some solutions.

Take a Razor SemanticTokens request at startup, today Razor makes a request against the Roslyn LSP for the semantic tokens for it's "backing" document, Roslyn replies (after quite a while because everything is just starting up and it didn't have the full context computed yet), Razor combines that reply with the Razor results and stores it in a cache. Then for any future edit requests Razor can reply without contacting Roslyn again so long as the project/document state hasn't changed. The problem being the large delay from Roslyn. Roslyn has existing functionality for providing fast "first pass" responses when the project state is not fully computed, but has no way (without going off LSP spec) to indicate to Razor not to cache this result and to try again next time to see if a new/better pass is available. Our options then become:

  1. Razor never caches C# response and must query each time to see if it has changed. This is a big computational and network cost to pay for what is largely a startup (or if you've just made large changes to your C# world quickly) problem, but it is the one solution we could come up with that doesn't require us to go off-spec.
  2. Roslyn replies to the SemanticToken request "off-spec" with an additional value indicating if this was a "incomplete pass" (and thus Roslyn should be re-queried until the pass is complete). This is the solution we're currently discussing, and it's a relatively simple fix for our scenarios but it's off-spec for LSP and perhaps too niche to be brought into the spec.
  3. C# uses the workspace/semanticTokens/refresh method to request SemanticTokens refresh. The client then sends out new textDocument/semanticTokens/* requests, but Razor doesn't know that it ought to dispose the cache (since the requests look the same as all the rest), so it either needs to intercept the workspace/semanticTokens/refresh request from Roslyn, or have some special cache-clearing endpoint that Roslyn can cause to be called. Again, we've gone way off-spec with method intercepts and/or custom endpoints.
dbaeumer commented 3 years ago

@ryanbrandenburg thanks for the explanation. Your option (2) is basically what I meant with

If necessary we can think about a model where a server can provide a result and in additional ask for a second request.

So I am actually open to add this to the spec since it is a fair use case. However we need to guard this using a client capability since it is a new feature.

Do you want to give this a try and make a PR ?

ryanbrandenburg commented 3 years ago

I'm down to give it a shot.

Is there anywhere aside from SemanticTokens that it makes sense to have this "Please ask again soon" behavior? I THINK it's only here because (to my knowledge) SemanticTokens is the only area where the responses are relative rather than self-contained so partialResult could be appropriate. But then again, if any of those areas wanted to invalidate a response in the "first-pass" we're back in "Please ask again soon" territory. Perhaps it's best to leave decisions on similar changes for other areas to those with a direct need and area knowledge.

dbaeumer commented 3 years ago

@ryanbrandenburg I would only put this into semantic tokens, but in a way that we can 'copy' it to other requests as well (naming should be semantic token independent)