openlawlibrary / pygls

A pythonic generic language server
https://pygls.readthedocs.io/en/latest/
Apache License 2.0
584 stars 103 forks source link

Debounce and caching #283

Open tombh opened 2 years ago

tombh commented 2 years ago

Just wondering out loud whether debouncing and caching is something that Pygls could support?

The classic example is completion requests, they're potentially sent on every keystroke, which, depending on the implementation could be quite expensive, even before considering a separate completionItem/resolve step. One solution is to debounce the requests, say for a few 100ms, but that still requires sending something back to the client for each request, hence the need for also providing a cache.

Debouncing and caching aren't the hardest things in the world, but still they can be a bit fiddly to get all the little details right. What's more, each LSP feature has its own debounce and caching requirements. Say for instance, diagnostic requests don't need an immediate response, so don't need caching, but they should defer the most recent request to the future in order to be published after the debounce limit. Formatters probably don't need either debounce or caching. And so on.

So could Pygls itself be a good place to at least offer APIs and recommended approaches?

llllvvuu commented 1 year ago

A nice API for this could be via the decorator:

@ls.feature(TEXT_DOCUMENT_DID_CHANGE, debounce_interval=100)
def did_change(params: DidChangeTextDocumentParams):
   ...

On the other hand, one could make the argument for making these separate utilities:

@ls.feature(TEXT_DOCUMENT_DID_CHANGE)
@debounce_and_throttle(100, 100)
@swr(...)
def did_change(params: DidChangeTextDocumentParams):

Actually it's interesting that there isn't any lodash of Python that could be a defacto standard for this kind of stuff. Maybe Python's package system is less amenable to micro-libraries.

IDK about caching, maybe it can be up to the client side to decide to show stale content. The server can just return an error.

Note that clients often do their own debouncing; for example Neovim defaults to 150ms: https://neovim.io/doc/user/lsp.html#:~:text=debounce_text_changes

So possibly server-side debounce interval should be configurable if the user doesn't feel double debounce is necessary.

tombh commented 1 year ago

I like the dedicated decorator idea.

You had more comments before about caching I think? My assumption is that without caching the editor might flash diagnostics on and off. If the editor receives an empty diagnostics lists then it assumes that everything was fixed right?

Ah yes, I didn't think about how editors might have their own debouncing, good point.

So possibly server-side debounce interval should be configurable if the user doesn't feel double debounce is necessary.

Do you mean some other config apart from the decorator args in @debounce_and_throttle(100, 100)?

llllvvuu commented 1 year ago

If the editor receives an empty diagnostics lists then it assumes that everything was fixed right?

For diagnostics specifically, those are one-way notifications rather than request/response. So the server can just not send a notification. In fact pygls handler for textDocument/didChange expects None I believe. For request/response like textDocument/completion indeed returning an empty list would be problematic. You'd need to return an error code instead. Although LSP might not have a dedicated error code, so maybe caching is fine. Another thing one could try is, instead of responding to request N with request N-1's return value, respond to request N with request N+1's return value.

Do you mean some other config apart from the decorator args in @debounce_and_throttle(100, 100)?

Nah you're right, just passing a variable in instead of hardcoded 100 works. That's what I ended up doing.

llllvvuu commented 1 year ago

Ah yes, I didn't think about how editors might have their own debouncing, good point.

Yeah we might be in uncharted territory here, I suspect language servers tend to debounce diagnostic notifications and file watching/indexing but not request/response, assuming that the client side will handle this / the server will never get ddos'ed / all the request/response-based methods are fast enough to not need it (completions might actually be the only case one could argue and I don't know of any language servers that don't just compute/serve every completion)

llllvvuu commented 1 year ago

Another reason to have it be a separate utility is that in many cases you probably want to debounce some subroutine and not the handler itself:

@ls.feature(TEXT_DOCUMENT_DID_CHANGE)
def did_change(params: DidChangeTextDocumentParams):
   update_index(params.textDocument)  # must be fresh
   debounce(0.2)(validate(params.textDocument))  # slower / fine to skip

or if validate should always be debounced:

@ls.feature(TEXT_DOCUMENT_DID_CHANGE)
def did_change(params: DidChangeTextDocumentParams):
   update_index(params.textDocument)  # must be fresh
   validate(params.textDocument)

@debounce(0.2)  # slower / fine to skip
def validate(textDocument):
  ...
tombh commented 1 year ago

For diagnostics specifically, those are one-way notifications rather than request/response. So the server can just not send a notification.

Ah yes, of course.

instead of responding to request N with request N-1's return value, respond to request N with request N+1's return value

Such an elegant tweak, nice.

Your Fixit project looks interesting. I've been working on a framework too for making it easier to write custom LSP functionality, here's my debounce code: https://github.com/tombh/super-glass-lsp/blob/main/super_glass_lsp/lsp/custom/features/_debounce.py It's been a while since I worked on it, and I notice that I do debounce formatting requests, even though in my first comment above I suggest that formatters don't need debounce, and you're saying that completion requests are probably the only thing that needs debouncing. Thinking about it now, I do feel like debounce is useful for formatting requests. Can you think of a reason why it isn't?

assuming that the client side will handle this / the server will never get ddos'ed

One of the reasons I'd like to include debouncing in Pygls is that it provides a mechanism for server authors to be both prompted to think about debouncing where it's relevant (eg; we could provide a default @debounce(0) on relevant features) and to encourage a way to provide a recommended debounce time that's tailored to the specific tool/algorithm generating the response. The latter being something that a generic client-side debounce wouldn't know about.

@debounce(0.2)
def validate(textDocument):

Yeah that's exactly how I'd imagine it.