Language Server Kernel - Githubissues

bollwyvl commented 4 years ago

Elevator Pitch

Write a wrapper kernel that can act as the proxy between a(ll) language server(s), replacing custom websocket connection wrangling.

Motivation

Extending on #184, (and thinking about #237 which i have been able to reproduce when binder is getting thrashed), and some broader comments in the community... Lab 2 reworked sessions stuff so it is/tries to be even more robust in the face of degraded connections. Trying to get (and maintain) that level of functionality will be hard for us.

Maximizing reuse of the kernel infrastructure would:

get users all the buffering/reconnection/starting/stopping basically for free
give us access to better tooling for kernel messages than we'll be able to build for our custom websocket transport
obviate the need for some of the backend complexity (at least in the manager)
- though we'd need new complexity to handle dealing with "plain" kernels for the listeners, etc... unless all of that functionality could be moved into the kernel itself
obviate the need for the clunky (and huge) webpack build of lsp-ws-connection

Design Ideas

The LanguageServerKernel would be an ipykernel.kernelbase.Kernel subclass. It would (awkwardly) show up in the kernel listing (though we could we hide it with css... but it might be cool to interact with it, don't know).

Since we'd still need the serverextension enabled (somehow #257), it could enable the kernelspec pretty directly, rather than needing another file, as something like ilsp.

The frontend would start a KernelSession for ilsp. When it came back with its kernel_info, it might be able to give us all of the server listing stuff, and we'd be able to start talking to it. Presumably the starting/stopping of language servers could be some pithy JSON-schema-constrained language, rather than being "full" python.

Much of the existing stdio machinery would be reused, but would have basically zero chance of hanging the notebook server process, which is A Good Thing. The listener API would be... more complicated to maintain, but it can't all be roses.

As for actually integrating with the kernel messaging protocol: custom comms, used by widgets and vdom, already give us a way to add this kind of feature in a way that could be used by other kernels in the future, provided they had some way to "opt in" to receiving LSP messages.

krassowski commented 4 years ago

Interesting idea. This could work if the overhead is small enough. I would see this as "one kernel to rule them all" (at least for now, to make the transition easier).

bollwyvl commented 4 years ago

Well, because windows, would probably still need threads, even for a single server.

I don't think the overhead will be that large, as kernels are pretty usable today, but preserving the listeners might be complicated.

At any rate, I don't have time to work on it right, right now, but think it would put us in a better spot for eventual upstreaming. It also wouldn't add any dependencies!

By tightly coupling to kernels, I guess we'd lose the ability to deploy a "read only" lsp-enabled site, but LSIF is better for that anyway, and not all the language servers are "safe" either.

On Wed, May 6, 2020, 00:11 Michał Krassowski notifications@github.com wrote:

Interesting idea. This could work if the overhead is small enough. I would see this as "one kernel to rule them all" (at least for now, to make the transition easier).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/krassowski/jupyterlab-lsp/issues/268#issuecomment-624431698, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAALCRCJRVJUTONKET4ACY3RQDPODANCNFSM4M2BMDMQ .

bollwyvl commented 4 years ago

Quick update: I've taken the first steps at creating such a kernel:

https://github.com/bollwyvl/jupyterlab-lsp/blob/add-language-server-kernel/py_src/jupyter_lsp/kernel/kernel.py

The good parts:

the LanguageServerManager basically Just Works, and is delivering messages
aside from having an lsp_manager, it quacks like a normal ipython kernel
- you can open up a notebook or a console and connect it to the kernel the extension uses and see what's going on it there... much better than printing stuff inside a websocket

What it does do, thus far:

instead of bespoke websockets, uses Jupyter comms (much like widgets, bokeh, vdom, and a few other things)
handles fetching available sessions

What it doesn't do, yet:

actually service the ConnectionManager
read everything off config properly (much of that was being handled by the notebook server config)

It's going to take a while, but this is probably going to work. :crossed_fingers:

bollwyvl commented 4 years ago

So I've gotten a bit of a chance to think about the comm-based connection.

Looking at what ws-connection provides us:

a more type-aware API than a raw MessageConnection
an event system, based on eventEmitter
some defaults for initialization
some syntactic sugar to turn convenience types into LSP types, e.g. ch -> character

while connection:

adds yet more sugar
some file buffering

I think I'm leaning towards:

CommRPC
- owns the IComm object (comms have one-and-only-one onMsg)
- an implementation of the client -> server (-> client)
- a Map<message-id, promise-of-response>, resolve when the response message-id comes back
- a generic implementation of server -> client (-> server)
- a Map<method-name, handler> which can handle the response and reply
  - we currently only use initialize and textDocument/publishDiagnostics, which don't reply to the server, but
  - the handlers will probably just emit signals
CommLSP extends CommRPC
- implement "raw" LSP signatures for the whole spec
LSPConnection extends CommLSP
- handle our 17 uses of sendRequest/sendNotification, preserving the idiosyncracies we've accreted to minimize the amount of changes outside of the connection stuff
  - but eventually get rid of this layer altogether, and just use "raw" LSP all the way up at the feature/document level

the pay-off i'm seeing:

no websocket concerns
- it will work as well as kernels/widgets work, which core is pretty invested in
get rid of a lot of runtime deps
- events
- vscode-*
  - vscode-languageserver-protocol will still be around at development time, for types
get rid of a lot of development deps
- webpack
- i mean, not the lab one... looks like wp5 is going to deliver the goods for no-jupyter-lab-build-on-end-user's machines :tada:, but we'll still have to deal with it
- karma
better debuggability, as well as better types
a better path forward for adding support for more messages
- basically, I want to do/automate the work once, in one file, such that we are able to write
```
await response = connection.send(SOME_METHOD_NAME, params)
```
- ...and have type inference do all the heavy lifting
the sticky point, of all things, is the actual mapping of method names to response/param types... I can't seem to find them in the type domain

other thoughts:

i'm keeping #184 and #236 in the back of my mind, but don't yet see an opportunity to bring this in: i think we'll end up having a VirtualConnection (why not!) that can aggregate multiple Connections of one sort or another, but that's just not going to happen on this go-around

stay tuned... nothing yet even worth pushing, but there is light at the end of the tunnel...

agoose77 commented 3 years ago

If performance is a concern, is this an area where the xeus kernel implementation might be of use?

bollwyvl commented 3 years ago

If performance is a concern, is this an area where the xeus kernel implementation might be of use?

Indeed, xeus-robot is already straddling the LSP/DAP/JKM divide, so if we can bring the comms-based approach home (#278 has certainly bitrotted some), there's a number of interesting things that could come out of a kernel registering itself for language server messages, and bypassing the proxy all together.

Performance is always a concern... but so is installation complexity, and ultimately maintainer sanity. At present, with the folks we have, we would likely demonstrate the MVP proxy kernel with #278's ipykernel-based implementation (since it can reuse nearly all the existing code) and would have effectively no extra dependencies.

How the comm messages work on the wire is already specified with a JSON schema, which could be re-implemented with the xeus machinery, and the two could be tested against each other (and the spec) to ensure conformance.

My biggest concern, however, is the virtual file re-writing, which is tricky stuff that I don't fully understand.

krassowski commented 3 years ago

I think that we need to have a look at how debugger handles files and comms and reconsider what will be the best approach.

October is very busy for me so well be back at it in November.

On Thu, 8 Oct 2020, 13:36 Nicholas Bollweg, notifications@github.com wrote:

If performance is a concern, is this an area where the xeus kernel implementation might be of use?

Indeed, xeus-robot is already straddling the LSP/DAP/JKM divide, so if we can bring the comms-based approach home (#278 https://github.com/krassowski/jupyterlab-lsp/pull/278 has certainly bitrotted some), there's a number of interesting things that could come out of a kernel registering itself for language server messages, and bypassing the proxy all together.

Performance is always a concern... but so is installation complexity, and ultimately maintainer sanity. At present, with the folks we have, we would likely demonstrate the MVP proxy kernel with #278 https://github.com/krassowski/jupyterlab-lsp/pull/278's ipykernel-based implementation (since it can reuse nearly all the existing code) and would have effectively no extra dependencies.

How the comm messages work on the wire is already specified with a JSON schema, which could be re-implemented with the xeus machinery, and the two could be tested against each other (and the spec) to ensure conformance.

My biggest concern, however, is the virtual file re-writing, which is tricky stuff that I don't fully understand.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/krassowski/jupyterlab-lsp/issues/268#issuecomment-705538547, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABMQBRTE4XSBZYZTMK2EZE3SJWW4TANCNFSM4M2BMDMQ .

bollwyvl commented 3 years ago

how debugger handles files

As mentioned elsewhere, DAP and LSP don't agree exactly on the definition of a file... :shrug:?

and comms

Supporting the debugger the way it was done is via a change to the kernel message protocol... might have been a new channel (vs e.g. iopub), which is very seldom changed, and few kernels support it, while most "full" kernels that might want to support LSP already support comms, because widgets are :moneybag: .

In light of the above, it doesn't bother with comms at all, to my knowledge. xeus-python does fully support comms, but, for example, xeus-robot or sqlite does not.

For our purposes, while supporting languages that do have kernels contributes a lot of complexity, we're dealing with a lot of languages that don't have kernels, so we're still going to need something for those language servers. I am still of the mind that the comm-based approach fits more cleanly into the existing architecture and ecosystem vs. our existing custom websocket/REST stuff. So having a good spec for what a LSP-over-comm.

October is very busy for me so well be back at it in November.

No worries, I think once Lab 3 drops and has a few shake-out releases, we're really going to want to get on that bus. pip install jupyterlab-lsp, and getting the labextensions (as, e.g. jupyterlab-lsp-labextension), and serverextension jupyter_lsp (or kernel as ipylsp or , eventually,xeus-lsp) is a lot nicer. For things that can be in the browser, my real hope is we'll be able to support pip install jupyterlab-lsp-<lang>-labextension, as well, to really get nodejs out of the picture for users that don't want it (me, and most package managers).

jupyter-lsp / jupyterlab-lsp

Language Server Kernel #268

Elevator Pitch

Motivation

Design Ideas