Advise on closer jupyterlab-lsp integration with Jupyter(Lab)

jupyterlab / frontends-team-compass

A repository for team interaction, syncing, and handling meeting notes across the JupyterLab ecosystem.

https://jupyterlab-team-compass.readthedocs.io/en/latest/

BSD 3-Clause "New" or "Revised" License

Advise on closer jupyterlab-lsp integration with Jupyter(Lab) #67

Open krassowski opened 4 years ago

krassowski commented 4 years ago

An overview of jupyterlab-lsp

jupyterlab-lsp is a project aiming to provide full language support to the Jupyter projects by implementing the Language Server Protocol (LSP), which was developed by Microsoft for VSCode to address the challenge of supporting multiple language-specific features in an editor-agnostic way.

jupyterlab-lsp is a meta-repo composed of three frontend packages and one Python backend package:

- [jupyterlab-lsp](https://github.com/krassowski/jupyterlab-lsp/tree/master/packages/jupyterlab-lsp) is the frontend plugin to JupyterLab responsible for:
  - implementing the CodeMirror features,
    - was originally based on the [wylieconlon/lsp-editor-adapter](https://github.com/wylieconlon/lsp-editor-adapter), but has since been rewritten to better fit the JupyterLab integration needs.
  - statusbar,
  - diagnostics panel,
  - integration with JupyterLab.
  - transparent notebook handling: notebook code cells are transparently collapsed into a single continuous source code ("virtual source") by the frontend and this is presented to the language server on the backend, as the LSP protocol does not support notebooks natively.
- [jupyterlab-go-to-definition](https://github.com/krassowski/jupyterlab-lsp/tree/master/packages/jupyterlab-go-to-definition) is an older project of mine which provides in-browser static code analysis for a faster jump-to-definition action (if the language server is not available), but more importantly the actual jump implementation. It is planned that the static-analysis part will be stripped out in the future and only the core functionality of "jumping" will remain in the meta-repo.
- [lsp-ws-connection](https://github.com/krassowski/jupyterlab-lsp/tree/master/packages/lsp-ws-connection) is a fork of [wylieconlon/lsp-editor-adapter](https://github.com/wylieconlon/lsp-editor-adapter) stripped of the CodeMirror integration part, used only to handle the websocket connections. A recent proof-of-concept work by @bollwyvl demonstrates that we could remove most of it in favour of [implementing the communication via a custom kernel](https://github.com/krassowski/jupyterlab-lsp/issues/268). This is however still in early stages.
- [jupyter-lsp](https://github.com/krassowski/jupyterlab-lsp/tree/master/py_src/jupyter_lsp) is a Python package providing a serverextension that:
  - spawns and communicates with various language servers;
  - maintains a virtual shadow system which keeps the collapsed source code of notebooks on the disk, as too many servers do not follow the LSP specification and require a hard copy of the edited file to be on the disk for certain features to work;
  - allows for adoption of the LSP by other frontends of Jupyter.

We also plan on making the extension more modular in the future (help wanted).
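To make the "virtual source" idea above concrete, here is a minimal sketch of collapsing notebook cells into one continuous document while remembering which cell each line came from. It is an illustration only, not the jupyterlab-lsp implementation: the `VirtualDocument` class, the `collapse` helper, and the choice of two blank padding lines between cells are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VirtualDocument:
    """Hypothetical container: concatenated cell sources plus a line map."""

    lines: List[str] = field(default_factory=list)
    # For every virtual line: (cell index, line index inside that cell).
    # Padding lines between cells are recorded as (-1, -1).
    origin: List[Tuple[int, int]] = field(default_factory=list)

    def add_cell(self, cell_index: int, source: str, padding: int = 2) -> None:
        # Separate cells with blank lines (assumed padding of two lines).
        if self.lines:
            for _ in range(padding):
                self.lines.append("")
                self.origin.append((-1, -1))
        for line_index, line in enumerate(source.split("\n")):
            self.lines.append(line)
            self.origin.append((cell_index, line_index))

    @property
    def source(self) -> str:
        # The single continuous text a language server would receive.
        return "\n".join(self.lines)

    def to_cell_position(self, virtual_line: int) -> Tuple[int, int]:
        # Map a line reported by the server back to (cell, line in cell).
        return self.origin[virtual_line]


def collapse(cells: List[str]) -> VirtualDocument:
    document = VirtualDocument()
    for index, source in enumerate(cells):
        document.add_cell(index, source)
    return document


doc = collapse(["import math", "x = math.pi\nprint(x)"])
```

With such a map, a diagnostic that a server reports for a line of the virtual source can be translated back to a cell and a line within it via `doc.to_cell_position(...)`.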

The corresponding issue in our repo is available here: Orbiting closer to Jupyter #238

Proposed scenarios/actions

I would like to ask the wider community here to provide feedback on the following options:

1) Move jupyterlab-lsp repository into the JupyterLab

Pros:

would live closer to the JupyterLab
would signal a community effort and potentially encourage more contributions

Cons:

the codebase is complex and has (too?) many dependencies - it may need a brave and experienced reviewer

1b) In future, incorporate parts of jupyterlab-lsp to the JupyterLab core?

may require stripping some things away; for example, the tests with R language would slow down the CI substantially

2) Move it to another satellites org

See https://github.com/jupyterlab/team-compass/issues/52.

3) Propose it as an official subproject of Jupyter

As it fulfills wider Jupyter-ecosystem need and can support multiple frontends.

Given not-yet-mature state, we could live in the Jupyter incubator for some time.

4) Move it to a dedicated organisation, but not as an official sub-project

The new organisation could be called JupyterLSP.

Pros:

signals a serious intent for the project to be a collaboration and not a hobby/private side-project
easy to add more repositories if needed

Cons:

is not official
may confuse users and potential contributors

Next post with Criteria for official Subprojects will follow.

krassowski commented 4 years ago

Criteria for official Subprojects

Have an active developer community that offers a sustainable model for future development.

I (@krassowski) and Nick (@bollwyvl) are the main developers of the project. We welcome new contributions and contributors, with a total of four other developers having committed to master in the past, and major patches from two more awaiting merge:

- settings editor by @trajamsmith, awaiting resolution of a third-party language server issue, and
- migration to ICompletionItems by @edzkite, awaiting the JupyterLab 2.2 or 3.0 release.

Speaking for myself, I have a limited time to offer given the reality of academia; I am juggling priorities and trying to find which commitment will be the most beneficial for the research I am engaged in and for the other scientists who may use the tools that I contribute to. As I decided to use JupyterLab as my major IDE for my PhD (which is working well), I will most likely remain committed to supporting this extension for the next 3+ years.

As I no longer work as a software engineer, my choices often prioritise having something work well enough now rather than it being perfect in a month's time. This is where @bollwyvl's expertise and experience are helping - I think that the compromises that we work out are often a good way forward.

Have an active user community.

I do not know how you assess that, but we have 152 submitted issues, 28 forks and 436 stargazers at the time of writing.

Use solid software engineering with documentation and tests hosted with appropriate technologies (Read The Docs and Travis are examples of technologies that can be used).

- Travis: tests jupyter-lsp for various Python versions and operating systems
- Read The Docs: is built from markdown documents and Jupyter notebooks; the list of supported language servers is auto-generated.
- Azure Pipelines (example report):
  - test the frontend using acceptance tests in Robot Framework with the selenium driver,
  - run unit tests (Jest and pytest),
  - ensure linting of the codebase (prettier, pep8 etc),
  - test packaging and docs

More on the current architecture in docs.

Demonstrate continued growth and development.

We are certainly getting more and more users, but we have slowed down on adding new features in order to focus on stability. A big challenge for me is to stay up to date with new JupyterLab releases.

Integrate well with other official Subprojects.

jupyterlab-lsp is something of a competitor to the now-dead jupyterlab-monaco (last commit 2 years ago).
jupyter-lsp is frontend-agnostic and can be used by other Jupyter frontends/subprojects.

Be developed according to the Jupyter governance and contribution model.

Not yet there. But we do:

have a nice CONTRIBUTING.md document
point to the Jupyter Code of Conduct
use BSD-3-clause licence

Have a well-defined scope.

We aim to provide the full Language Server Protocol support to JupyterLab, including:

standalone documents in the file editor
notebooks
code cells with a different language than the main language of the notebook (e.g. %%javascript, %%bash of IPython or %%R rpy2)
a GUI for the components related to the LSP features (diagnostics panel/refactor window etc).
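As a rough illustration of the mixed-language case in the list above (cells starting with `%%javascript`, `%%bash` or `%%R`), the sketch below separates the cell magic line from the foreign code so that the remainder could be routed to a different language server. The regular expression, the `MAGIC_LANGUAGES` table and the `extract_foreign_code` helper are hypothetical names chosen for this example; the actual extension may handle magics quite differently.

```python
import re
from typing import Optional, Tuple

# Matches the first line of a cell when it is a cell magic such as "%%R -i df".
CELL_MAGIC = re.compile(r"^%%(\w+)[^\n]*\n?")

# Assumed mapping from magic name to the language of the cell body.
MAGIC_LANGUAGES = {"javascript": "javascript", "bash": "bash", "R": "r"}


def extract_foreign_code(cell_source: str) -> Optional[Tuple[str, str]]:
    """Return (language, code without the magic line), or None for host-language cells."""
    match = CELL_MAGIC.match(cell_source)
    if match is None:
        return None
    language = MAGIC_LANGUAGES.get(match.group(1))
    if language is None:
        # Unknown magic: leave the cell to the notebook's main language.
        return None
    return language, cell_source[match.end():]


r_cell = extract_foreign_code("%%R -i df\nsummary(df)")
bash_cell = extract_foreign_code("%%bash\nls")
python_cell = extract_foreign_code("x = 1")
```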

The scope of jupyterlab-lsp excludes:

debugger
support for editors other than the main JupyterLab editor
creation and support of custom language servers

Be packaged using appropriate technologies such as pip, conda, npm, bower, docker, etc.

jupyter-lsp is on PyPI,
the TypeScript packages (extensions) are on npm
@fcollonval proposed a conda recipe for conda-forge back in April, but the conda-forge team seems to have a backlog to handle before accepting it: https://github.com/conda-forge/staged-recipes/pull/11280

echarles commented 4 years ago

I have been using jupyterlab-lsp for a few weeks and I am very impressed with the added features it brings to the end user. As a developer, I am a bit intrigued by the additional websocket channel and the additional .virtual_documents it creates.

End-user adoption is rising, judging by the npm download stats.

I see this extension as being a key value for jupyterlab, hence I like option 1) Move jupyterlab-lsp repository into the JupyterLab with potentially later 1b) Incorporate parts of jupyterlab-lsp to the JupyterLab core.

choldgraf commented 4 years ago

I think this is a fantastic idea, and I really like the idea of splitting this into a JupyterLab-specific thing, and a Jupyter core thing. Once we get LSP support in Jupyter then we can also add support for the flavor of markdown that Jupyter Book uses :-)

For the Jupyter-general pieces of the LSP, does it make sense to create a JEP for this to discuss across the community? As you say, I could see this being useful for a number of different projects.

Or alternatively, perhaps the issue of "how to incorporate jupyterlab-lsp into jupyter" could also be a JEP, so that discussion happens across the broader community instead of just inside jupyterlab?

krassowski commented 4 years ago

Thank you for the feedback! If there is consensus on a JEP, I could open one. I have just realised that I should have referenced:

JEP 26 https://github.com/jupyter/enhancement-proposals/pull/26, and
this issue: https://github.com/jupyterlab/jupyterlab/issues/2163

Alternatively, I could just comment on the JEP 26 PR to invite subscribers to comment here.

bollwyvl commented 4 years ago

Thanks for moving this forward @krassowski, and to those above for the encouragement. Sorry for the somewhat out-of-order replies below, I'm pretty late to the party:

My motivating use case for this project has been:

As a developer, after conda installing one package (or pip and get-node-somehow), be able to perform a full, end-to-end PR for a Lab without leaving Lab.

For me, this will be easiest to use and teach if it is merged into core :grinning:, so I can tell someone conda install jupyterlab-developer. Whatever it takes to get there, I support.

perfect in a month's time

I'll take the jab on this one :grimacing: To our credit, we have created a number of upstream PRs to fix things in language servers, packaging of language servers, and other more esoteric things. This is a pretty big yak.

limited time to offer given the reality

... of, well, reality, for me. Aside from all The Things going on right now, adding LSP to Jupyter is just a rather large goal, where the tolerance by the end user is (or should be) very low for not-exactly-working. We're doing what we can, given we don't control the distribution channel, as VSCode does.

R language would slow down the CI substantially

I may have injected that thought, but really: the core goals of Jupyter are probably best served by the flagship UI expanding its full integration test coverage... and we've already done the work, and have maintainers/users for whom it is important. This could be done on merges to master, for example, to lessen the burden. I am all-in on Robot Framework, despite its quirks, and doubled down on testing against Firefox, which is somewhat at odds with the puppeteer approach in core.

remain committed to supporting this extension for the next 3+ years

I can't necessarily make that claim, but will be doing so to the best of my ability, given the amount of effort and learning I've had to sink into this domain. I desperately want to be able to ship these features for future work and open source projects, both of which would strongly motivate my continued involvement... but it's gotta Just Work. Again, we're trying.

JEP 26

Given a gentle, but sadly non-poem, nudge by @ivanov, I've been pushing forward with one of the ideas we proposed, namely a Language Server Kernel. It replaces our custom REST and WebSockets with kernel comms. PR forthcoming... there's a lot of red on the diff, and I want it passing our rather grueling tests/CI before pushing.

Anyhow, this is really closer to fulfilling JEP 26, as it proposes reuse of the Jupyter kernel spec and machinery (and all the hard work that went into making kernel connections robust in the face of adversity). The approach I have taken uses Comms (like e.g. widgets), rather than a new message type, as was chosen for debug. At present, this one kernel serves all the local language servers (with two threads per server for IO), but paves the way for an existing Kernel to add LSP.

To the future! :rocket:

choldgraf commented 4 years ago

Ah and I didn't realize there was already a JEP for this, so I'm +1 on using that JEP to continue conversations etc, and I leave it to you all to decide where the right place for discussion is - whatever ensures that we have input from voices across the community!

MSeal commented 4 years ago

3) Propose it as an official subproject of Jupyter As it fulfills wider Jupyter-ecosystem need and can support multiple frontends.

Given not-yet-mature state, we could live in the Jupyter incubator for some time.

As someone who maintains a lot of the shared code for lower level Jupyter libraries, I'd prefer we make more of the fundamental changes in Jupyter core libraries and not in JupyterLab extensions exclusively when it makes sense. If too many execution related things become JupyterLab specific they create gulfs for other clients (e.g. nbclient, nteract, the many many closed source interfaces) that divide up the ecosystem into working in some places but not the rest. That being said, I'd be happiest with a pattern that has a dedicated backend handler in /jupyter (or perhaps /nteract where more experimental changes are tried out?). Happy to discuss places that could be made aware of new communication protocols.

From reading the related JEP just now, I think this was discussed as a desired outcome. I'll post more directed responses there, but is there more from this issue that would need to be translated over? Seems there's more detail here.

bollwyvl commented 4 years ago

@MSeal very sound concerns!

fundamental changes in Jupyter core libraries

Our currently released server-side approach (e.g. LSP-Proxy-over-bespoke-websockets-and-REST) makes substantial use of existing core Jupyter libraries, and could be run in any single-user notebook server. It follows the manager/handler pattern, leans heavily on traitlets, jupyter_paths, jupyter_core, and notebook, and is mostly a process monitor and dumb JSON pipe... unless it has to be otherwise. For example, we intercept textDocument/didSave to support ipynb in language servers that don't support virtual documents. It's also documented to heck and back, and validates its own (but not LSP) messages with JSON schema, which are compiled to typescript types.
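For readers unfamiliar with what a "dumb JSON pipe" to a language server has to carry: the LSP base protocol frames each JSON-RPC message with a `Content-Length` header followed by a blank line. The sketch below shows that framing over a binary stream. It is a minimal illustration under that assumption, not code from jupyter-lsp; `write_message` and `read_message` are hypothetical helper names, and an in-memory `io.BytesIO` stands in for a real server's stdin/stdout.

```python
import io
import json
from typing import BinaryIO, Optional


def write_message(stream: BinaryIO, message: dict) -> None:
    """Frame a JSON-RPC message with the LSP Content-Length header."""
    body = json.dumps(message).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    stream.write(header + body)


def read_message(stream: BinaryIO) -> Optional[dict]:
    """Read one framed message; return None if the stream has ended."""
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # end of stream before a complete header
        if line == b"\r\n":
            break  # a blank line terminates the header section
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    # Assumes a buffered stream whose read(n) returns n bytes unless EOF.
    return json.loads(stream.read(content_length).decode("utf-8"))


# Round trip through an in-memory buffer standing in for a server pipe.
pipe = io.BytesIO()
request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
write_message(pipe, request)
pipe.seek(0)
echoed = read_message(pipe)
```

A proxy that only relays messages can stop at this framing layer and never needs to interpret `method` or `params`, except for special cases like the `textDocument/didSave` interception mentioned above.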

The PoC LSP-Proxy-over-Kernel-Comms approach (binder works, some Windows CI hiccups) was a bit hasty (but again, see working binder), but it is moving in an even more core-forward direction. It keeps the entire manager and removes the dependency on notebook (save for config.d-style loading), while adding an ipykernel dependency and two comm targets, but really could be written atop any kernel.

But more importantly, a Kernel being able to answer as both a kernel and one or more language servers is probably the way forward. Indeed, that idea should potentially be a separate, counterpoint JEP to 26. As a reference implementation of a dumb pipe, it doesn't care about Lab, Monaco, or even web browsers, and would amount to a PR to jupyter_client's spec document:

jupyter.lsp
-----------
The comm target jupyter.lsp is reserved for Language Server Protocol messages.
Kernels MAY choose to implement this spec. If so, the implementation MUST
support messages ``initialize``, ...
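To sketch what carrying LSP over comms could look like on the wire: in the Jupyter messaging protocol, a `comm_open` message has a content dict with `comm_id`, `target_name` and `data`, and a `comm_msg` has `comm_id` and `data`. The example below builds such content dicts for the `jupyter.lsp` target named in the draft above. How the LSP payload is nested inside `data` is not specified in this thread, so placing the JSON-RPC message directly in `data`, and the `language_server` key, are assumptions for illustration only.

```python
import uuid

# Comm target name taken from the spec draft above.
LSP_COMM_TARGET = "jupyter.lsp"


def comm_open_content(comm_id: str, language_server: str) -> dict:
    """Content of a comm_open message; the "language_server" key is hypothetical."""
    return {
        "comm_id": comm_id,
        "target_name": LSP_COMM_TARGET,
        "data": {"language_server": language_server},
    }


def comm_msg_content(comm_id: str, lsp_message: dict) -> dict:
    """Content of a comm_msg carrying one LSP JSON-RPC message (assumed layout)."""
    return {"comm_id": comm_id, "data": lsp_message}


def unwrap_lsp_message(content: dict) -> dict:
    """Recover the LSP message on the receiving side."""
    return content["data"]


comm_id = uuid.uuid4().hex
opened = comm_open_content(comm_id, "pyls")
initialize = {"jsonrpc": "2.0", "id": 0, "method": "initialize", "params": {}}
wrapped = comm_msg_content(comm_id, initialize)
```

Because the comm layer treats `data` as opaque JSON, the kernel side can remain a plain relay between the comm and the language server process.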

gulfs for other clients

To get to a PoC at all to be having this discussion, we had to build somewhere, and JupyterLab was a) built to be user-extended, and b) supports multiple documents on the screen at once, making it a good fit for the work invested.

The kernel-based approach should Just Work to get the 120+ flavors of LSP messages to a spec-compliant Kernel Comm object, which I suspect all "full" Jupyter GUI clients already have (to support widgets).

I don't see those 120+ messages changing much to suit the needs of Jupyter: if anything, LSP will just consume the use cases from the messaging spec, but camel case it and add a bunch of opaque enums or something so they won't be compatible with any Jupyter clients OR servers. For that symmetric concern, trying to shoe-horn textDocument/completion into completion_request is probably not worth the effort. So there's not a lot that existing backend machinery can (or probably should) do, aside from being the best possible pipe.

So that leaves each client now having to deal with 6x as many messages as before, with (best case) some of them handled by an (optimally) first-party CodeMirror plugin. From a Jupyter perspective, this should be the right way to go. Perhaps in the CodeMirror 6 era (which Jupyter could be helping more to hasten), cooperation at the editor level will make the most sense. Oh yeah, and ProseMirror. We need that. Bad.

Anyhow, the work on jupyterlab-lsp owes a lot to @wylieconlon's lsp-editor-adapter, even if the kernel approach sheds its fork. But, to get to a PoC, it combined many opinions in a single package: we've rebuilt its guts twice, and, as it reuses code from vscode, is non-trivial to build, to boot: we still have to webpack it to get around nodejs hangover packages.

But: not everything in a language server client happens inside a line/character grid. Heck, maybe we should just treat a language server as a widget, which can advertise linked UIs (e.g. i can give you a sortable table of diagnostic listings). Even assuming that, I think contributors to individual clients are still going to have to pull a fair amount of weight, PR by PR, to support as much of the LSP as is important to their audience's use cases.

MSeal commented 4 years ago

Thanks for posting the extra details. That PR is a lot of code to grok so I'm still playing catch up on what it's doing. What that does tell me is that the code is non-trivial to port over to existing clients, and the protocol change would be a substantial shift for the Jupyter ecosystem.

To get to a PoC at all to be having this discussion, we had to build somewhere, and JupyterLab was a) built to be user-extended, and b) supports multiple documents on the screen at once, making it a good fit for the work invested.

Fair. Just wanted to make sure there's an understanding about adoption across Jupyter even if the PoC is specific to one tool.

But: not everything in a language server client happens inside a line/character grid. Heck, maybe we should just treat a language server as a widget, which can advertise linked UIs (e.g. i can give you a sortable table of diagnostic listings).

Would worry about a widget solution therein. Widgets have portability concerns across jupyter front-ends and headless execution though it's not an impossible approach.

I need to internalize more of all the awesome work you all have done so far. Seems like an LSP approach would be better in the long term for Jupyter. I posted some more Jupyter decision-oriented questions on the JEP, so hopefully a framework for the next steps and the decisions that need to be made can pop out of that.

blink1073 commented 3 years ago

Just catching up on this thread, I support a JEP to make this a Jupyter project. It could live in its own org like xeus.

bollwyvl commented 3 years ago

@blink1073 Thanks for the ping, we had a bit of a breather there, but recently shipped some versions (tested against Lab 2.2.0) and are hopefully preparing a release with .tex support via texlab and some bug fixes. Folks are starting to use this on projects, which is both exciting and terrifying.

I think an initial, purely nominal JEP ceremony around moving the extension sounds good, which could be followed up later by a JEP to codify the kernel/comm behavior. So much to do...

krassowski commented 3 years ago

Just a quick update on this: I am hoping to push ahead a JEP soon. In the meantime we:

released 2.0, started supporting LaTeX and SQL; added improved syntax highlighting for multi-lingual notebooks
migrated to GitHub Actions (now in closer alignment with other JupyterLab repos I guess)
I opened an issue on supporting kernel-defined magics for LSP. This is aimed at enabling kernel developers to introduce magics as did IPython without breaking the LSP. Please feel welcome to chime in! While still a rough draft, if the feedback is positive, I will open a JEP to codify this.

goanpeca commented 3 years ago

Awesome :-) !

choldgraf commented 3 years ago

really excited to see this happen 👍

krassowski commented 3 years ago

The JEP PR is now open: https://github.com/jupyter/enhancement-proposals/pull/72.

jtpio commented 2 years ago

Should this be closed, now that the JEP has been accepted and the repo moved to https://github.com/jupyter-lsp/jupyterlab-lsp?

krassowski commented 2 months ago

Reopening the discussion as per https://github.com/jupyter-governance/ec-team-compass/issues/25#issuecomment-1985638756.