ethereum / fe

Emerging smart contract language for the Ethereum blockchain.
https://fe-lang.org

Language server concurrency and functionality upgrades #979

Open micahscopes opened 7 months ago

micahscopes commented 7 months ago

This PR introduces significant changes to the language server, focusing on improved concurrency and expanded LSP functionality. The language server has been rewritten on top of tower-lsp to support concurrent execution of tasks.

Here's an overview of the key components and their interconnections:

- server module
- backend module
- functionality module

This architecture is designed to handle concurrent execution of tasks efficiently and to provide a clean separation between the I/O handling (Server) and the actual processing of LSP events (Backend). This separation makes concurrency easier to reason about and allows for parallel execution of expensive tasks. The use of tokio channels and streams allows for efficient handling of concurrent tasks and provides control over the order of task execution.
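To make the shape of this split concrete, here's a minimal sketch assuming hypothetical `Server`, `Backend`, and `LspEvent` types; the real PR dispatches tower-lsp notifications and differs in its details.

```rust
use tokio::sync::mpsc;

// Hypothetical event type; the real PR forwards tower-lsp notifications.
#[derive(Debug)]
enum LspEvent {
    DidOpen { uri: String },
    DidClose { uri: String },
    WatchedFilesChanged { uris: Vec<String> },
}

// "Server" side: thin I/O handlers that only enqueue events and return.
#[derive(Clone)]
struct Server {
    tx: mpsc::UnboundedSender<LspEvent>,
}

impl Server {
    // In the real implementation this would be a tower-lsp handler method.
    fn did_open(&self, uri: String) {
        let _ = self.tx.send(LspEvent::DidOpen { uri });
    }
}

// "Backend" side: a single task that owns the state and processes events
// in arrival order, spawning expensive work onto the runtime as needed.
async fn backend(mut rx: mpsc::UnboundedReceiver<LspEvent>) {
    while let Some(event) = rx.recv().await {
        match event {
            LspEvent::DidOpen { uri } => println!("update cache, queue diagnostics for {uri}"),
            LspEvent::DidClose { uri } => println!("release {uri}"),
            LspEvent::WatchedFilesChanged { uris } => println!("resync {uris:?}"),
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::unbounded_channel();
    let server = Server { tx: tx.clone() };
    let worker = tokio::spawn(backend(rx));
    server.did_open("file:///example.fe".into());
    let _ = tx.send(LspEvent::DidClose { uri: "file:///old.fe".into() });
    let _ = tx.send(LspEvent::WatchedFilesChanged { uris: vec!["file:///example.fe".into()] });
    drop(server);
    drop(tx); // closing all senders lets the backend task finish
    worker.await.unwrap();
}
```

Because a single task drains the channel, events are observed in a deterministic order even though the I/O handlers themselves return immediately.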

Changes and features

More changes, following the initial review

Functionality upgrades

Not urgent/maybe

Initial impressions (previously)

Update: the uncertainties below have been addressed by a channel/stream augmentation to the tower-lsp implementation

Managing request handler execution order

tower-lsp doesn't really provide a way of managing the order of handler execution, and neither does lsp-server, for that matter. What exactly should this execution dependency graph look like in various cases? It's hard to foresee how this will evolve as more and more LSP functionality gets implemented, but it's clear that we need some way to control the order of task execution.

As a simple example, renaming a file in VS Code can trigger multiple handlers (did_open, did_close, watched_files_did_change) concurrently.

How can we ensure that this set of LSP events gets handled appropriately? We're not just opening any file; we're opening a renamed file, and we need to ensure that the workspace cache is updated to reflect this before diagnostics are executed. The watched_files_did_change handler ends up being redundant in this case if it runs after the did_open handler, but in the case where a file is renamed directly, outside of the LSP client, it's still important.

In this example, it's not a big deal to check for deleted (renamed) files in both handlers, since those checks are relatively short-lived and would be executed infrequently. But it will be important to ensure that, e.g., diagnostics are run carefully: the salsa inputs must be set up correctly before diagnostics or other complex tasks run. It's also important to ensure that expensive tasks aren't triggered redundantly in parallel.
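One common way to keep expensive tasks from being triggered redundantly is to coalesce requests behind a debounce window. Here's a minimal sketch, assuming a hypothetical diagnostics_worker and an arbitrary 250 ms window; it is not necessarily the mechanism this PR uses:

```rust
use std::collections::HashSet;
use std::time::Duration;
use tokio::sync::mpsc;

// Coalesce bursts of diagnostics requests: wait until the channel has been
// quiet for 250 ms, then run diagnostics once per distinct document.
async fn diagnostics_worker(mut rx: mpsc::UnboundedReceiver<String>) {
    let mut pending: HashSet<String> = HashSet::new();
    while let Some(uri) = rx.recv().await {
        pending.insert(uri);
        // Absorb any further requests arriving inside the debounce window.
        loop {
            match tokio::time::timeout(Duration::from_millis(250), rx.recv()).await {
                Ok(Some(uri)) => { pending.insert(uri); }
                Ok(None) => break, // channel closed; flush what we have
                Err(_) => break,   // window elapsed with no new requests
            }
        }
        for uri in pending.drain() {
            // Expensive salsa queries would run here, after inputs are set.
            println!("running diagnostics for {uri}");
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::unbounded_channel();
    let worker = tokio::spawn(diagnostics_worker(rx));
    for _ in 0..3 {
        let _ = tx.send("file:///example.fe".to_string()); // a burst of requests
    }
    drop(tx); // close the channel so the worker exits after flushing
    worker.await.unwrap();
}
```

With this shape, a burst of rename-related events produces a single diagnostics run rather than three overlapping ones.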

Shared state and deadlocks

Related to the issue of concurrency... How do we manage sharing of the salsa database and inputs cache in a concurrent environment?

In this prototype, the salsa db and workspace cache are shared via std::sync::Mutex locks, and the language server client is shared via a tokio::sync::Mutex. This works just fine, but it requires care not to cause deadlocks. The tokio shared state docs were very helpful for me in understanding this better.
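As a sketch of that pattern and its main pitfall (all types below are stand-ins, not the PR's actual types): synchronous work under the std guards stays in a short-lived scope, and only the tokio mutex is held across an .await.

```rust
use std::sync::{Arc, Mutex};

struct Database;  // stand-in for the salsa db
struct Workspace; // stand-in for the inputs cache
struct Client;    // stand-in for the tower-lsp client

impl Client {
    async fn log(&self, msg: &str) {
        println!("{msg}");
    }
}

#[derive(Clone)]
struct Backend {
    db: Arc<Mutex<Database>>,                // std mutex: never hold across .await
    workspace: Arc<Mutex<Workspace>>,        // likewise
    client: Arc<tokio::sync::Mutex<Client>>, // tokio mutex: fine to hold across .await
}

impl Backend {
    async fn on_change(&self) {
        // Keep the synchronous db/workspace work in a short-lived scope so
        // both std guards are dropped before anything is awaited.
        {
            let _db = self.db.lock().unwrap();
            let _ws = self.workspace.lock().unwrap();
            // ... set salsa inputs, run queries, collect diagnostics ...
        } // std guards dropped here
        // Holding a std::sync::Mutex guard across an .await can leave the
        // guard alive on a suspended task and block other tasks on the same
        // executor thread; the tokio mutex is designed for this situation.
        self.client.lock().await.log("diagnostics published").await;
    }
}

#[tokio::main]
async fn main() {
    let backend = Backend {
        db: Arc::new(Mutex::new(Database)),
        workspace: Arc::new(Mutex::new(Workspace)),
        client: Arc::new(tokio::sync::Mutex::new(Client)),
    };
    backend.on_change().await;
}
```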

Useful info

micahscopes commented 6 months ago

Notes from review with @sbillig and @Y-Nak:

Y-Nak commented 6 months ago

> salsa's snapshot mechanism is similar to a rwlock but avoids deadlocks in case of cycles

This is not correct. What Snapshot does is almost the same as what RwLock does, so it's possible to introduce a deadlock either way. But it's rather difficult to cause a deadlock as long as we don't try to mutate the db from inside a salsa tracked function, e.g., by sending an event to the main thread and letting that thread mutate the db:

```rust
// `Sender` would need to implement `Clone` and `Hash` to be an argument of
// a salsa-tracked function, but I ignore that fact for simplicity.
#[salsa::tracked]
fn rename(db: &dyn Db, rename_event: Event, tx: Sender<Event>) {
    // Perform renaming.
    // ...

    // Send an event to the main thread, which will try to mutate the database.
    // This might cause a deadlock.
    let _ = tx.send(Event::SourceTextChanged);

    // ...
}
```

Another possibility for deadlock is unrelated to mutability: a deadlock can happen even if we only use &Db in multi-threaded settings. But salsa detects this situation and raises a cycle error, which is, of course, nice.
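For intuition, here's a plain-Mutex analogue of that cross-thread cycle, with hypothetical query_a/query_b locks standing in for in-flight queries; where this sketch would hang if the commented lines were enabled, salsa raises a cycle error instead:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Two "in-flight queries", each represented by a lock. If each thread held
// one and then waited for the other's, we'd have the classic deadlock.
fn main() {
    let query_a = Arc::new(Mutex::new(()));
    let query_b = Arc::new(Mutex::new(()));

    let (a, b) = (Arc::clone(&query_a), Arc::clone(&query_b));
    let t = thread::spawn(move || {
        let _a = a.lock().unwrap(); // this thread is "computing" query A
        thread::sleep(Duration::from_millis(50));
        // Waiting on query B here, while the main thread holds it and
        // waits on query A, would deadlock:
        // let _b = b.lock().unwrap();
        drop(b); // only here to use `b` in this sketch
    });

    let _b = query_b.lock().unwrap(); // the main thread is "computing" query B
    thread::sleep(Duration::from_millis(50));
    // let _a = query_a.lock().unwrap(); // uncommenting both lines completes the cycle

    t.join().unwrap();
}
```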

Please refer to the links below for more information.