Open ebkalderon opened 3 years ago
I have found some interesting information regarding the Language Server Protocol's consistency requirements that could help inform our future design moving forward.
First, the specification describes the correct server behavior of textDocument/didChange
notifications as follows:
\<snip> To mirror the content of a document using change events use the following approach:
- start with the same initial content
- apply the `textDocument/didChange` notifications in the order you receive them.
- apply the `TextDocumentContentChangeEvent`s in a single notification in the order you receive them.
This is obvious and makes intuitive sense. We must process text document change notifications strictly in the order in which they are received to ensure state consistency on the server side.
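The mirroring rule above can be sketched in plain Rust. Note that `ContentChange` below is a simplified, hypothetical stand-in for the spec's `TextDocumentContentChangeEvent` (byte offsets instead of line/character positions):

```rust
// Sketch of mirroring a document's content from change events, per the
// spec's ordering rules. `ContentChange` is an illustrative stand-in for
// `TextDocumentContentChangeEvent`.
pub struct ContentChange {
    /// Byte range to replace; `None` means "replace the whole document".
    pub range: Option<(usize, usize)>,
    pub text: String,
}

/// Applies the changes of a single `didChange` notification in order.
pub fn apply_changes(doc: &mut String, changes: &[ContentChange]) {
    for change in changes {
        match change.range {
            Some((start, end)) => doc.replace_range(start..end, &change.text),
            None => *doc = change.text.clone(),
        }
    }
}

fn main() {
    let mut doc = String::from("hello world");
    apply_changes(
        &mut doc,
        &[ContentChange { range: Some((0, 5)), text: "goodbye".into() }],
    );
    assert_eq!(doc, "goodbye world");
    // Applying the same events in a different order would generally yield a
    // different document, which is why arrival order must be preserved.
}
```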
We do indeed process these events in order, thanks to `Stream::buffered`, but there is a risk that multiple notifications may be processed concurrently (because `Buffered<St>` is implemented in terms of `FuturesOrdered<T>`, which races futures to completion in parallel but yields results in the original order), and as a generic LSP framework, we cannot guarantee that all downstream `LanguageServer` trait implementers will behave sanely when multiple changes are processed concurrently. As such, we should consider processing all incoming JSON-RPC notifications sequentially (but still asynchronously), or at least make the concurrency level configurable.
For reference, `rust-analyzer` appears to always process notifications sequentially (see `handle_event()` and `on_notification()` in `main_loop.rs`), while request processing is largely performed in parallel, with the exception of only two standard LSP requests: `shutdown` and `textDocument/selectionRange`. This, along with the Notification Message section of the LSP specification, lends credence to the idea that all notification messages should be handled sequentially and in the order in which they are received, while requests may be freely handled concurrently.
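This policy can be sketched with plain threads and channels: notifications mutate state inline, strictly in arrival order, while requests run concurrently on an immutable snapshot. All names here are illustrative, not tower-lsp or rust-analyzer APIs:

```rust
use std::sync::mpsc;
use std::thread;

enum Message {
    Notification(String), // e.g. a didChange payload
    Request { id: u64 },  // e.g. a hover request
}

// Returns the notification log (in arrival order) and the request responses.
fn main_loop(rx: mpsc::Receiver<Message>) -> (Vec<String>, Vec<u64>) {
    let mut state: Vec<String> = Vec::new();
    let (resp_tx, resp_rx) = mpsc::channel();
    let mut workers = Vec::new();

    for msg in rx {
        match msg {
            // Handled inline: order of state mutations == arrival order.
            Message::Notification(text) => state.push(text),
            // Handled concurrently on an immutable snapshot of the state.
            Message::Request { id } => {
                let snapshot = state.clone();
                let resp_tx = resp_tx.clone();
                workers.push(thread::spawn(move || {
                    let _ = snapshot.len(); // pretend to compute something
                    resp_tx.send(id).unwrap();
                }));
            }
        }
    }
    for w in workers {
        w.join().unwrap();
    }
    drop(resp_tx);
    (state, resp_rx.iter().collect())
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(Message::Notification("a".into())).unwrap();
    tx.send(Message::Request { id: 1 }).unwrap();
    tx.send(Message::Notification("b".into())).unwrap();
    drop(tx);
    let (state, responses) = main_loop(rx);
    assert_eq!(state, vec!["a".to_string(), "b".to_string()]);
    assert_eq!(responses, vec![1]);
}
```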
Hey! Thanks for working on this project. It's super helpful.
I wanted to add some example use cases to this. One thing we want to implement is a custom request that can be used to interrogate the LSP client. For example, suppose we want to ask the client to fetch all workspace symbols across all languages. Then we end up with the following flow:
[sequence diagram omitted]. Source.
Relevant to this discussion, you can see that within handling the initial request (id=0), a few things happen:
So basically, there's a point in time where there are three pending futures on the server:
id=0: the original request from the client
id=1: the server's custom request to the client
id=2: the new request from the client
And these need to be handled in reverse order.
I have this implemented, and it was blocking until I made the following change, which is probably what you would expect to happen:
```diff
diff --git a/src/transport.rs b/src/transport.rs
index 0b4a647..69c4802 100644
--- a/src/transport.rs
+++ b/src/transport.rs
@@ -95,11 +95,12 @@ where
     let mut framed_stdin = FramedRead::new(self.stdin, LanguageServerCodec::default());
     let framed_stdout = FramedWrite::new(self.stdout, LanguageServerCodec::default());
-    let responses = receiver.buffered(4).filter_map(future::ready);
+    let responses = receiver.buffer_unordered(4).filter_map(future::ready);
     let interleave = self.interleave.fuse();
```
Does this make sense? Or am I using custom requests wrong? Without the above change, I'm struggling to see how sending a server->client request could ever make progress.
I agree with @samscott89 that having concurrency should probably be possible. However it would probably be nice for some language servers to have no concurrency at all, because that might make the whole thing a bit simpler.
Most language servers will not be able to do a lot concurrently anyway. Calculating diagnostics (typically the most expensive operation) usually cannot be done at the same time as calculating completions, because they access the same data structures, although caches can often be used to avoid recalculating everything.
So IMO, unless you have a fancier architecture like @samscott89's, you probably want no concurrency at all, and this should probably be the default. However, I don't see why users should not also be able to work with concurrency.
Thanks for the suggestions and helpful discussion! However, there is unfortunately trouble associated with the use of `buffer_unordered()`, as described in commit https://github.com/ebkalderon/tower-lsp/commit/456856b2d114f624d6bb6488ffbcc54e3efef60d, which makes it undesirable. The issue is that while client-to-server and server-to-client messages should indeed be processed concurrently and in any order (as the transport is a full duplex channel), the individual client-to-server requests and server-to-client requests must return their responses in the order in which they were received, regardless of whether concurrency is used or not, as per specification. The current design interleaves server-to-client and client-to-server communication over a single async channel, which means that neither `buffered()` nor `buffer_unordered()` will truly yield the correct behavior in this instance.

One possible solution that could solve the problem while preserving the specification invariants might be, as described in the issue description, to have two separate channels for server-to-client and client-to-server communication. Each of these channels would use `buffered()` by a certain amount to preserve the output ordering of client requests and server requests, respectively. However, these two channels would then be selected upon and process their respective messages without ordering between the two, ensuring that server-to-client and client-to-server communication can be buffered independently without either one blocking the other.
This still doesn't solve the data integrity issues caused by allowing servers to have concurrent handlers (which is the original subject of this issue), but it may at least fix the blocking problems @samscott89 described, where client-to-server and server-to-client requests accidentally block on each other.
I do like @davidhalter's suggestion of making the buffered concurrency configurable by the user and possibly disabled by default, though. Perhaps that may be the best route forward: come out of the box with concurrent processing of client-to-server and server-to-client messages, but each direction processing handlers sequentially in the order in which they were received.
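The blocking scenario and the two-channel fix can be sketched with ordinary threads standing in for the two directions (names and message contents are made up; the real implementation would use async streams rather than threads):

```rust
use std::sync::mpsc;
use std::thread;

// Client-to-server and server-to-client traffic are drained independently,
// so a server handler can block on a reply from the client without
// stalling the channel that would deliver that reply.
fn roundtrip() -> String {
    // Channel carrying the client's answer to the server's own outgoing
    // request (the `id=1` request in the discussion above).
    let (reply_tx, reply_rx) = mpsc::channel::<String>();

    // Server-to-client direction: the "client" answers the server's request,
    // independently of the client-to-server queue.
    let s2c = thread::spawn(move || {
        reply_tx.send("workspace symbols".to_string()).unwrap();
    });

    // Client-to-server direction: the handler for the original request
    // (`id=0`) blocks until the client's reply arrives. With a single shared
    // ordered queue this would deadlock; with two independent queues it works.
    let c2s = thread::spawn(move || reply_rx.recv().unwrap());

    let answer = c2s.join().unwrap();
    s2c.join().unwrap();
    answer
}

fn main() {
    assert_eq!(roundtrip(), "workspace symbols");
}
```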
> the individual client-to-server requests and server-to-client requests must return their responses in the order in which they were received, regardless of whether concurrency is used or not, as per specification.

This is not how I understand the spec. The spec says:

> However, the server may decide to use a parallel execution strategy and may wish to return responses in a different order than the requests were received. The server may do so as long as this reordering doesn't affect the correctness of the responses.
I understand that this means that rename requests may not be reordered, but I would say that even there, there are exceptions: for example, a local definition lookup could in theory be reordered with a rename in a different part of the code. That is still correct behavior according to the specification.
What do you think?
I agree with @davidhalter's reading of things. But even so,

> The server may do so as long as this reordering doesn't affect the correctness of the responses.

sounds like a brutal invariant to attempt to support generically! So even if technically it should be allowed, I could see a good argument for not supporting it in a higher-level framework like `tower-lsp`. For my specific use case, it would be more than reasonable to not support handling client requests out of order. But it is still a problem that a server cannot request information from the client while handling a client request. In the above diagram, the original `id=0` request will block while waiting on `id=1`, but `id=1` is behind `id=0` in the buffer.
As far as I can tell, your proposed solution would achieve that:

> However, these two channels would then be selected upon and process their respective messages without ordering to ensure that server-to-client and client-to-server communication can be buffered independently without either one blocking the other.

With that in place, you would see:

- The `id=2` request from the client cannot complete while the server is still processing `id=0`. <-- IMO this is a reasonable restriction. Otherwise you need to support infinitely nested client-server interactions.
- The `id=1` request from the server to the client would complete before `id=0` is finished, which would allow `id=0` to continue and compute the response.

Sounds like it should be non-concurrent by default with opt-in to concurrency? I can't imagine there are many extensions that would really benefit from concurrency anyway.
That's certainly a possibility, @Timmmm. Though it would be a shame to disallow scenarios such as the one @samscott89 described in their server. I do feel like this use case should be better supported by `tower-lsp` without too much fiddling, somehow.
I think I've devised a vague idea for how to proceed, though. If you check the work-in-progress support-custom-server-requests branch opened as part of #256, you may notice that the internal JSON-RPC implementation was completely reworked to support defining custom requests on LSP servers.
I can see a possible future where the interface for registering JSON-RPC methods also requires the user to specify a concurrency mode, which essentially tells the `Server` (or `LspService`; haven't decided yet) whether to execute the incoming client-to-server request sequentially or concurrently. With this design:

- Sequential requests would be queued with `buffered()` or `buffer_unordered()`? Tradeoff between delayed/ordered vs. rapid/unordered returning of responses.
- Concurrent responses would be merged with `future::stream::select()` as they become available (along with any server-to-client request messages as well).

Whether `tower-lsp` should ship by default with all standard LSP methods configured to "sequential" or some sensible mixture of "sequential" and "concurrent" based on a conservative reading of the original spec is up for debate. But no matter the approach taken, these settings should probably be customizable by the end user when constructing the `Server` or `LspService` (depending on, again, where this request buffering functionality should ideally live).
In the example that @samscott89 gave above, one would explicitly define the custom `id=0` request or notification on `Server` to be of the "concurrent" type, which would allow that async task to stick around and wait while the server processes the `workspace/symbols` `id=3` request alongside it and the response to `id=1` is received. This is precisely the behavior that we want.
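A minimal sketch of such a per-method concurrency mode, using threads in place of async tasks (the `Mode` registry and method tags are hypothetical, not an actual tower-lsp API):

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Each registered method is tagged Sequential or Concurrent; the dispatcher
// either runs the handler inline (preserving arrival order) or spawns it.
#[derive(Clone, Copy)]
enum Mode {
    Sequential,
    Concurrent,
}

fn dispatch(modes: &HashMap<&str, Mode>, incoming: Vec<&str>) -> Vec<String> {
    let (done_tx, done_rx) = mpsc::channel();
    let mut handles = Vec::new();
    let mut log = Vec::new();

    for method in incoming {
        match modes.get(method).copied().unwrap_or(Mode::Sequential) {
            // Sequential methods execute inline, one by one, in order.
            Mode::Sequential => log.push(format!("seq:{method}")),
            // Concurrent methods are free to overlap with later messages.
            Mode::Concurrent => {
                let done_tx = done_tx.clone();
                let method = method.to_string();
                handles.push(thread::spawn(move || {
                    done_tx.send(format!("conc:{method}")).unwrap();
                }));
            }
        }
    }
    for h in handles {
        h.join().unwrap();
    }
    drop(done_tx);
    log.extend(done_rx.iter());
    log
}

fn main() {
    let modes = HashMap::from([
        ("textDocument/didChange", Mode::Sequential),
        ("textDocument/hover", Mode::Concurrent),
    ]);
    let log = dispatch(&modes, vec!["textDocument/didChange", "textDocument/hover"]);
    assert!(log.contains(&"seq:textDocument/didChange".to_string()));
    assert!(log.contains(&"conc:textDocument/hover".to_string()));
}
```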
How does this approach sound to all of you? I'd appreciate it if you could try to poke holes in this design.
Thanks for the thoughtful response @ebkalderon!
In our case, we ended up scoping back the functionality to avoid needing to do that additional back-and-forth work. I don't know enough about what other LSPs are doing, but I wouldn't be surprised if what I was trying to do has a bit of a "code smell" to it.
If it's not super common, the risk is that supporting these esoteric use cases is going to add a bunch of complexity, so it might not be worth it?
Putting that aside though, what you described sounds like a cool way to make it work :)
If I understand it right, you're saying commands are executed exactly in order, except if they're marked "concurrent", in which case they can be executed concurrently with any commands in the same contiguous group of concurrent commands?
That sounds like a decent plan to me.
Precisely! Indeed, I'm not entirely sure whether the added complexity will be worth it, but I think it's worth prototyping in a branch and mulling over.
For now, in the process of refactoring the JSON-RPC router in the support-custom-server-requests branch of #256, I'm moving forward with the following set of changes from this ticket:
Any proposed solution to this ticket (which I feel might even affect portions of #177) will have to come later in a separate release.
Seems like the approach described a few comments up might be a good way to address the `&mut` methods suggestion mentioned in this Reddit comment by u/matklad.
I've been investigating different approaches to allowing `&mut self` methods in an ergonomic way, as suggested in the Reddit link above. Here is a `Router<S>` prototype which accepts method handlers with either `&self` or `&mut self` receivers (Playground).
However, due to the unfortunate inability to express the following:

```rust
impl<R, O, F, Fut> Method<&R, (), O> for F
where
    R: 'static,
    F: for<'a> Fn(&'a R) -> Fut, // How do I express `Fut` has HRTB 'a here?
    Fut: Future<Output = O>,
```

the Playground example cannot accept `async fn` methods at present (see comment in Playground code).
If we abandon this approach and don't attempt to abstract over mutability, we can work around this limitation and reduce a few unnecessary heap allocations too. But this negatively impacts the API ergonomics: users of `LspServiceBuilder` would be forced to register `&self` method handlers with `.custom_method()` and `&mut self` handlers with `.custom_method_mut()`, respectively.
I have managed to go beyond the approach described in the Playground link and successfully abstracted the blanket `Method` trait impls over mutability, but it's pretty complex and I'm looking to simplify the traits further, if possible.
I've found a satisfactory way to work around the above issues with higher-kinded lifetimes (Playground). The gist of it is as follows:

1. Define an intermediate trait `Closure<Recv, Args, Out>` with an `invoke()` method.
2. Implement it for every `F: Fn(Recv, Args) -> Fut` where `Fut: Future<Output = Out>`. This implementation simply calls itself and forwards any arguments: `self(recv, args)`.
3. Define the trait `Method<Recv, Args, Out>` with a similarly named `call()` method.
4. Implement `Method<&'a Recv, Args, Out>` for all types satisfying `for<'b> Closure<&'b Recv, Args, Out> + 'a`.
5. Implement `Method<&'a mut Recv, Args, Out>` for all types satisfying `for<'b> Closure<&'b mut Recv, Args, Out> + 'a`.
6. Accept `async` methods taking either `&self` or `&mut self` receivers as arguments to functions expecting `F: Method<Recv, Args, Out>`.

This hack relies on the intermediate `Closure` trait to express the higher-kinded lifetime bound previously considered inexpressible.
This took me an incredibly long time to figure out, bumping up against limitations such as https://github.com/rust-lang/rust/issues/70263 frequently along the way. I'm pretty happy with the ergonomics of the end result!
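For illustration, here is a compiling sketch of the same family of workaround, using a single lifetime-parameterized helper trait rather than the exact `Closure`/`Method` pair from the Playground. All trait and type names here are illustrative, and the tiny `block_on` exists only to keep the example dependency-free:

```rust
use std::future::Future;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// The helper trait carries the receiver lifetime as a trait parameter, so
// the future's borrow of the receiver can be expressed where a plain
// `F: for<'a> Fn(&'a R) -> Fut` bound cannot.
trait AsyncClosure<'a, Recv: 'a, Args, Out>: Fn(&'a Recv, Args) -> Self::Fut {
    type Fut: Future<Output = Out> + 'a;
}

impl<'a, Recv: 'a, Args, Out, F, Fut> AsyncClosure<'a, Recv, Args, Out> for F
where
    F: Fn(&'a Recv, Args) -> Fut,
    Fut: Future<Output = Out> + 'a,
{
    type Fut = Fut;
}

// Accepts `async fn` handlers with `&self`-style receivers.
fn call_method<Recv, Args, Out, F>(f: F, recv: &Recv, args: Args) -> Out
where
    F: for<'a> AsyncClosure<'a, Recv, Args, Out>,
{
    block_on(f(recv, args))
}

struct Backend {
    count: i32,
}

// An `async fn` whose future borrows its receiver.
async fn get_count(backend: &Backend, bump: i32) -> i32 {
    backend.count + bump
}

// Minimal polling executor with a no-op waker, so the example needs no
// external async runtime.
fn block_on<F: Future>(fut: F) -> F::Output {
    unsafe fn clone(_: *const ()) -> RawWaker {
        raw()
    }
    unsafe fn no_op(_: *const ()) {}
    fn raw() -> RawWaker {
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let backend = Backend { count: 40 };
    assert_eq!(call_method(get_count, &backend, 2), 42);
}
```

The blanket impl applies once per receiver lifetime, which is what lets the `for<'a>` bound in `call_method` hold for `async fn` items whose returned futures borrow their receiver.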
> FYI @ratmice, I hope to relax the `Send` bound requirements on `LanguageServer` futures eventually as part of #284 (comment), thereby supporting thread-local futures out of the box. The `Client` type will remain `Send`, of course, in case anyone wants to pass it into a `Send` task.
Responding here, because it probably makes more sense.
I've been doing a bit of refactoring in the code for my project (from which https://github.com/ebkalderon/tower-lsp/pull/340 was a simplified example). Having done so, I now rather doubt that I'm actually going to be able to meet these (or any!) trait bounds, the primary issue being self-references in the thread which I currently have to spawn to respond to events.

I'll definitely try and check out this HKLT approach when there is a branch to look at, but the feeling I get is that even with these trait bounds relaxed, it is likely I don't have a struct that can satisfy them.
With regards to the simplified example in #340, dropping the `Send` bound on all futures indeed allows `!Send` struct fields, and these values may also be borrowed across `.await` points in `LanguageServer` method handlers (Playground link)[^1]. This enables servers to call out to `Client` at any time while holding references to `!Send` data.
Additionally, the current push to mark certain methods that inherently depend on exclusive access to server state as `&mut self` (e.g. `initialize`, `shutdown`, and the `did_.*` notifications) should make the API easier to reason about as well.[^2]
It's still perfectly fine to spawn background tasks with `tokio::spawn()` or `std::thread::spawn()` inside any of these handlers, of course. Though if you do, you will need to make sure the data passed in and out of the task is `Send + 'static`, like always.
[^1]: In reality, `tower_service::Service` requires that all futures be at least `'static`, unfortunately (see this comment on tower-rs/tower). The language server instance could be wrapped in an `Rc`/`Arc` internally to satisfy this requirement, though, which is exactly what `tower-lsp` does today.
[^2]: Note that the use of `tokio::sync::RwLock` in the playground example to support `&mut self` methods is just for demonstration purposes. The real implementation I'm working on uses a fast async-aware semaphore instead of locks.
Okay, I've pushed up a quick and dirty implementation to the `support-mutable-methods` branch. I haven't stress-tested it much yet, but all tests pass. Would appreciate any feedback from folks!
An open question regarding the implementation in the above branch: should the future returned by `Server::serve()` be `Send`? The current work returns a thread-local `!Send + !Sync` future, since we assume most users will not attempt to `spawn()` their server, but will instead be `.await`-ing it on the main thread. This lets us use a lightweight async semaphore + `RefCell` to synchronize access to the inner `LanguageServer` instead of an async `RwLock` (which may or may not be tied to a specific async runtime, something we don't want).
If anyone here depends on `Server::serve()` being `Send`, please speak up now!
I am intending to try this out ASAP; however, I've been a bit under the weather. Hopefully this weekend or over the next few days I'll feel well enough to sit down and work on it. I'm definitely just `.await`-ing in the main thread, though.
Sure thing. Sorry to hear you're under the weather, @ratmice. Hope you feel better soon!
Thanks for chiming in about your own usage of the `Server::serve()` future. Would love to hear other folks' feedback too. I may let this thread simmer for a bit longer and wait for other responses to trickle in.
@ebkalderon It seems like it is going to take me a while to do a full port of my LSP server so it can make full use of the new branch; the internal refactoring required to do so is quite large. Instead, I tried a little proof-of-concept port from scratch, starting out with just the example in the README.md on the branch, but ran into this error with `Send` bounds (for `initialize`, `initialized`, and `shutdown`). Curious if others can reproduce, or if the basic README example from the branch works for others?
```
error[E0053]: method `initialize` has an incompatible type for trait
  --> server/src/main.rs:12:5
   |
12 |     async fn initialize(&mut self, _: InitializeParams) -> Result<InitializeResult> {
   |     ^^^^^ expected trait `Future<Output = Result<tower_lsp::lsp_types::InitializeResult, tower_lsp::jsonrpc::Error>>`, found trait `Future<Output = Result<tower_lsp::lsp_types::InitializeResult, tower_lsp::jsonrpc::Error>> + Send`
   |
   = note: expected fn pointer `fn(&'life0 mut Backend, tower_lsp::lsp_types::InitializeParams) -> Pin<Box<...>>`
              found fn pointer `fn(&'life0 mut Backend, tower_lsp::lsp_types::InitializeParams) -> Pin<Box<...>>`
```
Interesting, I haven't run into this issue while working on the branch. CI hasn't indicated any errors when checking the in-repo examples either. Do you think you could push up your WIP work to a new GitHub repo so I can take a look with you?
Curious; I should look more closely at the in-repo examples to see if I can spot a difference. I went and pushed the example I encounter the error with here: https://github.com/ratmice/nimbleparse_lsp_mutable_methods, with the example code in server/src/main.rs.

Edit: Looks like the README might just be missing the `#[tower_lsp::async_trait(?Send)]` bounds.
Yes, that's absolutely correct. All examples and doc tests have switched to `#[async_trait(?Send)]`, but it seems I forgot to update the README accordingly. :+1: With that in place, the example in the linked repo executes fine for me indeed.
One small change that I'm less than happy with, introduced as a result of this work, is the new method signature of `LspService::inner()`. Where previously it had returned a `&S` where `S: LanguageServer`, it now returns an `std::cell::Ref<'_, S>` guard instead. This minor wart is unfortunately unavoidable because of the interior mutability added to `LspService` in order to support `&mut self` methods.
Perhaps research into #355 will yield a nicer API for testing servers (which is what this method is primarily intended for), which would let us remove the `.inner()` method altogether. With that said, I will happily live with this for now.
Hey @ebkalderon, just wanted to see what the status is with making this change to `tower-lsp`. I just modified our server to target the `support-mutable-methods` branch to see how the new changes feel.

It's so nice to have a clear boundary where the internal server state can be mutated safely and where it should be taken by reference. Not needing to wrap everything in a `RwLock` or similar really is a massive ergonomic win. Would love to see this change land in an official release. How are you feeling about it? Is there something holding you back? Is there anything you need help with? Keen to lend a hand if it helps move this along.
Hi there, @JoshuaBatty! Thanks for your interest in this issue. I think the `support-mutable-methods` branch is largely ready to go in its current state. However, there are a few complications currently blocking its immediate release:

1. Since `tower-lsp` is being used by several large projects at this time, I would prefer not to break API stability too egregiously (though I'm not afraid to do so, if I must). I would love to collect some feedback from at least the top-largest key players before merging this branch in, but I'm not sure how to best go about this yet.
2. Mixing `&self` and `&mut self` method receivers in the current trait-based `LanguageServer` API is unfortunately not practical, so this would require a deeper API redesign which I suspect would, again, disrupt downstream users quite a bit.
3. I may need to rework the `LanguageServer` trait API regardless in order to resolve #386 (making `&self` and `&mut self` concurrency user-configurable) and #393 (allow overriding default method signatures). I have promising ideas on this front, but pursuing this would make the `support-mutable-methods` branch obsolete if I went in this direction.

Hope this sheds some light on the current state of things! Ideally, I would love to be able to release the `support-mutable-methods` branch ASAP, to improve ergonomics for users in the short-term, and then release a more flexible redesigned API later. I worry about how downstream projects would react to this increased level of churn, though.

I'd love to hear thoughts from anyone (the biggest `tower-lsp` users out there especially) on what would be the best path forward!
> I would love to collect some feedback

While not a large user (I've only communicated with one person besides myself that I know has attempted to use my LSP), my feedback from my experiment was that it did work, and it appeared it would allow me to avoid making separate threads and serializing LSP messages via channels to those threads.

Unfortunately, leveraging it required the use of self-referencing structures (which I am somewhat hesitant to do). If it weren't for that, it would be a no-brainer for my project. But at the time it seemed like trading the mess I have (which works ok for now) for some ownership messiness of a new kind. So, I wasn't quite sold yet on the full rewrite that would allow it to leverage the new things which would only be possible with the mutable-methods branch. I haven't updated my current code base to the mutable-methods branch (not really leveraging the new possibilities), but would be happy to do that if it would be helpful.
Good to know that the branch works as expected, @ratmice! Thanks for the input. I presume the need for self-referential structs is a specific requirement of the particular language server being built and not an issue with the branch, as I have never encountered such a need in other downstream projects relying on `tower-lsp` AFAIK, but please correct me if I'm wrong there.
Yeah, it is a specific detail of the language server being built. The current implementation only avoids self-referential types because it is borrowing local variables within that thread's main loop, but once it has to be within something which implements a trait, it needs to become a self-referential type.

Edit: Whereas on the current master I just couldn't implement the trait due to bounds requirements (IIRC).
Thanks for the detailed response and shedding light on the complexities involved. I genuinely appreciate the effort you're putting into maintaining the stability and usability of `tower-lsp`.
From our standpoint, the benefits of the `support-mutable-methods` branch are immediately tangible. The ergonomics it offers are massive for us, and having that in an official release would significantly enhance our development experience.
I understand the concerns regarding the potential API redesign and how it might introduce additional churn. While long-term stability is essential, I'd argue that we should prioritize immediate benefits that have a clear positive impact. We're willing to adapt and navigate through any churn that comes with future redesigns.
@JoshuaBatty Thanks for the helpful reply, especially regarding whether the branch works as expected and how getting it merged would likely impact your project. I'm glad to hear folks seem generally in favor of it getting merged so far.
I think I would like to hear from folks working on the Deno project, particularly from @dsherret and @bartlomieju (sorry to ping you both directly) over on https://github.com/denoland/deno/issues/18079, which had spawned further discussion in #386. Would merging in the `support-mutable-methods` branch break anything over at Deno today?
Not sure if it helps, but I'm trying an alternative design leveraging the interface of `tower::Service` for all methods in my project async-lsp. There, request handlers are `fn(&mut State, Params) -> impl Future + 'static` and notification handlers are `fn(&mut State)`. This approach has the downside of incompatibility with `async fn` sugar, forcing mutations to be done before computations, but it simplifies the concurrency implementation: we can just call the handler, spawn the returned future, and repeat; then everything is correctly ordered and asynchronous computations are fully concurrent.
The current approach of `support-mutable-methods` uses `async fn(&mut State, Params)`, aka `fn<'a>(&'a mut State, Params) -> impl Future + 'a`, which holds the write lock to `State` for the duration of the task. It means the frequent `did_change` handler becomes the join point, and concurrency can only happen between two keystrokes. This may already be the case for some language servers, but when `State` is cheap to snapshot/clone, concurrent reads and writes are possible and can be more efficient.
But well, yes, it's more ergonomic than both the current APIs and my approach.
> I think I would like to hear from folks working on the Deno project, particularly from @dsherret and @bartlomieju (sorry to ping you both directly) over on https://github.com/denoland/deno/issues/18079, which had spawned further discussion in https://github.com/ebkalderon/tower-lsp/issues/386. Would merging in the support-mutable-methods branch break anything over at Deno today?
I don't think it would break anything because we wouldn't use `(?Send)` on the trait, so the interface should remain the same, correct? I think we'll just continue to use the current interface and do some hacks to make it work better with a single-threaded runtime. I'd rather we control the concurrency and mutability for the reasons outlined in the other issue.
Sounds like we might be able to move forward with this then? How are you feeling about it @ebkalderon?
Hey @ebkalderon sorry to be pushy, any update on your thoughts for moving this ahead?
I'm building a language server and recently prototyped a tower-lsp rewrite. So far I really like the API.
However, this issue has come up in a concrete way: it's hard to understand the order in which `did_close`, `did_open`, and `did_change_watched_files` are executed after a file rename in VS Code. In the lsp-server implementation, `did_close` was handled first, whereas in the tower-lsp implementation prototype, `did_open` is handled first (seemingly).

I need to ensure that any references to the old filename get updated before running diagnostics... when I open the "new" file, I need some way of knowing that it's actually just the old file with a new name. Since it's hard to predict which handler will be executed first, it seems that the workaround will be to clean up deleted file references in both `did_open` and `did_change_watched_files`.
Now, in this case it's probably not too big of a deal, but the solution feels hacky, and with more complicated combinations of long-running tasks I'll have to get creative to avoid redundant execution of expensive stuff.
(I still need to try `will_rename`; hopefully that will fix everything in this specific case, since the client literally waits for a response to that before executing the rename...)
In general, it'd be nice to have insight into and control over the handler execution order, especially in cases where clusters of interrelated requests come in quickly, for example to explicitly debounce sequences of interdependent tasks that could be triggered by multiple events concurrently, or to throttle expensive tasks. Maybe I can use streams and stream extensions for this? The `LanguageServer` trait implementation handlers can directly handle certain things if there isn't a potential for interference, but in other cases maybe they could defer to some more complex logic by sending events to channels that get processed as streams. In that case, what would be the advantage of using tower-lsp over lsp-server? The easy-to-use API is definitely one advantage. Maybe there are others?
When it comes to Rust async stuff I'm definitely "Alan", so it's hard for me to grasp whether the `support-mutable-methods` changes can resolve my issues.
Hello,

I would like to give an update on this issue. We are using tower-lsp for our project, and we are wondering about the message ordering.

As I understand it, the project is effectively synchronous as long as you don't yield the task with an `await` or similar call?

This is a real issue for us, as some methods have to call custom routes and wait for the response of the client, making the server unstable. As the spec indicates, `completion` or `signatureHelp` requests can be reordered, but reordering `did_change` notifications is totally forbidden, especially if we use an incremental file cache and not the `TextDocumentSyncKind::FULL` option.

Currently, my plan is to list all the methods that can be reordered and the ones that cannot, and to check before any execution that no other non-reorderable function is executing. Do you think this could temporarily solve the issue with the current version of tower-lsp?

EDIT: It won't, because if some tasks are blocked, I can't guarantee which one will wake up first.
Background
In order for the framework to support both standard streams and TCP transports generically, we need a way to interleave both client-to-server and server-to-client communication over a single stream. This topic has been the subject of a previous issue. The current strategy used by `tower-lsp` (and `lspower` in turn) to accomplish this is somewhat simplistic:

The third step above is particularly significant, however, and has some unintended consequences.
Problem Summary
A closer reading of the "Request, Notification and Response Ordering" section of the Language Server Protocol specification reveals:
This is concerning to me because `tower-lsp` unconditionally executes pending async tasks concurrently without any regard for the correctness of the execution order. The correct ordering of the outgoing messages is preserved, as per the spec, but the execution order of the handlers is not guaranteed to be correct. For example, an innocent call to `self.client.log_message().await` inside one server handler might potentially prompt the executor to not immediately return back to that handler's yield point, but instead start processing the next incoming request concurrently as it becomes available.

As evidenced by downstream GitHub issues like https://github.com/denoland/deno/issues/10437, such behavior can potentially cause the state of the server and client to slowly drift apart as many small and subtle errors accumulate and compound over time. This problem is exacerbated by LSP's frequent use of JSON-RPC notifications rather than requests, which don't require waiting for a response from the server (see relevant discussion in microsoft/language-server-protocol#584 and microsoft/language-server-protocol#666).
It's not really possible to confidently determine whether any particular "state drift" bug was caused by the server implementation or by `tower-lsp` without stepping through with a debugger. However, there are some things we can do to improve the situation for our users and make such bugs less likely.

Possible solutions
For example, we could process client-to-server and server-to-client messages concurrently with each other, but individual messages of each type would execute in the order they are received, one by one, and no concurrency between individual `LanguageServer` handlers would be allowed.

This would greatly decrease request throughput, however. Perhaps it would be beneficial to allow some handlers to run concurrently where it's safe (with user opt-in or opt-out), but otherwise default to fully sequential execution. To quote the "Request, Notification and Response Ordering" section of the Language Server Protocol specification again:

If we choose to go this route, we may have to determine ahead of time which handlers are usually safe to reorder and interleave and which are not. I imagine this would introduce a ton of additional complexity to `tower-lsp`, though, so I'd rather leave such responsibilities to the server author to implement themselves, if possible.

Below are a few key takeaways that we can start adding now to improve the situation:
The matters of optional user-controlled concurrency are still unanswered, however, and will likely require further design work.
Any thoughts on the matter, @silvanshade?