Unknown message id when using STDIO but not with TCP

ipa-danb commented 2 years ago

I have a bit of unexpected difference in behaviour between STDIO and TCP Server.

I built upon the JSON example, but i have a custom command that triggers a rebuild of the completion/hover/goto options. This command takes a while to run (~20-30 s).

With the TCP server there is no issue at all, the server builds the new options and then continues to give suggestions. But with the STDIO server, no more suggestions are given after rebuilding - the communications breaks down (the server still runs and responds to requests, apparently, see logs below).

The error is: WARNING:pygls.protocol:Cancel notification for unknown message id "7"

The Implementation is pretty simple and straightforward:

@self.thread()
@self.command(MyLanguageServer.CMD_RELOAD)
def reload(ls, args)
   for path  in ls.workspace.documents.keys():
       self.rebuild(path)
   self.update()

Im a bit confused about what the exact issue is and on the other hand id expect the STDIO and TCP communications to behave the same.

Maybe somebody has some pointers about that?

Logs for TCP and STDIO

``` INFO:pygls.server:Starting TCP server on 127.0.0.1:2087 INFO:pygls.protocol:Language server initialized (...) INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 0, "result": ... INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "method": "window/showMessage", "params": {"type": 3, "message": "loading now ..."}} INFO:root:trying to load INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "method": "window/showMessage", "params": {"type": 3, "message": "done loading..."}} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 1, ... INFO:root:trying to parse INFO:root:... success INFO:root:trying to save INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 2, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 3, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 4, "result": ...} ---- INFO:pygls.server:Starting IO server INFO:pygls.protocol:Language server initialized (...) INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 0, "result": ... INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "method": "window/showMessage", "params": {"type": 3, "message": "loading now ..."}} INFO:root:trying to load INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "method": "window/showMessage", "params": {"type": 3, "message": "done loading..."}} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 1, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 2, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 3, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 4, "result": ... INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 5, "result": ... INFO:root:trying to parse INFO:root:... success INFO:root:trying to save INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 6, "result": null} INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 7, "result": ... WARNING:pygls.protocol:Cancel notification for unknown message id "7" INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 8, "result": ... WARNING:pygls.protocol:Cancel notification for unknown message id "8" INFO:pygls.protocol:Sending data: {"jsonrpc": "2.0", "id": 9, "result": ... WARNING:pygls.protocol:Cancel notification for unknown message id "9" ```

alcarney commented 2 years ago

I also see these messages from time to time in esbonio, but as far as I can tell the server still performs as expected so unfortunately, I’ve never dug into it that much….

Interesting that it only seems to happen on stdio… I assume there’s nothing in the implementation of your reload command that would accidentally print to stdout? I’ve found that tends to break the communication between client and server

ipa-danb commented 2 years ago

I also see these messages from time to time in esbonio, but as far as I can tell the server still performs as expected so unfortunately, I’ve never dug into it that much….

Interesting that it only seems to happen on stdio… I assume there’s nothing in the implementation of your reload command that would accidentally print to stdout? I’ve found that tends to break the communication between client and server

Indeed that seemed to have been the problem -_-. Thanks so much for your hint @alcarney !

Although it kind of surprises me that this breaks the communication completely. I'd expect that the client would just ignore the unknown message. (i know that this is not a pygls error)

Other than that, the "Cancel notification for unknown message id "x"" is a bit cryptic for me. Id expect the server to know about the message, as it sent it right before.

tombh commented 2 years ago

So this isn't necessarily a bug? If I'm understanding right, when the server is being communicated with via STDIO and the server emits, let's say, an accidental debugging line sent to stdout, then the client interprets it as a formal protocol message? In this case the debugging output was interpreted as a request to cancel a progress notification.

I don't think there's any reasonable fix for such behaviour. It sounds like the server and client will at least attempt to continue. But if a debugging message appears similar enough to an actual command then I don't suppose there's anything the client can do to make the distinction between a real and accidentally similar debugging message.

dgreisen commented 2 years ago

Perhaps just need to solve this with a sufficiently in-your-face warning in the docs?

On Mon, Jul 4, 2022 at 11:03 AM Thomas Buckley-Houston < @.***> wrote:

So this isn't necessarily a bug? If I'm understanding right, when the server is being communicated with via STDIO and the server emits, let's say, an accidental debugging line sent to stdout, then the client interprets it as a formal protocol message? In this case the debugging output was interpreted as a request to cancel a progress notification.

I don't think there's any reasonable fix for such behaviour. It sounds like the server and client will at least attempt to continue. But if a debugging message appears similar enough to an actual command then I don't suppose there's anything the client can do to make the distinction between a real and accidentally similar debugging message.

— Reply to this email directly, view it on GitHub https://github.com/openlawlibrary/pygls/issues/240#issuecomment-1173917624, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAD3MVJOOSVGHKP34OGRRFTVSL4NBANCNFSM5YPCWO7A . You are receiving this because you are subscribed to this thread.Message ID: @.***>

alcarney commented 2 years ago

I think there are two issues here

Accidental prints to stdout when the client is communicating with the server over STDIO simply breaks the connection as the client loses track of where it is in the stream and cannot parse any more messages (or at least it appears that way).

In this situation there's nothing to fix, but maybe adding a warning in the docs along the lines of "never print() when using STDIO" is a good idea.
The other issue is that, even during normal operation over STDIO pygls is unable to cancel requests and emits warnings like Cancel notification for unknown message id "536" - I think there is something worth at least investigating there, but the severity of the issue seems low since I see these all the time in esbonio and they've never prevented it from working correctly.

tombh commented 2 years ago

Oh right, that's a bit more significant than I thought then. I suppose one idea at least is to patch something like sys.stdout.write with a simple regex to exclude everything that doesn't look like a command?
This issue is likely not from errant debug messages right? If it becomes more serious we'd have to start from scratch to figure out what's going?

alcarney commented 2 years ago

I suppose one idea at least is to patch something like sys.stdout.write with a simple regex to exclude everything that doesn't look like a command?

Maybe? Though I'm inclined to think it's more trouble than it's worth...

This issue is likely not from errant debug messages right?

Correct.

The message seems to come from this line, so my guess is either something funny is going on with the book keeping of futures, or perhaps a timing issue where the cancel request comes in just as the response has been sent?

perrinjerome commented 2 years ago

Oh right, that's a bit more significant than I thought then. I suppose one idea at least is to patch something like sys.stdout.write with a simple regex to exclude everything that doesn't look like a command?

Maybe once transport is created ( here ) it's possible to replace sys.stdout by a file that would emit to logging system ? I also feel that this should be addressed with a big warning in documentation, because even if we can replace sys.stdout, subprocess would anyway output to "real" stdout.

perhaps a timing issue where the cancel request comes in just as the response has been sent?

I looked at this some time ago, pygls process messages one by one as they arrive (at least when message handlers are implemented with async def methods, maybe it's different with threads), so the sequence is something like this:

Client sends request X to server
Server processes request X
Client sends a notification to cancel request X
Server processes notification, but the request is already cancelled, and this warning is logged.

I tried to address this and prototyped an implementation where the messages received from the client are put in a queue that understands cancel notification and removes the message from the queue in that case - in a way similar to what's done in https://github.com/microsoft/vscode-languageserver-node/blob/dca3c1a04797aaaa7d48060cca38b16efdf4392f/jsonrpc/src/common/connection.ts#L608-L637

If you are interested, I can find the time and energy I'll study this more, clean up the patch a bit and send a draft pull request with this.

What I can confirm is that in stdio mode, printing on stdout break the client/server communication, that Cancel notification for unknown message id ... happens a lot and they are mostly harmless (except that the server does some useless computation) and that these are two different problems.

tombh commented 2 years ago

Maybe once transport is created ( here ) it's possible to replace sys.stdout by a file that would emit to logging system ?

Interesting idea. I'm not clear on whether replacing sys.stdout would break other things? If sys.stdout is the most primitive access to STDOUT then maybe even the server itself needs it to work as expected, even if it has a copy of it or something. I was thinking more along the lines of intercepting it, so that obvious debugging messages could be dropped or sent to STDERR. In fact that gives me another idea, how about making print() output to STDERR?

Anyway, yes just making all this clear in the docs is the main things we need to do.

So the separate issue of the out-of-order behaviour for cancel notifications. Yes that'd be great if you find the energy.

So shall we close this issue, add a warning to the docs, and make 2 new issues, just to track them for now seeing as they're not high priority?

openlawlibrary / pygls

Unknown message id when using STDIO but not with TCP #240