Closed: gitonthescene closed this issue 6 months ago
It looks like the record command is struggling to process the messages being captured by the agent.
Failed to handle notification "message/text": MessageText(text='Content-Length: 162\r\n\r\n{"jsonrpc":"2.0","method":"textDocument/semanticTokens/full","params":{"textDocument":{"uri":"file:///Users/XXXX/K/ngnk/ngnk-lsp/unitest.k"}},"id":27}\n', timestamp=1709523564.392576, session='df629c07-61ae-43a1-817d-d3ce5d44830b', source='client')
Traceback (most recent call last):
...
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 192)
What seems suspicious is the newline at the end of the MessageText payload; I would normally expect the text to end at the final } character...
Content-Length: 162\r\n\r\n{"jsonrpc":"2.0","method":"textDocument/semanticTokens/full","params":{"textDocument":{"uri":"file:///Users/XXXX/K/ngnk/ngnk-lsp/unitest.k"}},"id":27}\n
However, that on its own isn't enough to trigger the error... but if I add some extra text I can reproduce it:
>>> json.loads('{"jsonrpc":"2.0","method":"textDocument/semanticTokens/full","params":{"textDocument":{"uri":"file:///Users/XXXX/K/ngnk/ngnk-lsp/unitest.k"}},"id":27}\nContent-Length: ')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib64/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/json/decoder.py", line 340, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 151)
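(As an aside, and not something lsp-devtools itself does: when trailing data after a JSON object is expected, json.JSONDecoder.raw_decode from the standard library parses the leading object and reports where it stopped, instead of raising "Extra data". A minimal illustration, with a made-up payload:)

```python
import json

# Hypothetical illustration: parse the leading JSON object and keep the
# leftover text, rather than letting json.loads() raise "Extra data".
text = '{"jsonrpc":"2.0","id":27}\nContent-Length: '
obj, end = json.JSONDecoder().raw_decode(text)
leftover = text[end:]
print(obj["id"])       # the parsed object is usable as normal
print(repr(leftover))  # the unparsed tail, e.g. the next message's header
```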
I don't know exactly where the issue is coming from, but I have a hunch that the Content-Length header is being calculated incorrectly, causing the message parsing code to read too many characters from the stream... :thinking:
>>> len('∊')
1
>>> len('∊'.encode())
3
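In other words (a sketch, not lsp-devtools code; the body string here is invented), the Content-Length header has to be computed from the UTF-8 byte length of the body, not its character count:

```python
# Sketch: framing an LSP message. The body below is a made-up example.
body = '{"jsonrpc":"2.0","method":"window/logMessage","params":{"message":"(∊)"}}'

chars = len(body)                  # character count: what a decoded str gives you
nbytes = len(body.encode("utf8"))  # byte count: what Content-Length must carry

# '∊' is 1 character but 3 bytes, so the two lengths differ by 2 here.
framed = f"Content-Length: {nbytes}\r\n\r\n{body}".encode("utf8")
print(chars, nbytes)
```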
(Thinking aloud here)
The issue is probably here:
logger.info(
    "%s",
    message.decode("utf8"),
    extra={"source": source},
)
The message is decoded from bytes to a utf8 string, which "shortens" a character like ∊ and invalidates the Content-Length header associated with the message... and this is never corrected when the captured message is converted to a MessageText notification:
message = MessageText(
    text=record.args[0],  # type: ignore
    session=self.session,
    timestamp=record.created,
    source=record.__dict__["source"],
)
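One way to correct this (just a sketch of the idea; reframe is a hypothetical helper, not what the actual fix does) would be to rewrite the Content-Length header from the byte length of the decoded body before the captured message is re-parsed:

```python
def reframe(text: str) -> bytes:
    """Rebuild an LSP-framed message so that Content-Length matches the
    UTF-8 byte length of the body (hypothetical helper, for illustration)."""
    head, _, body = text.partition("\r\n\r\n")
    payload = body.encode("utf8")
    # Keep any other headers, but replace the stale Content-Length.
    headers = [h for h in head.split("\r\n")
               if not h.lower().startswith("content-length:")]
    headers.append(f"Content-Length: {len(payload)}")
    return ("\r\n".join(headers) + "\r\n\r\n").encode("utf8") + payload

# A header computed from character count (3) gets corrected to bytes (5).
print(reframe("Content-Length: 3\r\n\r\n(∊)"))
```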
What I can't quite explain to myself yet is why the error in your test.script.txt file is for a semantic tokens request (which, as far as I know, will not contain any unicode) and isn't for a message like textDocument/didOpen, which would contain the contents of your file... :thinking:
I think it's probably because it's a textDocument/semanticTokens/full request which sends the whole document along.
The fix for this should now be available in v0.2.3 of lsp-devtools. It was a fairly major change compared with how things worked previously, so let me know if you run into any issues :)
Awesome! Will try shortly.
I'll try to debug further, but when I try to use
lsp-devtools record --to-file
to record a session connecting my LSP server to a client, with a file which contains unicode, I get a bunch of errors and the record session seems to effectively bail. The server itself seems to keep running. I've attached a script session of the recording attempt, the output, and the file. It does use my LSP server, but I suspect that's not relevant. The file simply contains (∊). Note that the output (test.json) doesn't record past the initialized message, though I interacted with it further before shutting it down. I'm happy to poke around further, but it might be easiest for you to try to reproduce with whatever test server you use rather than for me to explain mine.
test.json test.script.txt unitest.k.txt