Closed maddenp closed 2 years ago
@maddenp thanks for the report, what is happening is that the client (lsp-mode) should send the initialize
request with processId
as a int or null, from your client logs, it seems it's null which is ok, but from server side logs, it's trying to parse a object ({}
) which is weird.
I'm not sure if we should spend that much time since we removed lsp4j integration on master and it should be available on next release, so it'd be nice if you could test with a nightly clojure-lsp build, from here, WDYT?
c/c @mainej in case we have the same issue on lsp4clj even with lsp4j removal
Hi @ericdallo, thanks for the fast response! I can confirm replacing the emacs-downloaded clojure-lsp
executable with the one from https://github.com/clojure-lsp/clojure-lsp-dev-builds/releases/tag/2022.08.16-19.32.20-nightly immediately solved the problem for me.
Hmm, a few seconds after starting, though, I do get e.g.
Server clojure-lsp:28782 exited with status exit(check corresponding stderr buffer for details). Do you want to restart it? (y or n)
which repeats if I answer y
. I don't see anything interesting in the lsp
-related buffers, and in the stderr
log just
Process clojure-lsp stderr finished
Anything I could try/check that might be helpful to you? If not, I'm happy to wait for a proper release.
The release will be eventually just a promotion of a nightly to a release, so if your issue was not fixed on master/nightly, it won't be fixed on next release 😅
That kind of crash is bad, but we should have some log regarding that somewhere
From the end of /tmp/clojure-lsp.*.out
:
2022-08-16T21:58:14.459Z INFO [clojure-lsp.server:38] - Liveness probe - Parent process {} is not running - exiting server
2022-08-16T21:58:14.459Z INFO [clojure-lsp.server:409] - Exiting...
Interesting?
It looks like both builds ended up parsing processId
as an empty JSON object {}
. The "old" version of lsp4clj, which uses lsp4j to parse JSON failed with Expected an int but was BEGIN_OBJECT
. And the "new" version of lsp4clj, which uses cheshire to parse JSON failed with Parent process {} is not running
where {}
is usually an integer.
That contradicts the client-side tracing, which says that emacs-lsp sent "processId": null
. So I can think of two possibilities.
First, the problem could be on the clojure-lsp side, the server side. Perhaps it is receiving null
but turning it into {}
somewhere, instead of into nil
. I don't see where this could be happening, but I won't rule it out.
Second, and what I think is more likely, is that emacs-lsp is using two different libraries (or code paths), one to pretty print the traces, and another to serialize the object and send it to clojure-lsp. The library that does the second part could be serializing the (ELisp) nil
into (JSON) {}
instead of null
.
We could try to verify that with a build of clojure-lsp
that turns on server-side tracing. That would add traces to the server logs too, although they would be traces not of the JSON the server gets on the wire, but of the object it has immediately after cheshire has parsed the JSON. Assuming those traces do suggest clojure-lsp is receiving "processId": {}
, I'm not sure what we would do with that information. We could certainly make the code that handles :process-id
more defensive, but if all nulls are converted to objects, we'll have problems all over the place. So I think all it would do is rule out clojure-lsp as the problem and shift the blame over to emacs-lsp, or whatever it is in your emacs environment that's doing JSON serialization.
Sorry—this isn't a very satisfying answer. Would you like to try turning on server-side tracing @maddenp? Have you ever built clojure-lsp before or would you want guidance?
Hi @mainej, and thank you. I followed the instructions at https://clojure-lsp.io/building/ and was able to make debug-cli
, but the resulting clojure-lsp
doesn't seem to offer anything extra in the server log, to my untrained eye, at least. Should I build from a branch other than master
, or follow a different procedure to build an executable with tracing enabled? I agree that it would be worth exonerating clojure-lsp
.
Yes, it might indicate a bug on lsp4clj recent changes where we stopped using lsp4j @mainej, maybe we are still probing even if client didn't send a processId?
Sorry @maddenp, I should have explained... you have to edit this line, changing :trace? false
to :trace? true
. That will turn on server-side tracing. Otherwise, you followed the right procedure. When you have a chance to try again, let me know how it goes.
@ericdallo it looks to me like the client is sending a processId, but it's sending a map instead of null or an integer. By my reading of the traces and logs, that's happening both before and after the removal of lsp4j. The old version blows up a little earlier—the Java JSON parser explodes when it finds a map where it was expecting to find a null or an integer. It's type checking the request object. In the new version the Cheshire JSON parser doesn't do any type checking. If it sees a JSON map, it parses it into a Clojure map, and passes that to the rest of the code. Later on we try to probe for a running process whose id is {}
(assuming that any non-null value is an integer) and things go bad. So, same problem, different symptoms. We'll verify whether the JSON has a map in it when @maddenp has a chance to trace what the server sees as input.
@mainej That worked, thanks. I'm guessing this is the block from /tmp/clojure-lsp.*
you're looking for, but please let me know if you need more / different:
2022-08-17T20:52:25.800Z INFO [clojure-lsp.server:523] - [SERVER] Starting server...
2022-08-17T20:52:27.343Z INFO [clojure-lsp.nrepl:18] - ====== LSP nrepl server started on port 46723
2022-08-17T20:52:27.370Z INFO [clojure-lsp.server:415] - Initializing...
2022-08-17T20:52:27.371Z DEBUG [clojure-lsp.server:38] - [Trace - 2022-08-17T20:52:27.362Z] Received request 'initialize - (1)'
Params: {
"process-id" : { },
"root-path" : "/home/pmadden/git/gencfn",
"client-info" : {
"name" : "emacs",
"version" : "GNU Emacs 28.1 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)\n of 2022-08-09"
},
...
By way of comparison, I copied this clojure-lsp
executable to the system where I'm NOT having this problem, and the corresponding block looks like
2022-08-17T20:56:13.505Z INFO [clojure-lsp.server:523] - [SERVER] Starting server...
2022-08-17T20:56:14.881Z INFO [clojure-lsp.nrepl:18] - ====== LSP nrepl server started on port 44871
2022-08-17T20:56:14.939Z INFO [clojure-lsp.server:415] - Initializing...
2022-08-17T20:56:14.940Z DEBUG [clojure-lsp.server:38] - [Trace - 2022-08-17T20:56:14.905Z] Received request 'initialize - (1)'
Params: {
"process-id" : null,
"root-path" : "/users/PYS0343/pmadden/git/gencfn",
"client-info" : {
"name" : "emacs",
"version" : "GNU Emacs 28.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.30, cairo version 1.15.12)\n of 2022-07-23"
},
...
Am I right in thinking that this supports your hypothesis that it's the client that's sending different data to clojure-lsp
on the different systems and triggering this problem? I wonder if this is down to something emacs is linking against.
I checked the dynamic libraries required by emacs
on the correctly functioning system and found
% readelf -d $(which emacs) | grep NEEDED
...
0x0000000000000001 (NEEDED) Shared library: [libjansson.so.4]
...
Suspecting the usual free-software pun, I checked and, indeed, this is a JSON library. On my broken system my emacs
executable did not link to this library, so I installed libjansson-dev
from my OS package manager and rebuilt emacs
with the --with-json
flag, which is described in ./configure --help
as
--with-json compile with native JSON support
With this new build, clojure-lsp
starts correctly and does not prematurely exit.
@maddenp nice debugging!
@ericdallo you'd have a better sense–does this belong in the troubleshooting docs for lsp-mode?
Nice debugging indeed! @mainej yes, definitely something to be on lsp-mode torubleshooting, which can happen with any server, if we can't fix it on lsp-mode somehow we could try to fix it on server side but wouldn't be an ideal solution.
@maddenp Could you open a issue on lsp-mode linking this one and explaining your findings regarding the jansson lib? Or a PR if you want to improve troubleshooting.
I'm closing this as it doesn't seem to be a issue to be fixed on server side for now, thanks for the help @mainej
Yes, I'll open an issue on lsp-mode
. Thanks for your help with the current issue.
What appears to be essentially the same issue was already reported in https://github.com/emacs-lsp/lsp-mode/issues/3629. The issue was closed but there doesn't seem to be a definitive solution apart from installing libjansson-dev
. I'll link that issue to this one in case it's helpful to anyone finding it. I'm afraid what I found here doesn't add anything new that would warrant opening a new issue, but please let me know if you disagree.
Describe the bug When I open a
.clj
file,clojure-lsp
tries to start but continually says e.g.LSP[clojure-lsp:27231/starting]
in the modeline, and*clojure-lsp::stderr*
showsTo Reproduce I'm not sure. I have three systems, two on which this issue reliably occurs, and one on which I cannot make it happen at all. All are using emacs 28.1 that I have built myself, all with identical
init.el
files, all usingclojure-lsp
executables with identical versions (downloaded automatically by emacs, not separately installed -- though I've also tried pointing to a manually-downloadedclojure-lsp
, to no avail, on the problematic systems).I made some progress using
lsp-start-plain.el
from https://emacs-lsp.github.io/lsp-mode/page/troubleshooting/:clojure-lsp
started correctly then, so I started with a blankinit.el
and started building myinit.el
back up, adding a little more, restarting emacs and confirming thatclojure-lsp
still worked correctly, adding more, testing, etc., until myinit.el
had everything back that I wanted. But then I wanted to be sure I was on solid ground, so removed everything in my~/.emacs.d
except forinit.el
, started emacs, let all the packages download/compile, loaded a.clj
file, and the stuck-at-starting
issue was back. I wondered if different (and maybe incompatible) packages were being installed when emacs started as a blank slate vs when I added packages a couple/few at a time, but the list ofinstalled
packages shown bypackage-list-packages
is identical between my problematic system and the one that works fine.Maybe this isn't even a bug, or not a
clojure-lsp
bug, bug I'm at a loss as to what to look at next, or the meaning of the stacktrace shown above, and would be grateful for any advice. It was so nice for the day that I hadclojure-lsp
working! I'm happy to provide any needed follow-up information.Expected behavior
clojure-lsp
starts correctly.Log - client <-> server
Log - clojure-lsp
User details (please complete the following information):