Open Alizter opened 1 year ago
coq-lsp
server lifetime follows the LSP specification, it is the client who controls the lifetime; do you have a more concrete example?
Open T2048.v and then close the buffer. Coq lsp still runs for a bit.
Ok will try; actually the most likely problem is that T2048.v
gets deep into the kernel to check the Qed
, such deep conversion loops don't do check_for_interrupt
often so there you have your problem: the client sends the shutdown message properly, however the queue is not processed.
So if my guess is correct, the problem is more general, and the issue could be retitled as "Coq doesn't call check_for_interrupt
often enough, and coq-lsp becomes unresponsive", and that's a kind of upstream issue.
Recall that at first I was skeptical that check_for_interrupt
would work well enough, but actually it has!
We've been discussing with OCaml devs how to do this better (and have a check on allocation), but the tradeoffs here are complex.
I think for now we just need to try to gather all the cases where coq-lsp gets stuck, maybe just an extra check in Coq could solve this without an invasive signal-handling patch.
So if my guess is correct, the problem is more general, and the issue could be retitled as "Coq doesn't call check_for_interrupt often enough, and coq-lsp becomes unresponsive", and that's a kind of upstream issue.
Yes, that is a better title. Updated.
Recall that at first I was skeptical that check_for_interrupt would work well enough, but actually it has!
This is good news. Let us try to make better use of it then.
We've been discussing with OCaml devs how to do this better (and have a check on allocation), but the tradeoffs here are complex.
Do you mean: https://github.com/ocaml/ocaml/issues/11411?
I think for now we just need to try to gather all the cases where coq-lsp gets stuck, maybe just an extra check in Coq could solve this without an invasive signal-handling patch.
Let's definitely try that first.
So indeed after the wait you have confirmed that coq-lsp is shutdown normally right?
Do you mean: https://github.com/ocaml/ocaml/issues/11411?
Yes, that was the start of the discussion.
You could use the profiler to see where the inner loop is here, and patch with a check_for_interrupt. But beware the perf hit maybe be large.
I did some more research on this example, the problem seems at:
#3956 0x000056489b651ac7 in camlDeclare__to_constr_body_6449 () at vendor/coq/vernac/declare.ml:1623
#3957 0x000056489b651bf7 in camlDeclare__fun_10409 () at vendor/coq/vernac/declare.ml:1648
Note the stack trace at almost 4000 deep! Another bad spot:
#5604 0x0000563d779cf526 in camlTypeops__execute_2721 () at vendor/coq/kernel/typeops.ml:615
#5605 0x0000563d779d11f3 in camlTypeops__infer_2985 () at vendor/coq/kernel/typeops.ml:806
At one point I saw this at more than 8000 depth!
So indeed the Qed
here is very expensive and a few operations take most of the time; not sure we can do better with the current tech without incurring in an slowdown as these Qed
paths are really sensitive.
cc #137
Coq-lsp should terminate when the buffer is closed. Otherwise it stays open and has to be restarted manually.