emacs-lsp / lsp-mode

Emacs client/library for the Language Server Protocol
https://emacs-lsp.github.io/lsp-mode
GNU General Public License v3.0
4.73k stars 864 forks source link

lsp-mode hangs while "starting" over tramp without lsp-print-io #2709

Open shanemcd opened 3 years ago

shanemcd commented 3 years ago

Hello. I have been debugging hanging / performance issues with lsp-mode when I stumbled upon: https://emacs-lsp.github.io/lsp-mode/page/performance/#check-if-logging-is-switched-off

I tried to do this, but now lsp-mode does not start at all. Adding back (setq lsp-log-io t) solves the "starting" issue, but with that I start to see emacs freezing after using it for some time.

I found https://github.com/emacs-lsp/lsp-mode/issues/1845, which indicated that updating tramp helped, but I am still seeing the issue after updating to tramp 2.5.0.2.

To Reproduce I am able to reproduce this when using any Python project over Tramp. Here is my complete lsp-mode / lsp-ui config:

(use-package lsp-mode
  :init (setq lsp-keymap-prefix "s-l")
  :hook ((python-mode . lsp))
  :commands (lsp lsp-deferred)
  :config
  ;; (setq lsp-log-io t)
  (setq lsp-pyls-configuration-sources ["flake8"])
  (setq lsp-enable-file-watchers nil)
  (setq lsp-pyls-plugins-mccabe-enabled nil)
  (setq gc-cons-threshold 100000000)
  (setq read-process-output-max (* 1024 1024)) ;; 1mb
  (setq lsp-diagnostic-package :none)
  (lsp-register-client
    (make-lsp-client :new-connection (lsp-tramp-connection "pyls")
                     :major-modes '(python-mode)
                     :remote? t
                     :server-id 'pyls-remote)))

(use-package lsp-ui
  :requires lsp-mode flycheck
  :commands lsp-ui-mode
  :custom-face
  (lsp-ui-doc-background ((t (:background nil))))
  (lsp-ui-doc-header ((t (:inherit (font-lock-string-face italic)))))
  :hook (lsp-mode-hook . lsp-ui-mode)
  :custom
  (lsp-ui-doc-enable t)
  (lsp-ui-doc-use-childframe f)
  (lsp-ui-flycheck-enable t)
  (lsp-ui-flycheck-list-position 'right)
  (lsp-ui-flycheck-live-reporting t)
  (lsp-ui-peek-enable t)
  (lsp-ui-peek-list-width 60)
  (lsp-ui-peek-peek-height 25)
  (lsp-ui-doc-header t)
  (lsp-ui-doc-include-signature t)
  (lsp-ui-doc-position 'top)
  (lsp-ui-doc-border (face-foreground 'default))
  (lsp-ui-sideline-enable t)
  (lsp-ui-sideline-ignore-duplicate t)
  (lsp-ui-sideline-show-code-actions t)
  :bind (:map lsp-ui-mode-map
         ([remap xref-find-definitions] . lsp-ui-peek-find-definitions)
         ([remap xref-find-references] . lsp-ui-peek-find-references)))

OS macOS 11.0.1 using emacs 27.1 installed via brew install --cask emacs.

Using /sshx:fedora:/home/shanemcd/ into a Fedora 33 VM.

hpdeifel commented 3 years ago

I also have this on Fedora 33 when tramping into a debian docker container.

My current workaround is to set *lsp-log-io* to t while starting and set it back to nil after some time so the session stays responsive.

ordicker commented 3 years ago

Got the same thing with clang and linux [host-ubuntu remote-ubuntu(nvidia xavier)]

zilti commented 3 years ago

I have the same issue... *lsp-log-io* doesn't help. Where would that log the io to anyway? Maybe something in that log could help.

ordicker commented 3 years ago

That's my lsp-log Command "clangd-10" is present on the path. Command "clangd-10" is present on the path. Found the following clients for /ssh:XXXXX@XXXXXXXX:/home/XXXX/workspace/pipeline/src/blur_exp.cpp: (server-id clangd-remote, priority 0) The following clients were selected based on priority: (server-id clangd-remote, priority 0) I don't find it extremely helpful :disappointed: ,but again when you turn the login on the problem is solved.

zilti commented 3 years ago

Ah, it seems my mistake was to manually (setq *lsp-log-io*) in a scratch buffer. Adding it in my init file very high up helped. What an odd thing... Probably a timing issue, aka a Heisenbug?

yyoncho commented 3 years ago

I suspect the issue is that process-send-string is re-entrant and adding the logging slows down lsp-mode. Adding print statements in the beginning/ending of lsp--send-no-wait will help proving that.

zilti commented 3 years ago

Ah no, the *lsp-log-io* thing is not even good enough as a workaround. Even with it enabled, it manages to get past "starting..." once in about every five times, and usually after having to wait for a few minutes. And if it manages to get from starting to started, it locks up Emacs 75% of the time.

md-arif-shaikh commented 3 years ago

Experiencing the same issue where lsp-mode over tramp freezes emacs after few seconds.

zilti commented 3 years ago

It seems the freezing is "intentional", it freezes until it has analyzed the folder contents. For me it freezes Emacs for about one minute, before asking me if I want it to watch all files in my project.

yyoncho commented 3 years ago

@zilti most likely file watches over tramp are slow. I suggest turning them off by setting lsp-enable-file-watchers to nil

zilti commented 3 years ago

Yes, I assume that's the issue. I'm fine with it, there's worse things to happen than to have to wait a minute at the beginning of a work day before starting to code ^^ just wanted to explain the freezing. But turning it off is probably a good idea for most people.

iostapyshyn commented 3 years ago

I'm experiencing the same problem with clangd on emacs upstream 28.0.50. It does start occasionally and (setq lsp-log-io t) only helps to some extent. SSHing into a debian VM

clangd stderr is:

I[22:21:30.416] Debian clangd version 11.0.1-2~bpo10+1
I[22:21:30.416] PID: 17650
I[22:21:30.416] Working directory: /home/user/linux
I[22:21:30.416] argv[0]: clangd
I[22:21:30.416] argv[1]: --header-insertion-decorators=0
I[22:21:30.417] Starting LSP over stdin/stdout
E[22:21:30.443] JSON parse error: [3:2323, byte=2325]: Expected , or } after object property
antifuchs commented 3 years ago

I've been trying to use a rust-analyzer LSP with lsp-tramp-connection for a few weeks now, and never went anywhere, so I decided to debug it. The problem seems to be that LSP tries to use stdio communication over a channel that is inherently crippled - ssh stdio with some semi-terminal emulation in the mix.

That starts with the stdout/stderr confusion (which lsp tries to work around by redirecting to a file), continues on to TTY problems and \n->\r translation (worked around with stty raw, which has its own stdout/stderr problems...) and then continues to weird buffering issues.

So, the solution for me ended up being to not use lsp-tramp-connection at all and use ssh's portforwarding instead.

Here's the connection definition that I'm using now for rust-analyzer, which works almost like a local connection (and is much faster than local, since the remote is a far more powerful machine than my laptop is):

(defun lsp-tramp-connection-over-ssh-port-forwarding (command)
  "Like lsp-tcp-connection, but uses SSH portforwarding."
  (list
   :connect (lambda (filter sentinel name environment-fn)
              (let* ((host "localhost")
                     (lsp-port (lsp--find-available-port host (cl-incf lsp--tcp-port)))
                     (command (with-parsed-tramp-file-name buffer-file-name nil
                                (message "[tcp/ssh hack] running LSP %s on %s / %s" command host localname)
                                (let* ((unix-socket (format "/tmp/lsp-ssh-portforward-%s.sock" lsp-port))
                                       (command (list
                                                 "ssh"
                                                 ;; "-vvv"
                                                 "-L" (format "%s:%s" lsp-port unix-socket)
                                                 host
                                                 "socat"
                                                 (format "unix-listen:%s" unix-socket)
                                                 (format "system:'\"cd %s && %s\"'" (file-name-directory localname) command)
                                                 )))
                                  (message "using local command %s" command)
                                  command)))
                     (final-command (if (consp command) command (list command)))
                     (_ (unless (executable-find (cl-first final-command))
                          (user-error (format "Couldn't find executable %s" (cl-first final-command)))))
                     (process-environment
                      (lsp--compute-process-environment environment-fn))
                     (proc (make-process :name name :connection-type 'pipe :coding 'no-conversion
                                         :command final-command :sentinel sentinel :stderr (format "*%s::stderr*" name) :noquery t))
                     (tcp-proc (progn
                                 (sleep-for 1) ; prevent a connection before SSH has run socat. Ugh.
                                 (lsp--open-network-stream host lsp-port (concat name "::tcp")))))

                ;; TODO: Same :noquery issue (see above)
                (set-process-query-on-exit-flag proc nil)
                (set-process-query-on-exit-flag tcp-proc nil)
                (set-process-filter tcp-proc filter)
                (cons tcp-proc proc)))
   :test? (lambda () t)))

(lsp-register-client
 (make-lsp-client :new-connection (lsp-tramp-connection-over-ssh-port-forwarding "rust-analyzer")
                  :major-modes '(rust-mode)
                  :initialization-options '((omitInitBuild . t)
                                            (cmdRun . t))
                  :notification-handlers (ht ("window/progress" 'lsp-clients--rust-window-progress))
                  :action-handlers (ht ("rls.run" 'lsp-rust--rls-run))
                  :remote? t
                  :server-id 'rust-analyzer-remote))

This code uses socat on the remote end to set up SSH forwarding such that the remote end is still an stdio-based language server (as that's the only way rust-analyzer runs at the moment), and connects through a unix domain socket (just so the remote connection doesn't occupy a potentially-taken local port). You could adjust this to listen on a TCP port, but this seemed safer to me.

Hope this is useful to somebody else.

One other word of caution though, I have tried setting up the same connection function for the pyright language server, but it stubbornly refuses to send responses back to emacs after opening a file, so something is wonky there still. Given that rust-analyzer works though, I'd say it is pyright-language-server that is having a bad day, not the ssh forwarding setup.

muirdm commented 3 years ago

I also have startup issues over tramp that go away with a sleep. I'm going to assume it is the same issue (but note that hack language server crashes instead of hangs).

I suspect the issue is that process-send-string is re-entrant and adding the logging slows down lsp-mode. Adding print statements in the beginning/ending of lsp--send-no-wait will help proving that.

I instrumented lsp--send-no-wait but it was not re-entrant. In fact, there was only a single call (the initialization message).

I changed my server command to start under strace so I could see exactly what the server was getting:

1926880 <... read resumed>"Content-Length: 2337\n\n\n\n{\"jsonrpc\":\"2.0\",\"method\":\"initialize\",\"params\":{\"processId\":null,\"rootPath\":\"XXX\",\"clientInfo\":{\"name\":\"emacs\",\"version\":\"GNUEmacs 28.0.50 (build 1, x86_64-apple-darwin20.5.0, NS appkit-2022.50 Version 11.4 (Build 20F71))\\n of 2021-06-15\"},\"rootUri\":\"file:///XXX\",\"capabilities\":{\"workspace\":{\"workspaceEdit\":{\"documentChanges\":true,\"resourceOperations\":[\"create\",\"rename\",\"delete\"]},\"applyEdit\":true,\"symbol\":{\"symbolKind\":{\"valueSet\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]}},\"executeCommand\":{\"dynamicRegistration\":false},\"workspaceFolders\":true,\"configuration\":true},\"textDocument\":{\"declaration\":{\"linkSupport\":true},\"definition\":{\"linkSupport\":true},\"implementation\":{\"linkSupport\":true},\"typeDefinition\":{\"linkSupport\":true},\"synchronization\":{\"willSave\":true,\"didSave\":true,\"willSaveWaitUntil\":true},\"documentSymbol\":{\"symbolKind\":{\"valueSet\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]},\"hierarchicalDocumentSymbolSupport\":true},\"formatting\":{\"dynamicRegistration\":true},\"rangeFormatting\":{\"dynamicRegistration\":true},\"rename\":{\"dynamicRegistration\":true,\"prepareSupport\":true},\"codeAction\":{\"dynamicRegistration\":true,\"isPreferredSupport\":true,\"codeActionLiteralSupport\":{\"codeActionKind\":{\"valueSet\":[\"\",\"quickfix\",\"refactor\",\"refactor.extract\",\"refactor.inline\",\"refactor.rewrite\",\"source\",\"source.organizeImports\"]}},\"resolveSupport\":{\"properties\":[\"edit\",\"command\"]},\"dataSupport\":true},\"completion\":{\"completionItem\":{\"snippetSupport\":true,\"documentationFormat\":[\"markdown\",\"plaintext\"],\"resolveAdditionalTextEditsSupport\":true,\"insertReplaceSupport\":true,\"resolveSupport\":{\"properties\":[\"documentation\",\"details\",\"additionalTextEdits\",\"command\"]},\"insertTextModeSupport\":{\"valueSet\":[1,2]}},\"contextSupport\":true},\"signatureHelp\":{\"signatureInformation\":{\"parameterInformation\":{\"labelOffsetSupport\":true}}},\"documentLink\":{\"dynamicRegistration\":true,\"tooltipSupport\":true},\"hover\":{\"contentFormat\":[\"markdown\",\"plaintext\"]},\"foldingRange\":{\"dynamicRegistration\":true},\"callHierarchy\":{\"dynamicRegistration\":false},\"publishDiagnostics\":{\"relatedInformation\":true,\"tagSupport\":{\"valueSet\":[1,2]},\"versionSupport\":true}},\"window\":{\"workDoneProgress\":true}},\"initializationOptions\":null,\"workDoneToken\":\"1\"},\"id\":1062}\n", 65536) = 2361

Notice the \n\n\n\n instead of \r\n\r\n after header. When I add a sleep before we send the first message the server gets \r\n\r\n properly.

I'm running Emacs 28.0.50 I built this week.

muirdm commented 3 years ago

I think we can fix it by echoing a string after the stty and waiting for it (so we can be sure the stty has finished before we start writing stuff to the process). Something like:

  diff --git a/lsp-mode.el b/lsp-mode.el
index e40775764..75e2ace99 100644
--- a/lsp-mode.el
+++ b/lsp-mode.el
@@ -6878,7 +6882,7 @@ returns the command to execute."
                           (process-name (generate-new-buffer-name name))
                           (wrapped-command (s-join
                                             " "
-                                            (append '("stty" "raw" ";")
+                                            (append '("stty" "raw" ";" "echo" "ready" ";")
                                                     final-command
                                                     (list
                                                      (concat "2>"
@@ -6891,6 +6895,14 @@ returns the command to execute."
                      (let ((proc (start-file-process-shell-command process-name
                                                                    (format "*%s*" process-name)
                                                                    wrapped-command)))
+
+                       (set-process-filter proc (lambda (proc input)
+                                                  (when (not (string= input "ready"))
+                                                    (lsp-warn "Unexpected startup message: %s" input))))
+
+                       (unless (accept-process-output proc 5)
+                         (lsp-warn "Didn't receive ready message on startup."))
+
                        (set-process-sentinel proc sentinel)
                        (set-process-filter proc filter)
                        (set-process-query-on-exit-flag proc nil)
yyoncho commented 3 years ago

@muirdm there is also this - https://github.com/emacs-lsp/lsp-mode/pull/2531

Have you tested with latest version of tramp? Although this proposal might fix the issue I would like to avoid it because it seems more like a hack to avoid tramp bug.

muirdm commented 3 years ago

I'm using tramp 2.5.1-pre. #2531 looks like it should work instead of my patch - I'll try it later. But will people be able to use tramp 2.5 on stable Emacs?

yyoncho commented 3 years ago

But will people be able to use tramp 2.5 on stable Emacs?

I think that it is available via elpa.

muirdm commented 3 years ago

Apart from some minor bugs, #2531 seems to work great and gives a much better experience overall.

zilti commented 2 years ago

Any updates on this?

sstepashka commented 2 years ago

Any updates?

manuel-arguelles commented 2 years ago

In my case it starts clangd-12 in the container and starts indexing just file, but emacs is locked... cpu usage is not at 100% but it's totally unresponsive. (emacs 27.1) tramp 2.5

Blade6570 commented 2 years ago

Any updates on it? One year has passed and not able to find any other discussion on it.

KJordanBerg commented 2 years ago

I'm also interested to know if there are any updates. I'd really like to use LSP-mode over Tramp.

TMueller83 commented 2 years ago

I'm affected by the same problem too. Would be happy about a fix.

b-spencer commented 2 years ago

FWIW, I managed to catch this in a locally rebuilt version of the stock Debian 10 backport of emacs-lucid 27.1 with the debugger. I am not familiar with emacs internals, but I was able to break execution at many places and it looked like it was evaluating Lisp over and over again forever. In other words, if the C core of Emacs was in an infinite loop, it sure wasn't a tight one. It felt like it could have been an infinite loop in Lisp, or perhaps an unending series of Lisp events to process. At one point I caught it deep in the garbage collector, but the rest of the time it was churning through byte code at many layers of C.

albusshin commented 1 year ago

I've been using @antifuchs 's code above until recently I upgraded all the packages in my doom emacs, and it stopped working with the error of wrong-number-of-arguments.

This was due to the number of argument changed in commit 59eeba73a6e1064d10efefa9a8ba0de20c938591. The following modified version should work:

(defun lsp-tramp-connection-over-ssh-port-forwarding (command)
  "Like lsp-tcp-connection, but uses SSH portforwarding."
  (list
   :connect (lambda (filter sentinel name environment-fn _workspace)
              (let* ((host "localhost")
                     (lsp-port (lsp--find-available-port host (cl-incf lsp--tcp-port)))
                     (command (with-parsed-tramp-file-name buffer-file-name nil
                                (message "[tcp/ssh hack] running LSP %s on %s / %s" command host localname)
                                (let* ((unix-socket (format "/tmp/lsp-ssh-portforward-%s.sock" lsp-port))
                                       (command (list
                                                 "ssh"
                                                 ;; "-vvv"
                                                 "-L" (format "%s:%s" lsp-port unix-socket)
                                                 host
                                                 "socat"
                                                 (format "unix-listen:%s" unix-socket)
                                                 (format "system:'\"cd %s && %s\"'" (file-name-directory localname) command)
                                                 )))
                                  (message "using local command %s" command)
                                  command)))
                     (final-command (if (consp command) command (list command)))
                     (_ (unless (executable-find (cl-first final-command))
                          (user-error (format "Couldn't find executable %s" (cl-first final-command)))))
                     (process-environment
                      (lsp--compute-process-environment environment-fn))
                     (proc (make-process :name name :connection-type 'pipe :coding 'no-conversion
                                         :command final-command :sentinel sentinel :stderr (format "*%s::stderr*" name) :noquery t))
                     (tcp-proc (progn
                                 (sleep-for 1) ; prevent a connection before SSH has run socat. Ugh.
                                 (lsp--open-network-stream host lsp-port (concat name "::tcp")))))

                ;; TODO: Same :noquery issue (see above)
                (set-process-query-on-exit-flag proc nil)
                (set-process-query-on-exit-flag tcp-proc nil)
                (set-process-filter tcp-proc filter)
                (cons tcp-proc proc)))
   :test? (lambda () t)))

(lsp-register-client
 (make-lsp-client :new-connection (lsp-tramp-connection-over-ssh-port-forwarding "rust-analyzer")
                  :major-modes '(rust-mode)
                  :initialization-options '((omitInitBuild . t)
                                            (cmdRun . t))
                  :notification-handlers (ht ("window/progress" 'lsp-clients--rust-window-progress))
                  :action-handlers (ht ("rls.run" 'lsp-rust--rls-run))
                  :remote? t
                  :server-id 'rust-analyzer-remote))
seokbeomKim commented 9 months ago

Hmm.. Is this issue reproduced in Doomemacs only? @albusshin 's workaround works well but this issue was reported over 2 years ago. I would like to know why the workaround is not applied to the mainline.

zilti commented 9 months ago

Because the workaround is not actually working. And at least for me it also happened in Spacemacs.

seokbeomKim commented 9 months ago

@zilti I see .... then I need to find an alternative for working on remote sessions. Thank you anyway.