abo-abo / swiper

Ivy - a generic completion frontend for Emacs, Swiper - isearch with an overview, and more. Oh, man!
https://oremacs.com/swiper/
2.27k stars 337 forks source link

Async processes use a pty instead of a pipe #1822

Open basil-conto opened 5 years ago

basil-conto commented 5 years ago

Commit 9414f7ab9b479cc6a66539b8f1494c81734d9acf made me realise that asynchronous Counsel processes like counsel-ag and counsel-rg use a pty, rather than the more efficient pipe (see (elisp) Asynchronous Processes and #1759). I don't think any Counsel functionality relies on pty features, so I think we should work towards switching to pipes wholesale.

I tried the following change on top of PR #1821:

diff --git a/counsel.el b/counsel.el
index ff51c6a..ef72895 100644
--- a/counsel.el
+++ b/counsel.el
@@ -148,7 +148,8 @@ counsel--async-command
   (setq name (or name " *counsel*"))
   (when (get-buffer name)
     (kill-buffer name))
-  (let* ((buf (get-buffer-create name))
+  (let* ((process-connection-type nil)
+         (buf (get-buffer-create name))
          (proc (if (listp cmd)
                    (apply #'start-file-process name buf cmd)
                  (start-file-process-shell-command name buf cmd))))

This decreased the time it takes for counsel-rg to find 47690 occurences of cons in the Emacs source tree from ~20s to under 2s. Unfortunately, it also completely breaks many other users of counsel--async-command, such as counsel-ack and counsel-ag. In these latter cases, no results are shown whatsoever.

I'm not sure, but my gut instinct tells me this may even be an Emacs pitfall/bug pertaining to while-no-input and/or pselect(2). This is because various intricacies with asynchronous process output have been reported over the last year, e.g.:

What I'm even less sure about is why counsel-ag et al. suffer from this, but counsel-rg doesn't.

I thought I would open this issue to see if anyone else notices the same behaviour, to discuss switching to pipes in general, and perhaps even to spark some interest in the underlying bug(s).

The example above was tested with latest Ivy and Emacs:

In GNU Emacs 27.0.50 (build 62, x86_64-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2018-11-24 built on thunk
Repository revision: 3aa22e6ec615c5dadb134f1e45ee9bb3034518b7
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12003000
System Description: Debian GNU/Linux buster/sid

Configured using:
 'configure --config-cache --prefix=/home/blc/.local --with-mailutils
 --with-x-toolkit=lucid --with-modules --with-file-notification=yes
 --with-x 'CC=ccache gcc' 'CFLAGS=-O2 -march=native -pipe''

Configured features:
XAW3D XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS
GLIB NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT
LIBOTF XFT ZLIB TOOLKIT_SCROLL_BARS LUCID X11 XDBE XIM MODULES THREADS
LIBSYSTEMD JSON LCMS2 GMP

The precise steps I took were:

  1. Apply the patch above on top of #1821.
  2. make compile
  3. emacs -Q -l colir.elc -l ivy-overlay.elc -l ivy.elc -l swiper.elc -l counsel.elc -l targets/plain.el
  4. M-:(cd source-directory)RET
  5. C-ckcons
  6. Time how long it takes to collect all 47690 results
abo-abo commented 5 years ago

Thanks for the info. I can reproduce the 10x speedup for ripgrep.

I've done some debugging. For the above use case:

Basically with process-connection-type set to nil, it takes less than 0.5 seconds to insert the whole 3 megabytes of ripgrep output into an Emacs buffer, so that ivy--set-candidates code is called only once.

By the way, that branch of counsel--async-filter is pretty inefficient: every time new input comes in (which is 37 times in this case), all old info is discarded and the whole buffer is re-parsed with split-string etc.

A significant speedup would be to call split-string only on the newly inserted portion of the buffer, i.e. str, and appending the result to ivy--all-candidates.

An even more significant speedup would be to rewrite the logic so that ivy--all-candidates could be either a list of strings or a buffer.

basil-conto commented 5 years ago

if process-connection-type is nil, the numbers are 966 and 0.

For counsel-rg, I get between 900-1200 and 1. For counsel-ag, I get 0 and 0.

Basically with process-connection-type set to nil, it takes less than 0.5 seconds to insert the whole 3 megabytes of ripgrep output into an Emacs buffer, so that ivy--set-candidates code is called only once.

Okay, but counsel--async-filter doesn't get called at all for counsel-ag, even after setting counsel-async-filter-update-time to 0.

A significant speedup would be to call split-string only on the newly inserted portion of the buffer, i.e. str, and appending the result to ivy--all-candidates.

Sure, so long as care is taken when joining potentially incomplete lines (process filters can receive input of arbitrary size).

An even more significant speedup would be to rewrite the logic so that ivy--all-candidates could be either a list of strings or a buffer.

Either way, I think it's more important to fix the counsel-ag pipe problem first, before considering any further optimisations.

basil-conto commented 5 years ago

Either way, I think it's more important to fix the counsel-ag pipe problem first, before considering any further optimisations.

Looks like it's a bug with either Emacs or ag:

  1. make plain
  2. (setq lexical-binding t)

    C-j

  3. (defun my-ag (&rest args)
     (let ((p (make-process :name "*my-ag*"
                            :buffer "*my-ag*"
                            :command (cons "ag" args)
                            :connection-type 'pipe)))
       (add-function :after (process-sentinel p)
                     (lambda (p _)
                       (when (eq (process-status p) 'exit)
                         (display-buffer (process-buffer p))))))
     nil)

    C-j

  4. (my-ag "counsel--async-command")

    C-j

    Inspecting M-xlist-processesRET reveals that this process hangs without producing any output.

  5. C-xk\*my-ag\*RETyesRET
  6. (my-ag "counsel--async-command" (expand-file-name default-directory)
    ;; OR
    (my-ag "counsel--async-command" ".")

    C-j

    This time the process runs and exits successfully, but the output lacks file names and line numbers. What's worse, no combination of ag switches that I have tried seems to include this information. I have tried various combinations of --nogroup, --filename, --nopager, --numbers, and --nomultiline, to no avail.

basil-conto commented 5 years ago

This time the process runs and exits successfully, but the output lacks file names and line numbers. What's worse, no combination of ag switches that I have tried seems to include this information. I have tried various combinations of --nogroup, --filename, --nopager, --numbers, and --nomultiline, to no avail.

Ha, (my-ag "--vimgrep" "counsel--async-command" ".") seems to do the job. :)

(See #1800 for an issue with --vimgrep, though.)

basil-conto commented 5 years ago

Ha, (my-ag "--vimgrep" "counsel--async-command" ".") seems to do the job. :)

This and various comments on the silver searcher's issue tracker have convinced me that this is a bug in ag, not Emacs/Counsel.

Alexander-Shukaev commented 4 years ago

I think it's worth to apply the change proposed by @basil-conto as all issues associated with it seem to be resolved.

CeleritasCelery commented 4 years ago

FWIW, I tried the change proposed in the original message and it broke ripgrep for me on linux. But ag worked fine and was much faster under the same test. 🤷‍♀