emacs-lsp / lsp-mode

Emacs client/library for the Language Server Protocol
https://emacs-lsp.github.io/lsp-mode
GNU General Public License v3.0

One keystroke lag while waiting for completion #2758

Closed blahgeek closed 3 years ago

blahgeek commented 3 years ago

So I encountered this weird problem with ts-ls server: it seems that sometimes, when emacs is waiting for the completion result, there's always one keystroke lagging. It does not happen every time, only sometimes.

For example:


At first, I'm in this state (| means the cursor): xxx.|. There's no completion popup yet (apparently it sometimes takes ts-ls about 10 seconds to return the completion results, but I'm not complaining about that for now) and I think the ts-ls server is processing the request (CPU is at 100%).

Then, I type some random key, like a. Nothing happens. It still displays xxx.|. Even after one or two seconds.

Then, I type another key, say b. Instantly, it displays xxx.a|, without b.

Then, I type another key, say c. Instantly, it displays xxx.ab|, without c.

And so on. There's always one keystroke lag. Any keys, including backspace, have the same behavior.

The lag disappears as soon as the completion popup appears. After that, everything goes back to normal.


My setup: emacs native-comp branch in linux. company-mode with lsp's company-capf as suggested.

I'm not sure what other information you need; please let me know if you need any.

yyoncho commented 3 years ago

To avoid blinking of the popup lsp-mode will return the previously available results if it is unable to retrieve/calculate the new result when the user is typing. This is controlled by lsp-completion-use-last-result.
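
A minimal config sketch, assuming the variable can simply be set from an init file. The comment above only names the variable; whether turning it off has any effect on the lag reported here is not established.

```elisp
;; Assumption: with this set to nil, lsp-mode no longer hands back the
;; previously available completion results while a new request is
;; still being computed.
(setq lsp-completion-use-last-result nil)
```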

blahgeek commented 3 years ago

To avoid blinking of the popup lsp-mode will return the previously available results if it is unable to retrieve/calculate the new result when the user is typing. This is controlled by lsp-completion-use-last-result.

I think it should only affect the content in the completion popup? In my case, the last character I type is not shown in the buffer. I don't think that's expected.

yyoncho commented 3 years ago

Got it. Which commit is your emacs built from? Can you test with other versions as well?

blahgeek commented 3 years ago

I'm using emacs from nixpkgs, version 20210312, which I believe is either https://github.com/emacs-mirror/emacs/commit/82bd6d57d54d4cdb205d921c2476d1dbb17f4188 or https://github.com/emacs-mirror/emacs/commit/d018584814e0c15f13bc458ba54491239b584069 (they changed the commit in the same version once so I'm not sure....)

I will test with latest native-comp version and 27.1 version.

blahgeek commented 3 years ago

I'm using emacs from nixpkgs, version 20210312, which I believe is either emacs-mirror/emacs@82bd6d5 or emacs-mirror/emacs@d018584 (they changed the commit in the same version once so I'm not sure....)

I will test with latest native-comp version and 27.1 version.

I can reproduce this issue in both the latest native-comp version and 27.1 version.

yyoncho commented 3 years ago

Are you able to reproduce with lsp-start-plain.el + emacs 27.1? If yes, can you provide a file/project to reproduce the issue with?

yyoncho commented 3 years ago

(with the description from the PR I am unable to reproduce the issue).

blahgeek commented 3 years ago

@yyoncho I'm able to reproduce this with lsp-start-plain.el:

  1. Download lsp-start-plain.el, and add the following two lines:
     (add-hook 'after-init-hook 'global-company-mode)
     (setq company-idle-delay 0)
  2. Open emacs with env HOME=/tmp/test-home emacs -q -l ./lsp-start-plain.el, and install the ts-ls language server.
  3. Patch the ts-ls language server to make it slower on completion (to simulate a large project): in .emacs.d/.cache/lsp/npm/typescript-language-server/lib/node_modules/typescript-language-server/lib/lsp-server.js line 401 (in the completion(params) function), add the line yield (new Promise(resolve => setTimeout(resolve, 10000))); (always delay 10 seconds).
  4. Restart emacs with the same command and open some-new-file.js.
  5. Type import and wait 10 seconds; the autocompletion popup will appear (as expected).
  6. Type 6 backspaces to delete import, then slowly start typing some other random word, e.g. abcdefg. You will see the lag now (e.g. after typing abcdefg, it only displays abcdef).

yyoncho commented 3 years ago

Thank you, I will take a look.

blahgeek commented 3 years ago

git bisect shows that this issue is introduced in https://github.com/emacs-lsp/lsp-mode/pull/2483

blahgeek commented 3 years ago

As I dig deeper, I think this may be a company-mode issue: I think it should trigger a redisplay before fetching candidates. If I add (redisplay) at the beginning of company--fetch-candidates, this issue will not appear.


update: the root cause seems to be this:

When there are existing completion candidates, company--continue is called directly, without a timer, in post-command-hook. Normally, when the user types more characters, company--continue just filters the existing candidates without making a request to the backend; however, when the user types backspace and changes the prefix, company--continue does make a request to the backend. Note that at this point redisplay hasn't been called yet, so the buffer lags one keystroke behind.
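
The mechanism described above can be imitated without any language server. The following is a hypothetical sketch (the function name is made up for illustration and is not part of lsp-mode or company-mode): a post-command-hook function that blocks until input arrives, so the character just typed may stay invisible while the hook is still running.

```elisp
;; Hypothetical demo of a blocking wait inside `post-command-hook'.
(defun my/blocking-post-command ()
  "Block for up to one second after each self-inserted character.
The wait happens inside `post-command-hook', i.e. before the command
loop gets a chance to redisplay, and it is aborted by pending input."
  (when (eq this-command 'self-insert-command)
    (while-no-input (sleep-for 1))))

;; Enable in the current buffer only:
;;   (add-hook 'post-command-hook #'my/blocking-post-command nil t)
;; Disable again:
;;   (remove-hook 'post-command-hook #'my/blocking-post-command t)
```

The hook is left disabled on purpose; enabling it makes the buffer sluggish, which is the point of the demo.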

My workaround:

    (defun my/redisplay-if-waiting-too-long (orig-fn &rest args)
      "Around advice for `lsp-request-while-no-input'.
    When `this-command' is non-nil (i.e. during `post-command-hook'), arm a
    timer that forces a redisplay if the request takes longer than 0.05s.
    The timer runs inside `input-pending-p'."
      (if (not this-command)
          (apply orig-fn args)
        (let ((timer (run-with-timer 0.05 nil
                                     (lambda () (let ((inhibit-redisplay nil))
                                                  (redisplay))))))
          ;; Cancel the timer even if the request exits non-locally.
          (unwind-protect
              (apply orig-fn args)
            (cancel-timer timer)))))
    (advice-add 'lsp-request-while-no-input :around #'my/redisplay-if-waiting-too-long)

yyoncho commented 3 years ago

I was suspecting the same. #2483 is a fix that exposed the issue. @dgutov, any thoughts?

dgutov commented 3 years ago

@yyoncho I don't have the code on my machine, or a reproduction scenario to fiddle with, but AFAIU this stems from the "faux async" approach you use here (also employed by @joaotavora in Eglot), that you wait for the user's next keypress before deciding what to do (show completions or abort), while the completion framework has no idea what's going on. Thus Emacs's reaction is unavoidably delayed by one keypress.

A (redisplay) call might be an improvement, but redisplay is not free: (benchmark 1 '(redisplay)) shows it can add an extra 20-30ms delay, so I wouldn't want to call it unconditionally.
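
A small sketch for reproducing that kind of measurement, assuming only the built-in benchmark library. The helper name is hypothetical, and the figure it returns depends heavily on configuration and on how much has changed on screen since the last redisplay.

```elisp
(require 'benchmark)

;; Hypothetical helper, not part of lsp-mode or company-mode.
(defun my/redisplay-cost-ms ()
  "Return the time taken by a single `redisplay' call, in milliseconds."
  ;; `benchmark-run' returns a list (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
  (let ((result (benchmark-run 1 (redisplay))))
    (* 1000.0 (car result))))
```

Evaluating (my/redisplay-cost-ms) a few times right after editing the buffer gives a rough idea of the per-call cost on a given setup.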

Try adding (when lsp--throw-on-input (redisplay)) before the request is issued here and see if it improves things: https://github.com/emacs-lsp/lsp-mode/blob/65fb3e8d071406c4596dcc13e3f0230e1f730ec6/lsp-completion.el#L418

yyoncho commented 3 years ago

@dgutov makes sense.

@kiennq to me, it seems like placing (redisplay) just after we called the server is better since we will use the time we wait for the server's response to refresh the screen. WDYT?

FWIW redisplay on my side takes less than 1ms.

kiennq commented 3 years ago

@kiennq to me, it seems like placing (redisplay) just after we called the server is better since we will use the time we wait for the server's response to refresh the screen. WDYT?

Yeah, that looks okay to me. Should be a change in lsp-request-* right?

aaronjensen commented 3 years ago

FWIW redisplay on my side takes less than 1ms.

That's amazing. On my machine it takes 28ms w/ my config and anywhere from 3-20ms with emacs -Q

yyoncho commented 3 years ago

Yeah, that looks okay to me. Should be a change in lsp-request-* right?

Do you mean adding it to lsp-request as well? I don't see a valid reason to add it to lsp-request as well but most likely it won't hurt if we do so.

Edit: just for clarity, we need to add it to lsp-request-while-no-input for sure.

yyoncho commented 3 years ago

FWIW redisplay on my side takes less than 1ms.

That's amazing. On my machine it takes 28ms w/ my config and anywhere from 3-20ms with emacs -Q

I believe if you are on a 4k monitor (@dgutov is) it will affect redisplay speed (but I am not an expert in the field).

aaronjensen commented 3 years ago

Ah, I'm on a retina macbook, likely two things working against me.

dgutov commented 3 years ago

I believe if you are on a 4k monitor (@dgutov is) it will affect redisplay speed (but I am not an expert in the field).

It seems it's affected by whether tool-bar-mode is enabled (so redisplay is slow-ish with emacs -Q) and whether one has a fancy mode-line package installed (smart-mode-line in my case). The window size is more or less irrelevant.

dgutov commented 3 years ago

Edit: just for clarity, we need to add it to lsp-request-while-no-input for sure

It won't affect anything else than completion, right?

yyoncho commented 3 years ago

It seems it's affected by whether tool-bar-mode is enabled (so redisplay is slow-ish with emacs -Q) and whether one has a fancy mode-line package installed (smart-mode-line in my case). The window size is more or less irrelevant.

I don't have toolbar and I am using vanilla modeline since most of the modelines are slow...

dgutov commented 3 years ago

I don't have toolbar and I am using vanilla modeline since most of the modelines are slow...

That seems like a good choice, and we could probably treat slowdown from both as bugs anyway, but until they are fixed across most of the ecosystem, calling redisplay unnecessarily remains unwise.

kiennq commented 3 years ago

It won't affect anything else than completion, right?

Yes. Well, we're waiting for the server response anyway, and that would be much slower than running redisplay

aaronjensen commented 3 years ago

Yes. Well, we're waiting for the server response anyway, and that would be much slower than running redisplay

It's unclear that this is the case because I believe it's introducing a synchronous delay to every keypress that triggers a completion, effectively. This could have the result of making perceived latency be a bit worse while typing normally.

Also, we want to be sure to trigger the redisplay after making the request to the server, yes? Otherwise it just slows everything down.

kiennq commented 3 years ago

Unless we force it, the redisplay is being preempted by input. So I don't think there will be much delay perceived.

kiennq commented 3 years ago

PR here #2772

joaotavora commented 3 years ago

"faux async" approach you use here (also employed by @joaotavora in Eglot

Faux ou vrai, it's not the same technique, it's used in more than LSP and never suffered from this problem, even with slow servers.

dgutov commented 3 years ago

If someone could evaluate lsp-mode and eglot side-by-side with the same LSP server in this scenario and report here, that would be great. It's possible that there is some subtlety to the latter's implementation which lsp-mode could adopt.

In the meantime, though, I have this report which pretty much equates lsp-mode and eglot's behavior wrt input lag: https://github.com/company-mode/company-mode/issues/1073#issuecomment-802262483

joaotavora commented 3 years ago

which pretty much equates lsp-mode and eglot's behavior wrt input lag: company-mode/company-mode#1073 (comment)

To be clear, I was referring to the "one keystroke lag" as I understood it in this issue. I've never seen that "off-by-one" behaviour, but I have of course seen milliseconds of lag. Emacs is single-threaded and event-driven, much like JS and browsers. It only checks for input events, quits, etc. in certain places, so depending on the operation involved (uninterruptible C-level JSON processing?) it could take a while before the user's input is seen. Doesn't seem to be related to the "slowness of the server" (which runs in a different process anyway), but to the amount of processing that Emacs has to do to its output before it notices that it has to drop it on the floor.

there is some subtlety to the latter's implementation which lsp-mode could adopt.

It seems emacs-lsp calls accept-process-output and input-pending-p repeatedly with a 0.001s separation. Seems like a lot of calls and Elisp-level polling, but I never tried it so can't speak to its merits. jsonrpc.el just calls sit-for once with 30s, which calls C-level Fread_event with that same 30s timeout, so waiting for events (key or input) is done in C. Subtle or not, quite a different technique. And while-no-input is yet another wholly different technique to accomplish mostly the same.
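
The sit-for based technique can be sketched roughly as follows. This is not the code from jsonrpc.el or sly.el; my/await and its return values are hypothetical names chosen for illustration, and the block assumes lexical binding (see the first line).

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper illustrating `sit-for' + `catch' + `throw'.
(defun my/await (start-fn timeout)
  "Call START-FN with a callback, then wait for the reply or for input.
START-FN receives one argument, a function to call with the result.
Return (done RESULT) if the callback ran, (timeout) if TIMEOUT seconds
passed, or (input) if the user typed something first."
  (let ((tag (make-symbol "my/await-tag"))
        (live t))
    (unwind-protect
        (catch tag
          (funcall start-fn
                   (lambda (result)
                     ;; Ignore replies that arrive after we stopped waiting.
                     (when live (throw tag (list 'done result)))))
          ;; `sit-for' returns t after a full, uninterrupted wait and nil
          ;; as soon as input arrives; the waiting itself is done in C.
          (if (sit-for timeout) (list 'timeout) (list 'input)))
      (setq live nil))))
```

For example, (my/await (lambda (cb) (run-with-timer 0.2 nil cb "reply")) 30) simulates a server that answers after 0.2 seconds; the callback throws out of sit-for as soon as the timer fires, so there is no Elisp-level polling loop.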

blahgeek commented 3 years ago

I can confirm that eglot does not have such issue.

jsonrpc.el just calls sit-for once with 30s

According to the doc, sit-for will do redisplay before sleeping.


Edit: sorry, I was wrong about the behavior for eglot. See https://github.com/emacs-lsp/lsp-mode/issues/2758#issuecomment-821749615

dgutov commented 3 years ago

According to the doc, sit-for will do redisplay before sleeping.

Ah, so it's already doing what's being discussed here. Thanks for testing!

It does mean that Eglot also incurs the overhead of unnecessary redisplays (which are not free), but there doesn't seem to be any better solution without "proper" async.

yyoncho commented 3 years ago

@dgutov my understanding was that after each command emacs is running redisplay. If this is not the case (I am still not sure) and redisplay is heavy, do you know what lightweight method emacs runs to repaint after a key is typed?

dgutov commented 3 years ago

my understanding was that after each command emacs is running redisplay

It does. Unless there is some pending input.

do you know what lightweight method emacs runs to repaint after a key is typed?

Not really, that's just redisplay. ;-)

You end up doing it twice: if sit-for starts with redisplay, then aborts with user input, which results in another redisplay, you've done it twice. But if you don't do it at the beginning, that can be perceived by some users as "sluggish" (or one-keystroke-lag).

I suppose the main case that's disadvantaged is the "fast server" one: first you wait for redisplay, then you receive the response right after and redisplay again. So @blahgeek's suggestion of using a timer makes sense, but it's a balance between perceived lag and overall latency (some data about how quickly different servers respond should help).

If you do it, BTW, try a noop timer first: IIRC, Emacs automatically triggers redisplay whenever a timer fires (though that might require some additional conditions, not 100% sure).
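
A sketch of the noop-timer variant, with a hypothetical helper name. Whether a firing timer is enough to trigger redisplay is the recollection stated above, not something this sketch verifies.

```elisp
;; Hypothetical helper: run a (possibly blocking) call while a
;; do-nothing timer is armed.
(defun my/call-with-noop-timer (delay fn &rest args)
  "Call FN with ARGS while a noop timer is armed for DELAY seconds.
The timer is cancelled as soon as FN returns or exits non-locally."
  (let ((timer (run-with-timer delay nil #'ignore)))
    (unwind-protect
        (apply fn args)
      (cancel-timer timer))))
```

If the wrapped call returns before DELAY has elapsed (the "fast server" case), the timer never fires and no extra work is done.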

joaotavora commented 3 years ago

I can confirm that eglot does not have such issue. According to the doc, sit-for will do redisplay before sleeping.

Ah, so it's already doing what's being discussed here. Thanks for testing!

FWIW I've changed sly.el which uses the very same technique to pass t as the NODISP parameter to sit-for, and it doesn't seem to make any difference, I don't observe the so-called "one-keystroke-lag" behaviour at all. I'd wager that doing the same in jsonrpc.el would produce similar effects. So I'm skeptical of the conclusions being reached here.

But if you don't do it at the beginning, that can be perceived by some users as "sluggish" (or one-keystroke-lag).

Except this one user here has tried it and doesn't see it, at least I don't see any difference when passing nodisp as t or nil.

(not to mention I don't see a noticeable performance impact of redisplay either but that's maybe because I don't use fancy modelines or such cruft)

Now, when the server is sluggish (i've added a 1 sec wait server side), there is indeed lag on the first keypress while waiting for the reply from the server, the reply that brings completions that reflect that keypress. However, that lag doesn't seem to be related to any kind of 'extra redisplay'. Does company not insert the character into the buffer before requesting the completions? Regardless, this particular behaviour doesn't seem to be related to any kind of 'extra redisplay call' or not.

I'm still investigating though: here's the code I was alluding to, btw: https://github.com/joaotavora/sly/blob/5966d68727898fa6130fb6bb02208f70aa8d5ce3/sly.el#L2420

dgutov commented 3 years ago

Now, when the server is sluggish (i've added a 1 sec wait server side), there is indeed lag on the first keypress while waiting for the reply from the server

That's the issue under discussion (IIUC).

(not to mention I don't see a noticeable performance impact of redisplay either but that's maybe because I don't use fancy modelines or such cruft)

Believe it or not, simply having tool-bar-mode enabled also counts as "fancy modeline".

Regardless, this particular behaviour doesn't seem to be related to any kind of 'extra redisplay call' or not.

Indeed, additional redisplay (which can take up to 30ms, depending on user configuration) isn't going to cause this kind of problem. It can still be unfortunate, making Emacs a tad less responsive than it could be.

Does company not insert the character into the buffer before requesting the completions?

Company doesn't do (or inhibit) any character insertion. You're describing a problem with "missing" redisplay.

joaotavora commented 3 years ago

That's the issue under discussion (IIUC).

Hmmm, I thought it was an odd off-by-one-like thing as described in the original post. I don't see anything like that at all (but I haven't ever seen the original one). And what's more, a user has also reported that the bug is not in Eglot. I wonder if that user (@blahgeek ?) can also try adding nodisp as t to jsonrpc.el's source code, recompiling it, and seeing if it somehow exhibits the bug that is apparently not present right now.

Company doesn't do (or inhibit) any character insertion. You're describing a problem with "missing" redisplay.

That doesn't seem to go together with your theory that Eglot and Sly are somehow insulated from this because they're using sit-for which redisplays, then waits. That is because I see exactly the same behaviour regardless of passing nodisp as t or nil to sit-for.

It also doesn't seem to go together with my experiments. I can't make it go away with some naive redisplay insertions. Where exactly is that "missing" redisplay? Where should I hackingly insert it to see the problem go away?

Finally, my intuition would be that it makes sense for sit-for to redisplay. After all, if one has pressed a key and the intent is to insert it and start some background stuff, it makes some sense to show the effects of that keypress to the user while Emacs "sits" waiting for either that background stuff to finish and produce its effects or for another keypress to come in. I don't see the inherent inefficiency of redisplay here, nor how non-faux "proper async" could somehow solve what seems to be a property of the problem statement itself.

Believe it or not, simply having tool-bar-mode enabled also counts as "fancy modeline".

Interesting. I'm using a tty emacs now, and I turn that off in GUI emacs. Tool bar is pretty ugly anyway, but that's quite beside the point :sweat_smile:

joaotavora commented 3 years ago

Where should I hackingly insert it to see the problem go away?

If I add (redisplay) at the beginning of company--fetch-candidates, this issue will not appear.

Very interesting, turns out the answer is just there, in this @blahgeek comment. I did this and the sly.el lag (which is apparently not the one-keystroke lag reported here) is also fixed.

In fact, looking at company--fetch-candidates it also explains why tweaking NODISP in SLY's sit-for has no effect: that function binds inhibit-redisplay to t. If I remove that binding inhibit-redisplay, then the problem is also fixed. And somehow it still works if I pass NODISP as t.

So I'm leaning towards company--fetch-candidates either calling redisplay itself, or at least not inhibiting the backend's redisplay attempts. It doesn't make much difference with fast backends, but is quite a more pleasant experience with slow ones. Is there an issue in the company repo to track this?

joaotavora commented 3 years ago

As a final bit of investigation, the lag I see is also easily fixed outside company by simply binding inhibit-redisplay to nil around the sit-for. And it again works with nodisp = t, which suggests that some of its callees do some redisplay unconditionally (perhaps Fread_event?).
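
A minimal sketch of that binding, assuming a caller higher up the stack (such as company--fetch-candidates) has bound inhibit-redisplay to t. The wrapper name is hypothetical.

```elisp
;; Hypothetical wrapper, not the actual code in sly.el or jsonrpc.el.
(defun my/sit-for-with-redisplay (seconds &optional nodisp)
  "Like `sit-for', but undo any `inhibit-redisplay' binding set by callers.
SECONDS and NODISP are passed through to `sit-for' unchanged."
  (let ((inhibit-redisplay nil))
    (sit-for seconds nodisp)))
```

The dynamic binding only lasts for the duration of the wait, so the caller's inhibit-redisplay value is back in effect once sit-for returns.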

dgutov commented 3 years ago

Very interesting, turns out the answer is just there, in this @blahgeek comment. I did this and the sly.el lag (which is apparently not the one-keystroke lag reported here) is also fixed.

Guess I should have read the full thread first. :facepalm:

I was also operating on the assumption that the opened PR does fix the problem.

Is there an issue in the company repo to track this?

There is an issue, yes. It links to this one, so it's visible among the comments here.

And it again works with nodisp = t, which suggests that some of its callees do some redisplay unconditionally (perhaps Fread_event?).

It was probably the reason for the use of inhibit-redisplay. And git blame points to https://github.com/company-mode/company-mode/issues/865 as one of the reasons for that code's existence. Too bad the video is gone now.

But the description sounds like the "flickering" @blahgeek has mentioned in the PR thread.

blahgeek commented 3 years ago

I can confirm that eglot does not have such issue.

I'm so sorry, this statement was wrong. eglot has the exact same issue. I got the wrong result because, during the previous debugging, I had changed inhibit-redisplay to nil in company.el, which solves the issue for eglot.

So, essentially, it's the same behavior as described by @joaotavora :

Now, when the server is sluggish (i've added a 1 sec wait server side), there is indeed lag on the first keypress while waiting for the reply from the server

As a final bit of investigation, the lag I see is also easily fixed outside company by simply binding inhibit-redisplay to nil around the sit-for.


That's the issue under discussion (IIUC).

Hmmm, I thought it was an odd off-by-one-like thing as described in the original post. I don't see anything like that at all (but I haven't ever seen the original one)

I think they are the same issue. It's just that this happens for every keystroke with lsp-mode in some scenarios (may be related to #2483), but it only happens at the first (or second?) keystroke with eglot, hence the difference. But that's not really important for this issue.

blahgeek commented 3 years ago

So to summarize:

My personal proposal for lsp-mode is to do the redisplay in a timer, as mentioned in https://github.com/emacs-lsp/lsp-mode/issues/2758#issuecomment-817258467.

At the same time, it would be better for company-mode to keep the current completion result (or update the completion result using the previous cached result?) during company--continue, so that extra redisplay would not introduce flickering. @dgutov any ideas?

joaotavora commented 3 years ago

I think they are the same issue. It's just that this happens for every keystroke with lsp-mode in some scenarios (may be related to #2483), but it only happens at the first (or second?) keystroke with eglot, hence the difference. But that's not really important for this issue.

If there's a difference in the number of keystrokes this affects, as you point out, it can't really be the "same" issue, can it? But there may be causes in common.

So my conclusions are different:

I actually think that maybe inhibit-redisplay should stay in company.el: it's probably meant to prevent flickering for very fast company-capf backends, which maybe are the majority (but then why are these backends calling redisplay and making the flickering?)

Another way is to add another :company-thingy key to company-capf.el.

blahgeek commented 3 years ago

@joaotavora

I think they are the same issue. It's just that this happens for every keystroke with lsp-mode in some scenarios (may be related to #2483), but it only happens at the first (or second?) keystroke with eglot, hence the difference. But that's not really important for this issue.

If there's a difference in the number of keystrokes this affects, as you point out, it can't really be the "same" issue, can it? But there may be causes in common.

Yes I think we meant the same thing. They share the same root cause.

So my conclusions are different:

* There is some input lag with everything using `company.el`'s `company-capf` asynchronously.  Depending on the technique used, the lag may manifest itself once (Eglot's first keypress) or multiple times (the one-keystroke-behind that @blahgeek seems to experience).

* This lag is because `redisplay` isn't being called shortly after a keypress.  But it must be, somehow.  How else would one see it?

* The reason it's not being called is that `company` is preventing it, the backend is failing to ensure it gets called, or both.

Yes, I agree. I meant the same.

* In `sly.el` and `jsonrpc.el` I'm going to bind `inhibit-redisplay` to `nil` around `sit-for`.  Not sure if I should pass `nodisp` to the `sit-for` 's

* I'd still recommend `sit-for` + `throw` + `catch` (or just `jsonrpc.el`) as the elegant technique for solving the responsiveness problem. But good luck with the timer.

I also support this. This is essentially also what https://github.com/emacs-lsp/lsp-mode/pull/2772 is doing for lsp-mode (make sure redisplay is called). However it may introduce flickering as I mentioned.

I actually think that maybe inhibit-redisplay should stay in company.el: it's probably meant to prevent flickering for very fast company-capf backends, which maybe are the majority (but then why are these backends calling redisplay and making the flickering?)

Another way is to add another :company-thingy key to company-capf.el.

For this part, I have a different opinion. I think the proper fix for company-mode is to keep the current completion result, so that the immediate redisplay does not introduce flickering.

joaotavora commented 3 years ago

For this part, I have a different opinion. I think the proper fix for company-mode is to keep the current completion result, so that the immediate redisplay does not introduce flickering.

I see. Makes sense. Indeed the only way to avoid flickering but not so much as to show monstrous lag is to use a low-pass filter, in the form of a finely set timer that calls (let ((inhibit-redisplay nil)) (redisplay)) if it hasn't been cancelled yet. Maybe I'll add that to jsonrpc.el too, or maybe it would be baked into sit-for, dunno. This flickering in Eglot doesn't seem like much of a problem to me though. Because servers are generally slowish (around 0.2 - 0.5s in my case), I prefer to see the key I typed immediately being registered.

dgutov commented 3 years ago

@blahgeek

However, it seems that company-mode would close the previous completion popup before making the new request

Unfortunately, there is a technical reason for that: the overlay-based popup breaks many commands' execution unless the popup is "hidden" in pre-command-hook (if you try to forward-char while the region near point is invisible, the result is not what you expected). So we necessarily hide the overlay in pre-command-hook and show it again at the end of post-command-hook. Perhaps the conversation could be different if we could migrate to a different mechanism for all users (posframe for GUI, something else in the terminal), but that doesn't seem to be happening yet, and posframe-based frontends still have their share of glitches, as well as lower performance.

Now, we could show the popup again at the beginning of company--fetch-candidates, but that raises a few questions. The first is: how to write it in a way that's useful not only for "faux async" backends? Because they take the code path for "synchronous" backends, which are supposed to return their result quickly. So the second question would be whether it's possible to differentiate between the sync and "faux async" backends somehow, because re-rendering the popup, redisplaying, then doing it again after we received the results will be suboptimal. And it's not like synchronous backends are going out of style. "Fast" LSP servers could probably also be counted in that category.

"Redisplay on a timer" could be a good compromise, except the popup will still be hidden, unless company--fetch-candidates only re-renders it unconditionally but doesn't redisplay itself (also extra overhead, about 30-40 ms, if I'm measuring it correctly). I suppose the timer could also call (company-call-frontends 'post-command) itself, making that action specific to "faux async" backends. It's a hack, though, another minor goodbye to being frontend-agnostic.

At the same time, it would be better for company-mode to keep the current completion result

company-candidates is updated later, so it should still have the previous value at that point.

dgutov commented 3 years ago

@joaotavora

In Eglot and SLY I believe this has been a problem since February 2019, where inhibit-redisplay was moved "higher" in company--fetch-candidates (company commit 2b671ecb4644b3b5714448197070ef96c67e243b). This explains why I didn't see the problem when developing the technique in sly.el in 2018/2019.

Just a reminder: the "problem" is the fix which you yourself suggested in the 2019 issue I linked to (invented in parallel to my own solution for flickering for async backends).

but then why are these backends calling redisplay and making the flickering?

https://github.com/company-mode/company-mode/issues/865#issuecomment-460361590

Indeed the only way to avoid flickering but not so much as to show monstrous lag is to use a low-pass filter, in the form of a finely set timer that calls (let ((inhibit-redisplay nil)) (redisplay)) if it hasn't been cancelled yet.

It's a fine intermediate solution, but without re-displaying the popup the flickering is going to be there.

Here's a dirty/experimental patch that does both, but the benchmarking block reports that it takes up to 20ms:

diff --git a/company.el b/company.el
index d9d744d..26045ac 100644
--- a/company.el
+++ b/company.el
@@ -1273,13 +1273,16 @@ update if FORCE-UPDATE."

 (defun company--fetch-candidates (prefix)
   (let* ((non-essential (not (company-explicit-action-p)))
-         (inhibit-redisplay t)
+;         (inhibit-redisplay t)
          (c (if (or company-selection-changed
                     ;; FIXME: This is not ideal, but we have not managed to deal
                     ;; with these situations in a better way yet.
                     (company-require-match-p))
                 (company-call-backend 'candidates prefix)
               (company-call-backend-raw 'candidates prefix))))
+    (benchmark-progn
+      (when company-candidates (company-call-frontends 'post-command))
+      (redisplay))
     (if (not (eq (car c) :async))
         c
       (let ((res 'none))

dgutov commented 3 years ago

The problem does seem fundamental: either we show the newly typed character only after the (possibly long) computation is finished, or we show some old/unfinished state before that happens.

Could someone describe the approach VS Code took regarding that? I'm guessing it also shows the previous completions list until it receives the response. Or perhaps it expects that all backends do fuzzy matching, so it caches the results from the first response and does subsequent filtering on the client. And flickering is not an issue anyway before the popup is shown for the first time.

But it has to deal with "incomplete" results too, right?

One alternative approach that would simplify things is that we mandate, one way or another, that when the popup is visible (i.e. we have already received the first completion result for the given context), the subsequent requests in that completion session have to be made synchronously. Which could work if all language servers (or all major ones, at least) are smart enough to cache the original result set and just re-filter it (quickly). But any exceptions will result in poor ux.

Or we go back to the "show the previous results" option, and try to figure out a way to only do that extra redisplay when it makes sense.

dgutov commented 3 years ago

The latter might look like

diff --git a/company.el b/company.el
index d9d744d..afe99d0 100644
--- a/company.el
+++ b/company.el
@@ -1274,6 +1274,7 @@ update if FORCE-UPDATE."
 (defun company--fetch-candidates (prefix)
   (let* ((non-essential (not (company-explicit-action-p)))
          (inhibit-redisplay t)
+         (refresh-timer (run-with-timer 0.005 nil #'company--sneaky-refresh))
          (c (if (or company-selection-changed
                     ;; FIXME: This is not ideal, but we have not managed to deal
                     ;; with these situations in a better way yet.
@@ -1281,7 +1282,9 @@ update if FORCE-UPDATE."
                 (company-call-backend 'candidates prefix)
               (company-call-backend-raw 'candidates prefix))))
     (if (not (eq (car c) :async))
-        c
+        (progn
+          (cancel-timer refresh-timer)
+          c)
       (let ((res 'none))
         (funcall
          (cdr c)
@@ -1298,10 +1301,16 @@ update if FORCE-UPDATE."
         (while (member (car unread-command-events)
                        '(company-foo (t . company-foo)))
           (pop unread-command-events))
+        (cancel-timer refresh-timer)
         (prog1
             (and (consp res) res)
           (setq res 'exited))))))

+(defun company--sneaky-refresh ()
+  (when company-candidates (company-call-frontends 'post-command))
+  (let (inhibit-redisplay)
+    (redisplay)))
+
 (defun company--flyspell-workaround-p ()
   ;; https://debbugs.gnu.org/23980
   (and (bound-and-true-p flyspell-mode)
joaotavora commented 3 years ago

Just a reminder: the "problem" is the fix which you yourself suggested in the 2019 issue I linked to (invented in parallel to my own solution for flickering for async backends).

Indeed, by that time I was no longer using slow backends and didn't measure the consequences of my suggestion. In my defense I did propose at some point "Perhaps it could delay redisplay a bit to sometimes avoid a redisplay cycle when the dialog isn't ready." Well, that's what I'm proposing again now: delay redisplay a bit.

company-mode/company-mode#865 (comment) (in response to "why are the fast backends doing redisplay")

That explains the how, but doesn't explain the why. That it should be possible doesn't mean it should be pervasive.

It's a fine intermediate solution, but without re-displaying popup the flickering is going to be there.

I don't know if you followed my suggestion, which could be to keep company unchanged, and then make the backends calling sit-for be a little more careful under these rules: "if you predict you're going to take a long time waiting for completions, make sure to redisplay after a short while". I've tried that with a timer in the backend and it removed flickering for very fast backends, and still shows responsiveness for slow ones - a low pass filter. And it's fine for all frontends. Maybe it should be a behaviour put into sit-for, which seems to be designed with the need for redisplaying in mind.
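For illustration, the backend-side timer idea could look roughly like the sketch below. This is only a minimal sketch under stated assumptions, not code from SLY, company, or lsp-mode: the helper name `my/wait-with-delayed-redisplay`, the `done-p` predicate, and the default delay and timeout values are all hypothetical. It relies only on standard primitives (`run-with-timer`, `cancel-timer`, `accept-process-output`, `redisplay`).

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper (not part of company, SLY or lsp-mode).
;; While waiting for a slow completion response, force one redisplay
;; after DELAY seconds.  Fast responses cancel the timer before it
;; fires, so they never trigger the extra redisplay.
(defun my/wait-with-delayed-redisplay (done-p &optional delay timeout)
  "Wait until DONE-P returns non-nil or TIMEOUT seconds have passed.
If the wait lasts longer than DELAY seconds, redisplay once.
Return the final value of DONE-P."
  (let* ((delay (or delay 0.05))      ; assumed default, tune as needed
         (timeout (or timeout 1.0))   ; assumed default, tune as needed
         (start (float-time))
         (timer (run-with-timer
                 delay nil
                 (lambda ()
                   ;; The caller may have bound `inhibit-redisplay' to t,
                   ;; so rebind it to nil around the forced redisplay.
                   (let ((inhibit-redisplay nil))
                     (redisplay))))))
    (unwind-protect
        (progn
          (while (and (not (funcall done-p))
                      (< (- (float-time) start) timeout))
            ;; Timers and process filters can run while we wait here.
            (accept-process-output nil 0.01))
          (funcall done-p))
      ;; Always clean up, whether we finished, timed out, or quit.
      (cancel-timer timer))))
```

A backend would call it with a predicate that turns non-nil once the response has arrived, for example `(my/wait-with-delayed-redisplay (lambda () response-received))`, where `response-received` stands for whatever flag the backend's process filter sets.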

Of course, the same thing can be done in company; maybe it would be cleaner. (I don't know what the extra post-command is for, but that's fine.)

Also I'd like to note that your characterization of these process-based completion backends as "faux" seems to stem from a perspective based solely on company, where the authors have invented something they called async and treat company-capf with some underlying assumptions that don't 100% match reality. In those situations, you say the backend is "faux", the French word for "fake", somehow associating it with illegitimacy. But that's just your opinion. The very same SLY backend works fine with Helm, icomplete and fido-mode, for example (though inside the minibuffer), where no flickering and no lag are seen. In fact, when the new icomplete-vertical-mode is used, it's really visually similar to company. Perhaps company could take a page from that frontend's technique.