[completion] Cache last result of `cider-complete`

alexander-yakushev commented 4 months ago

I've been spending much more time making sense of Emacs completion machinery than it is worth, but at this point it's too late for me to abandon it. Apart from other findings, I discovered that Emacs' completion-at-point can call its completion-at-point-functions multiple times during a single invocation. I've never noticed it before because Company is smarter about that and usually issues a single call to the backend. But with the default settings (three completion styles – (basic partial-completion emacs22)), on a prefix that yields no completion candidates, completion-at-point can yank the "complete" op 4-6 times. As much as I tried to fight it by reducing the number of enabled styles, I could only drop it down to 2. Emacs itself recognizes that this API can call backend many times and encourages caching the response, hence the existence of this function: https://www.gnu.org/software/emacs/manual/html_node/elisp/Programmed-Completion.html#index-completion_002dtable_002dwith_002dcache.

This PR proposes a simple cached cider-complete implementation that remembers the last result and can give it back without touching the backend. The "key" of the cache includes the prefix, the buffer where the completion was initiated, and the point in the buffer. I also thought about adding nrepl-request-counter to the key, so that the cache is properly invalidated if user evaluated something that might change the completion results. But I noticed that eldoc ops often like to squeeze between the calls to complete, rendering the cache useless. So I need advice on this.

I have a feeling that a stale incorrect cache may be quite annoying, so maybe adding an expiration time to the mix could help?

Overall, when the completion is triggered on a large list of candidates (e.g. the prefixcom.), it feels significantly snappier with the cache. Let alone if you call it manually the second time. This would be doubly true when connecting to a remote REPL.

vemv commented 4 months ago

Kudos for the investigation!

I have a feeling that a stale incorrect cache may be quite annoying, so maybe adding an expiration time to the mix could help?

Yes, serving stale completions would seem very unfortunate in environment like Clojure where interactive redefinition is encouraged.

Even more so as CIDER doubles down on emphasizing runtime-gathered insights. Guaranteedly serving 'latest and greatest' makes us different from others.

Therefore agressive expiration times would seem crucial. I'd suggest something like two seconds - enough to prevent same-request redundancy, but not enough to possibly get stale.

An alternative to using expiration times is using a mixture of dynamic scope and memoization. The idea is that memoization lives during the timespan of a completion interaction, but not any longer.

...Very precise, GC-friendly and can have a clean fallback to the non-memoized implementation.

alexander-yakushev commented 4 months ago

An alternative to using expiration times is using a mixture of dynamic scope and memoization. The idea is that memoization lives during the timespan of a completion interaction, but not any longer.

This would be ideal, but we don't control the entrypoints to the completion action, and I couldn't find any sort of flag that the entrypoints would set that could help us tell apart same-session completions from newly triggered ones.

Therefore agressive expiration times would seem crucial. I'd suggest something like two seconds - enough to prevent same-request redundancy, but not enough to possibly get stale.

I also thought of 2 seconds. My only reservation is that on a really slow remote connection the cache could expire within the same completion event, and that sort of environment is where caching is needed the most. But I guess it is the best we can do.

bbatsov commented 4 months ago

This PR proposes a simple cached cider-complete implementation that remembers the last result and can give it back without touching the backend. The "key" of the cache includes the prefix, the buffer where the completion was initiated, and the point in the buffer. I also thought about adding nrepl-request-counter to the key, so that the cache is properly invalidated if user evaluated something that might change the completion results. But I noticed that eldoc ops often like to squeeze between the calls to complete, rendering the cache useless. So I need advice on this.

I think you can also trigger the invalidation when the track-state middleware reports changes. I think that's the safest way to do it, although it means it won't be triggered when cider-nrepl is not around. Probably it won't hurt to have some trivial command to reset the cache in case someone runs into trouble, although for a single cached value that would probably be an overkill. I was wondering if it won't be optimal to cache the last 10 completions or something like this.

alexander-yakushev commented 4 months ago

I think you can also trigger the invalidation when the track-state middleware reports changes.

That's a good suggestion; I also thought in the direction of track-state. Could you give a hint how that check would look like?

I was wondering if it won't be optimal to cache the last 10 completions or something like this.

I really don't think it is necessary. The reason we want to cache completions is not because they are too expensive for us to compute, but only to save the superfluous calls (and thus, latency) because of the peculiarities of Emacs API. I think that saving the latest one would be completely enough. I also think that even relying on track-state, we should still have an expiration, something like 5 seconds, to be safe from possible mishaps in track-state and avoid having a dedicated cache flush command for this very technical internal thing.

bbatsov commented 4 months ago

That's a good suggestion; I also thought in the direction of track-state. Could you give a hint how that check would look like?

See cider-repl--state-handler. You can see the part there where we invalidate the namespace and the eldoc cache (which I've added a long time ago because I had observed that eldoc triggered too many requests, similar to completion).

(defun cider-repl--state-handler (response)
  "Handle server state contained in RESPONSE."
  (with-demoted-errors "Error in `cider-repl--state-handler': %s"
    (when (member "state" (nrepl-dict-get response "status"))
      (nrepl-dbind-response response (repl-type changed-namespaces session)
        (when (and repl-type
                   cider-repl-auto-detect-type
                   ;; tooling sessions always run on the JVM so they are not a valid criterion:
                   (not (equal session nrepl-tooling-session)))
          (cider-set-repl-type repl-type))
        (when (eq (cider-maybe-intern repl-type) 'cljs)
          (setq cider-repl-cljs-upgrade-pending nil))
        (unless (nrepl-dict-empty-p changed-namespaces)
          (setq cider-repl-ns-cache (nrepl-dict-merge cider-repl-ns-cache changed-namespaces))
          (let ((this-repl (current-buffer)))
            (dolist (b (buffer-list))
              (with-current-buffer b
                (when (or cider-mode (derived-mode-p 'cider-repl-mode))
                  ;; We only cider-refresh-dynamic-font-lock (and set `cider-eldoc-last-symbol')
                  ;; for Clojure buffers directly related to this repl
                  ;; (specifically, we omit 'friendly' sessions because a given buffer may be friendly to multiple repls,
                  ;;  so we don't want a buffer to mix up font locking rules from different repls).
                  ;; Note that `sesman--linked-sessions' only queries for the directly linked sessions.
                  ;; That has the additional advantage of running very/predictably fast, since it won't run our
                  ;; `cider--sesman-friendly-session-p' logic, which can be slow for its non-cached path.
                  (when (member this-repl (car (sesman--linked-sessions 'CIDER)))
                    ;; Metadata changed, so signatures may have changed too.
                    (setq cider-eldoc-last-symbol nil)
                    (when-let* ((ns-dict (or (nrepl-dict-get changed-namespaces (cider-current-ns))
                                             (let ((ns-dict (cider-resolve--get-in (cider-current-ns))))
                                               (when (seq-find (lambda (ns) (nrepl-dict-get changed-namespaces ns))
                                                               (nrepl-dict-get ns-dict "aliases"))
                                                 ns-dict)))))
                      (cider-refresh-dynamic-font-lock ns-dict))))))))))))

vemv commented 4 months ago

I think you can also trigger the invalidation when the track-state middleware reports changes. I think that's the safest way to do it, although it means it won't be triggered when cider-nrepl is not around.

It also means that typically it won't work when it's needed the most (most times a slow network connection means connecting to a production instance, and it's not generally recommended to run cider-nrepl there)

we should still have an expiration, something like 5 seconds

We could observe the host of the nrepl connection - if it is localhost, make the timeout tighter

bbatsov commented 4 months ago

I still think it's best to stick to track state as that's what we use for similar caches so far. It's always somewhat confusing when a project starts doing different things for similar purposes. Consistency helps with maintainability.

It also means that typically it won't work when it's needed the most (most times a slow network connection means connecting to a production instance, and it's not generally recommended to run cider-nrepl there)

Well, you'll also have problems with the namespace cache and the eldoc cache as well in this instance, so at least it's all the same for the user. :-) Perhaps there should be some generic fallback in the absence of track-state, but that's a different discussion IMO. (e.g. we can check if there are definitions in what's being passed to eval or implement something similar to track-state in nREPL itself)

vemv commented 4 months ago

Robe uses completion-table-with-cache which @alexander-yakushev mentioned in OP

https://github.com/dgutov/robe/blob/6bc8a07fc483407971de0966d367a11006b3ab80/robe.el#L949-L952

It might as well cut it / be idiomatic?

alexander-yakushev commented 4 months ago

Robe uses completion-table-with-cache which @alexander-yakushev mentioned in OP

That function caches only by prefix no other conditions, so it does not suit us.

alexander-yakushev commented 4 months ago

We could observe the host of the nrepl connection - if it is localhost, make the timeout tighter

That's a weak heuristic – the host is often still localhost because the remote REPL is exposed via SSH forwarding.

But honestly, I wouldn't bother. It's better to pick a single value for the expiration (e.g. 5 seconds) and add track-state to the mix, for cases where cider-nrepl is available and user is actively developing things. If track-state is unavailable, then the cache will simply expire in 5 seconds – sounds like a sufficient trade-off.

vemv commented 4 months ago

That function caches only by prefix no other conditions, so it does not suit us.

Regardless, it looks like a practical example of dynamic scope being useful.

Is it not feasible to make a completion-table-with-cache-like function and use a similar pattern?

(n.b. Robe and Company come from the same author, so I'd 100% expect this to be a good idea)

alexander-yakushev commented 4 months ago

Actually, maybe I'm wrong about the scope of reuse of cider-complete-at-point by Emacs completion API. Let me recheck; perhaps, this is all easier than I thought.

alexander-yakushev commented 4 months ago

@vemv Yes, I'm stupid, the fix is coming. Thanks!

bbatsov commented 4 months ago

I'd suggest to try this for about a day - remember that cider master changes can reach users very quickly.

FWIW - that's not really an issue in general as for over 10 years I've set the expectations that breakages might occur if people are tracking MELPA, but I've also promised them the adventurous users quick fixes. To quote the docs:

It’s important to note that MELPA packages are built automatically from the master branch, and that means you’ll be right on the leading edge of development. This has upsides and downsides; you’ll see new features first, but you might experience some bugs from time to time. Nevertheless, installing from MELPA is a reasonable way to obtain CIDER. The master branch is normally quite stable and serious regressions there are usually fixed quickly.

For me MELPA has always been another testing/feedback channel, so I rarely hesitate to release things there.

vemv commented 4 months ago

I'm aware of that, but I'd try to decouple "users can test this out" from "we have taken sufficient measures to believe this code is correct".

In other words, I try to not use that MELPA intricacy as an excuse to not try our own stuff.

Breaking users' workflow is never fun and not only affects them but also us (in form of support time).

In practice, I asked for a small delay - during which I can apply the same patch and use it for an entire day of work. Doesn't seem a huge ask.

bbatsov commented 4 months ago

In other words, I try to not use that MELPA intricacy as an excuse to not try our own stuff.

Well, it's always my expectation that the people filing PRs have actually tested them out. :-) I don't really have the time to test every submission we get myself and I've rarely done it.

In practice, I asked for a small delay - during which I can apply the same patch and use it for an entire day of work. Doesn't seem a huge ask.

I'm fine if you want to personally test it today, I'm just saying that I don't really view potential breakages on master as something very problematic and that MELPA is staging ground where some instability is to be expected. That being said, we can merge this tomorrow.

The PR itself seems good to me, sans my small remark about the usage of prog1.

vemv commented 4 months ago

Well, it's always my expectation that the people filing PRs have actually tested them out.

Same, but normally more time and more eyes only help.

Anyway. I've applied the patch and looked at the nrepl logs. Seems fine to me so far!

alexander-yakushev commented 4 months ago

~Commited final changes and tested that it works.~

clojure-emacs / cider

[completion] Cache last result of `cider-complete` #3655