Constantly growing backlog of `nrepl-pending-requests`

yuhan0 commented 2 years ago

Expected behavior

I'm not sure what this variable is meant to be, but it shouldn't be constantly growing in size and taking up memory.

Actual behavior

The buffer-local hashtable variable nrepl-pending-requests in the cider REPL buffer gets filled up with entries that never get cleared. Over the course of a long running session (multiple weeks) I saw it grow to tens of thousands of entries, which affected performance and crashed Emacs when I attempted to print it out.

Steps to reproduce the problem

Jack in to an empty repro project, and cider-load-buffer a few times on the following file, triggering errors and the cider stacktrace buffer to pop up.

(ns repro)
(/ 1 0)

M-x cider-switch-to-repl-buffer M-x describe-variable -> nrepl-pending-requests

Observe that the variable has accumulated entries for each time an error was produced.

#s(hash-table size 65 test equal rehash-size 1.5 rehash-threshold 0.8125 data
              ("6" cider--debug-response-handler "10"
               #[257 "\205\301\302!\207"
                     [cljr--debug-mode message "Artifact cache updated"]
                     3 "\n\n(fn _)"]
               "16"
               (closure
                ((causes . t)
                 cider-required-middleware-version t)
                (response)
                (setq causes
                      (cider--handle-stacktrace-response response causes)))
               "30"
               (closure
                ((causes . t)
                 cider-required-middleware-version t)
                (response)
                (setq causes
                      (cider--handle-stacktrace-response response causes)))
               "32"
               (closure
                ((causes . t)
                 cider-required-middleware-version t)
                (response)
                (setq causes
                      (cider--handle-stacktrace-response response causes)))))

This is not restricted to stacktrace errors, which I've just chosen as the easiest way to trigger the issue. Over time, this hashtable will be filled with all sorts of entries, eg. what appear to be completion requests and cider-eldoc messages, among other things.

I do not know the purpose or context behind the existence of this variable, but this comment suggests that there was a plan for its removal: https://github.com/clojure-emacs/cider/blob/e86f2f74f65eaa3aa2dd088ad2e0b1426a04c3e1/nrepl-client.el#L737

Environment & Version information

CIDER version information

;; CIDER 1.4.0 (Kyiv), nREPL 0.9.0
;; Clojure 1.10.1, Java 18.0.1

Lein / Clojure CLI version

Clojure CLI version 1.11.1.1113

Emacs version

28.0.91

Operating system

macOS 12

JDK distribution

openjdk version "13.0.2" 2020-01-14

bbatsov commented 2 years ago

The purpose of this variable was to keep track of evaluations that might still be producing some output even after they returned some result. Basically there was this problem that there evaluation handler would disappear and this would result in mistakes. I always knew this was an ugly hack and I recall I planned to replace it with some fallback handler, but I don't remember if I ever did it or not.

A lot time ago I planned to rewrite the whole evaluation logic and I expected this was going to go away in the improved version, sadly I never got to doing this. That's one of the tickets related to this for context https://github.com/clojure-emacs/cider/issues/1099

vemv commented 1 year ago

The purpose of this variable was to keep track of evaluations that might still be producing some output even after they returned some result.

I reckon that a given request is good to clear if one of its responses has returned a status?

bbatsov commented 1 year ago

@vemv You can have an eval return done but keep producing output. That's why this is harder to fix than it seems. I'm guessing we can trim the list at some point (e.g. once it reaches some size) and we can rely on the default handler in the absence of something specific to a particular ID. This would work fine in most cases.

vemv commented 1 year ago

(Thanks!)

A complementary strategy could be to use op allowlists/banlists.

The subset of ops that can eval (and therefore keep producing output) is fairly well-known. We also can reason about the cider-nrepl ops that are not expected to eval code or emit unexpected output.

For those, we can dissoc them on status safely.

vemv commented 1 year ago

@bbatsov could nREPL ops describe themselves with extra metadata?

e.g. they could return :does-eval <boolean>. We would update all ops nrepl / cider-nrepl side, then the cider side would be generic.

bbatsov commented 1 year ago

I think list tracks only eval requests, so I'm not sure what exactly are you proposing here. eval is the only problematic request that I can think of.

bbatsov commented 1 year ago

(I agreed to suppress the error with the broken REPL history simply because this was breaking the REPL init process, which I consider a special case)

vemv commented 1 year ago

I think list tracks only eval requests, so I'm not sure what exactly are you proposing here. eval is the only problematic request that I can think of.

From a quick observation, it tracks a bit of everything.

Technically, a variety of ops can eval, e.g. load-file. (To be fair, it's bad practice to have any sort of compile-time side-effects that would emit something to stdout. But it happens!)

bbatsov commented 1 year ago

Ah, yeah - I had forgotten about load-file. Yeah, it suffers from the same problems as eval.

clojure-emacs / cider