Closed tgbugs closed 2 years ago
Does your "Racket mode logger" buffer get substantially large?
Does your "Racket mode logger" buffer get substantially large?
No. There are very few messages in that buffer.
What is your gc-cons-threshold
?
80,000,000
but I tested with the default 800,000
and the behavior is the same.
Thanks for the report!
I'm only just starting to look into this, so apologies some of these might be dumb questions, but:
ps
or top
or htop
or similar? Which of the several memory-related stats (e.g. "VIRT" vs. "RES" vs. "SHR")?emacs
process (as opposed to the racket
process started by Racket Mode to run its back end server)?racket-xp-mode
(which does "check-syntax" based features)? If so, does disabling that change the situation?raco make
first seems to be a necessary ingredient? If so, that surprises me, and therefore potentially interesting.htop
RES
, and then memory-report
and ap/garbage-collect
https://www.reddit.com/r/emacs/comments/ck4zb3/trying_to_understand_emacs_memory_usage_and/evji1n7/.emacs
process.racket-xp-mode
, I need to double check, my recollection is that it did not make a difference, but that may have been only for whether disabling it reduced existing memory usage. Will report back on this one.raco make
first is necessary to trigger the issue, simply modifying the file and calling racket-run-module-at-point
does not seem to be sufficient.raco make
may not be the trigger.With respect to racket-xp-mode
the results are nuanced. When racket-xp-mode
is disabled I do not see lingering allocations of 10s of megabytes, however I do still see smaller lingering allocations on the order of 1mb per raco make
cycle. So I guess racket-xp-mode
causes the accumulation to start affecting gc pauses faster to the point where it is noticable to the user more quickly?
Ok, yet another report back ... so this time I have been able to produce the issue without running raco make
, everything the same but dropping raco make
. I may have missed that when I tried it last night.
It seems that I can produce by simply calling racket-run-module-at-point
over and over again in the same buffer, but sometimes the jump in memory usage isn't as large, only a couple of MB, instead of 10s.
Here are two plots showing slightly different behavior. In the first case I think that racket-xp-mode
only ran once. In the second case I think it ran over the buffer multiple times because I made changes to the buffer and because I could see the cpu usage spike. Only after a spike would I see one of the big jumps.
OK I'll see if I can reproduce using the file and repos you mentioned. Probably tomorrow.
To clarify, when I racket-run-module-at-point
, should I do so with point in the main
submodule, or the outermost file module? I'm guessing main
; maybe it doesn't even matter which module is run, but I wanted to ask.
I've done most of my testing with main
but the behavior also shows up when running in the outermost file module. I though it might be related to running main
directly, but I don't think it is after a couple rounds of checking. I will note that the 2nd graph above was produced by running main
.
So, good news and bad news. This behavior does not show up when running with a minimal config. However, when looking at the memory profile, it seems fairly clear that that is because the memory is being correctly gced. I am going to hunt through my config to see what might be causing this difference, because by default the memory is being collected correctly.
edit: gc-cons-threshold
does seem to be involved somehow as @capfredf suggested. At the default level we get the graph above, at the higher level, we get the graph(s) below (ignore the names, they were hypotheses that mostly turned out to be false), and calling garbage-collect
never manages to collect the alloctions. The only problem with the graph below is that I wasn't able to induce the growing memory usage despite trying. Looks like many factors are involved here. I'll keep looking.
And one last time for the bad news. Here is the plot with gc-cons-threshold
at the default 80k level with the rest of my full config. Definitely not gc-cons-threshold
in that case. Back to digging, now that I have some harderish data.
Argh. It looks like this is an emacs-29 bug. Here are the plots from the same minimal config. I'm confirming that it appears in an unmodified master branch. Looks like it is time for yet another bisect. I'm guessing this is not a racket-mode bug, but I'm going to leave this open for a bit longer until I can confirm.
emacs-28
emacs-29
@tgbugs - Sorry for being lazy and off topic - what is the code you are using to profuce the graphs both the emacs code to generate the data and the code to graph it.
@bestlem I'm using a python package called memory-profiler https://github.com/pythonprofilers/memory_profiler, it measures from outside emacs, and I run it with the following args.
mprof run ~/git/NOFORK/emacs/src/emacs -Q -batch -l emacs-memory-leak-test.el
mprof plot --backend TkAgg --title "emacs racket-mode ${commit}" --output ${fn}
Just to confirm: I set up all the "artifacts" you listed -- raco pkg install gui-widget-mixins
, and cloned the racket-breadcrumb
and racket-json-view
repos and installed those as local package links.
With Emacs 25.2, and racket-xp-mode
enabled, I then tried editing and running the viewer.rkt
file multiple times, and watching RES
in htop. It increased then stayed stable and level for me, with Emacs 25.2.
With Emacs which I last built from source at commit 6a4444aa390f20c195f864a7c1b8114d82ffdeb5 around June 7, RES
keeps climbing.
racket-xp-mode
adds a potentially large number of text properties to the buffer -- basically one for every span of interest reported by Check Syntax from the back end.
Furthermore the communication of the results from the Racket back end to Emacs is via writing a string that Emacs read
s as an s-expression. That list of annotation data will represent a fair number of cons
allocations. Even before getting to the point of inserting that information into the buffer as text properties.
Although I wouldn't necessarily describe it as an extreme stress test, it does exercise Emacs' allocation behaviors.
Although I don't think it's wrong to handle it this way, I can imagine handling it other ways. For example the back end could hold the annotations, and the front end could query it as point moves. This would reduce memory use by Emacs. But it would increase sustained memory use by the back end process, i.e. mostly just shift where it lives. Such an approach might feel more sluggish.
The above assumes that racket-xp-mode
is the main factor here. Possibly need to test more on both versions of Emacs with racket-xp-mode
disabled.
Bisect shows that the new reader is the issue.
https://github.com/emacs-mirror/emacs/commit/b903507b36c438653a02d7b6291e9744d5221e28
Ironically I had actually seen that commit before.
Last good
First bad
Depending on the nature of the read
change/bug, this probably explains why using racket-xp-mode
causes memory use to climb even more -- but you still see it climb somewhat, when using Racket Mode generally even w/o racket-xp-mode
.
In Racket Mode, the "wire format" -- for how the Emacs front end talks to the Racket back end -- is s-expressions. So for every "command" to the back end, there will eventually be an Emacs read
of the command response.
With racket-xp-mode
, the size of the responses will be larger than for most other commands (a list of potentially many dozens of annotations). Also, edits to the buffer result in an after-change-hook
that sends a "check-syntax" command; although these are "de-bounced" for 1 second idle time, the frequency of the commands can be relatively high when heavily editing a buffer.
TL;DR: Racket Mode is a good way to flush out problems with Emacs read
, and racket-xp-mode
is even better.
In case it helps to create a more-minimal example for https://debbugs.gnu.org/cgi/bugreport.cgi?bug=56623:
Using your viewer.rkt
example, I grabbed a "snapshot" of the kind of check-syntax response that Emacs is read
-ing: issue-632-read-data.el.
I had thought this could be used with some simple Emacs Lisp to be run using --batch -Q
, like the following:
(pp (garbage-collect))
(dotimes (_ 5)
(with-temp-buffer
(insert-file-contents "issue-632-read-data.el")
(read (current-buffer))))
(pp (garbage-collect))
But in a very quick try that seems to produce similar results for both Emacs 25.2 and https://github.com/emacs-mirror/emacs/commit/b903507b36c438653a02d7b6291e9744d5221e28. So that's not quite it, yet. But meanwhile I wanted to go ahead and share the example data.
This minimal(ish) example might be closer??
I included more of what racket-xp-mode
actually does (although I'm not sure these are necessary elements):
read
string values in a buffer.Also this prints the increase over initial memory-limit
(which isn't the same thing as htop
RES
but it's... something to measure).
;; issue-632-test-driver.el
(defvar buf (get-buffer-create "foo"))
(with-current-buffer buf
(insert (make-string 65536 32)))
(message "created initial buffer")
(defvar initial-memory-limit (memory-limit))
(message "Watch RES in htop")
(message "will print memory-limit here")
(run-with-timer
5
1
(lambda ()
(pcase (with-temp-buffer
(insert-file-contents "issue-632-read-data.el")
(read (current-buffer)))
(`(check-syntax-ok
(completions . ,completions)
(imenu . ,imenu)
(annotations . ,annotations))
(with-current-buffer buf
(remove-text-properties (point-min) (point-max) (list 'help-echo nil))
(dolist (x annotations)
(pcase x
(`(info ,beg ,end ,str)
(put-text-property beg end 'help-echo str)))))
(message "%s" (- (memory-limit) initial-memory-limit))))))
(sleep-for 60)
(y-or-n-p "exit?")
Results:
Emacs 25.2 increases, then decreases somewhat, then increases.
$ emacs --batch -Q -l issue-632-driver.el created initial buffer Watch RES in htop will print memory-limit here 2012 2012 2012 2588 3212 3880 4420 4420 4420 4420 4904 5424 5944 6508 6988 7632 8160 8764 8760 8760 8760 8760 8760 8760 8760 8760 8760 9244 9728 9728 9728 9724 9724 9724 9724 9724 9724 9724 9724 9200 9200 9200 9200 9200 9200 9200 9200 9200 9200 9200 9776 9776 9776 9772 9772 exit? (y or n) y
Emacs 29.0.50 increases, and sometimes levels off, but never decreases:
greg@x1c:~$ ~/emacs/emacs-29.0.50/src/emacs --batch -Q -l issue-632-driver.el created initial buffer Watch RES in htop will print memory-limit here 1748 3644 5548 6436 7388 8360 9332 10200 10336 10932 11476 12312 13300 14016 14968 15856 16740 17612 18464 18464 18464 18464 18464 18464 18464 18464 18464 19060 19876 20832 20832 20832 20832 20832 20832 20832 20832 20832 21240 21648 22192 22192 22192 22192 22192 22192 22192 22192 22192 22192 22192 22328 22328 22328 22328 exit? (y or n) y
Actually, no. That is mainly measuring the effect of setting text properties in the buffer without using with-silent-modifications
. Therefore the measurements are dominated by text property allocation "noise".
With something like this, which does the normal thing like racket-xp-mode
does, and sets text properties inside with-silent-modifications
:
;; Create a buffer of spaces, that's long enough that we can set
;; properties from the example data.el.
(defvar buf (get-buffer-create "foo"))
(with-current-buffer buf
(insert (make-string 65536 32)))
(message "created initial buffer")
(defvar initial-memory-limit (memory-limit))
(message "Watch RES in htop")
(message "will print memory-limit here")
(run-with-timer
3 ;initial delay to give htop time to refresh and show us
0.25
(lambda ()
(pcase (with-temp-buffer
(insert-file-contents "/var/tmp/issue-632-read-data.el")
(read (current-buffer)))
(`(check-syntax-ok
(completions . ,completions)
(imenu . ,imenu)
(annotations . ,annotations))
(with-current-buffer buf
(with-silent-modifications ;; <=======
(remove-text-properties (point-min) (point-max) (list 'help-echo nil))
(dolist (x annotations)
(pcase x
(`(info ,beg ,end ,str)
(put-text-property beg end 'help-echo str))))))
(message "%s" (- (memory-limit) initial-memory-limit))))))
(sleep-for 60)
(kill-buffer buf)
(message "After killing buffer: %s" (- (memory-limit) initial-memory-limit))
It's focused (more) on the contribution of read
, as well as using some of the values in the buffer as text property values -- but not allocations related to buffer-modifications.
This gives results for Emacs 25.2 like simply printing a stable 1820
forever.
Whereas for Emacs 29.0.50, it does a climb a bit but soon stabilizes on the not-much-higher value of 6520
thereafter.
$ ~/emacs/emacs-28.0.50/src/emacs --batch -Q -l /var/tmp/issue-632-driver.el created initial buffer Watch RES in htop will print memory-limit here 1608 3452 4884 5432 5976 6112 6248 6248 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520 6520
Yes, that's different. Yes it's slightly "worse". But it hardly seems to capture, in a minimal example, the magnitude or flavor of the original report.
So I think for now I remain stumped how to boil this down to something related to read
, despite the bisect pointing at the substantial changes in that commit.
I think that the accumulating behavior is probably sufficient. I saw the same thing with my graphs for the automated repro.
The way I was getting the staircase was to manually modify the file, save it, watch my cpu as racket-xp-mode ran in the background, wait for the cpu to go back down, and then call racket-run-module-at-point
.
Will report back after more investigations, probably later in the week.
As I just emailed to the Emacs bug tracker, I think the problem here is that the new read
leaks when it errors. A minimal example:
;; Run this with e.g. emacs --batch -Q -l example.el
(message "Run htop and observe the RES value")
(dotimes (n 512)
(message "%s" n)
(with-temp-buffer
;; Insert an unclosed sexp
(insert "(")
;; Make it large, with quite a few sub-expressions to read
(dotimes (_ 8192)
(insert (format "%S" '(foo bar baz "a somewhat long string blah blah blah blah blah\n"))))
;; Go to start and attempt a read. This will fail due to the
;; unclosed paren, so we use `ignore-errors'.
(goto-char (point-min))
(ignore-errors (read (current-buffer)))))
;; With the old recursive lread.c, the RES value will oscillate around
;; a stable low value (presumably as GC happens).
;; With the new non-recursive lread.c, the RES value will climb
;; unbounded into hundreds of megabytes.
I think that needs to be fixed in Emacs.
Meanwhile I do have a commit that should avoid this, as well as (in any version of Emacs) be more efficient in the case where a large command response sexpr from the back end is not yet completely received.
EDIT: The commit will avoid the memory leak in read
when it errors because the sexp is incomplete (e.g. no matching closing paren) -- which is the common case, especially with racket-xp-mode
command responses. It won't avoid the leak when read
errors for some other reason (e.g. an unreadable symbol) -- but (a) that's extremely uncommon AFAIK, because the back end tries hard not to produce sexprs like that, and (b) that's why read
should still be fixed generally in Emacs.
Although I haven't yet fetched/rebuilt/verified it, https://github.com/emacs-mirror/emacs/commit/c0bb1aac102f8727484763370b154f635a6a1fd1 is intended to fix the memory leak.
I think the commit on my end is probably still worthwhile, and a silver lining from all this. When a large, incomplete s-expression is pending in the buffer that accepts command responses from the back end, it is a shame to attempt read
all its sub-expressions, only to fail and throw it away. Even when thrown away correctly (no memory leak), it's inefficient. It is probably a good idea to check first that a top-level complete list s-expression exists, using the relatively quick (IIUC) scan-lists
, before attempting to read
it.
Huge thanks @tgbugs for all the work to bisect and to collect and present timing data!
My Emacs has been on a relatively new version that includes the fix for a while. I haven't seen it take over 1GB of memory anymore.
Over the past year or so I have been occasionally encountering sporadic high memory usage in Emacs. I think I have finally tracked it down to running
racket-run-module-at-point
immediately after runningraco make
on the module/file.The behavior I observe is that when I modify a file, save it, and run raco make, and then call
racket-run-module-at-point
on it, I will see somewhere between 20MB and 60MB of additional memory being used between conses and strings. This will happen every time I raco make and rrmat, and that memory never gets collected, seemingly no matter what I do, including cleaning up all the racket mode processes and buffers (including hidden).I don't have a minimal repro yet, but I wanted to get this report in so that anyone else encountering it might see it.
A non-minimal repro requires editing this file https://github.com/SciCrunch/sparc-curation/blob/master/sparcur_internal/sparcur/viewer.rkt as described above. The setup is a bit convoluted because you need to manually install https://github.com/tgbugs/racket-json-view and https://github.com/tgbugs/racket-breadcrumb.