alphapapa / ement.el

A Matrix client for GNU Emacs
GNU General Public License v3.0
476 stars 44 forks source link

Preventing `savehist` from serializing large data structures which are arguments in `command-history` #216

Closed alphapapa closed 9 months ago

alphapapa commented 9 months ago

@phil-s discovered that savehist serializes command-history, which includes the arguments to interactive commands. So when it contains some Ement commands like ement-connect which receive an ement-session struct as an argument--a structure that encompasses all of a session's rooms, events, etc--it can become very large. And savehist-printable is extremely inefficient when writing hash tables, which slows down the process even further. And the session struct should never be serialized! (It was already tried in the past, and while it might be done in the future, it would be done carefully with regard to how much and which data is saved, and would ensure that everything still works after reading it back in.)

Lacking a good solution to this problem, it seems the best we can do is a workaround like this:

(with-eval-after-load 'savehist
  (defun ement--savehist-save-hook ()
    "Remove all `ement-' commands from `command-history'.
Because when `savehist' saves `command-history', it includes the
interactive arguments passed to the command, which in our case
includes large data structures that should never be persisted!"
    (setf command-history
          (cl-remove-if (pcase-lambda (`(,command . ,_))
                          (string-match-p (rx bos "ement-") (symbol-name command)))
                        command-history)))
  (cl-pushnew 'ement--savehist-save-hook savehist-save-hook))

I don't know if other variables might also cause a problem in this regard. For example, if a string with a text property pointing to one of our structs were to end up in minibuffer-history... ISTM that text properties should not be serialized by savehist for such variables, on principle.

There are some bugs on the tracker related to savehist and large data structures, but not specifically due to arguments in command-history.

Anyway, this finally explains the "pauses" that some users--only those using savehist--have noticed. When this manifests, savehist may write the file often (like every minute, and on Emacs exit), and end up writing hundreds of MB to a buffer, then reading it back in, just to test whether the value is readable, and then writing it out to disk--only to be uselessly read back in when Emacs starts again. And, if it does successfully read it back in--which I'm not sure about--who knows what chaos might ensue if ement-connect received a read-back-in ement-session struct as its argument!

The only other workaround I can think of would be to only pass, e.g. strings as arguments to our interactive commands, which would mean a lot of extra, ugly code having to look up objects in maps by key. And I don't consider that acceptable at all.

Well, hopefully we can have this workaround present in v0.12. Just need some users who have actually experienced this problem to report whether it solves it...

P.S. According to this, it already strips text properties from strings.

alphapapa commented 9 months ago

See also: