mmontone / emacs-inspector

Inspection tool for Emacs Lisp objects.
GNU General Public License v3.0
108 stars 9 forks source link

`print-circle` is not used, so circular structures cause enormous memory allocations, and general performance issues #30

Open alphapapa opened 1 year ago

alphapapa commented 1 year ago

Hi,

I've found that, since print-circle isn't used by inspector, inspecting objects that contain circular or repeated references can cause enormous amounts of memory to be allocated, sometimes to the point of exhausting all available virtual memory and invoking the OOM killer on Emacs.

For example, when inspecting the ement-syncs variable from my Ement.el Matrix client, which points to the ement-session struct for the session being synced, when the session had been connected for hours and had a decent amount of data stored in it, attempting to inspect it caused Emacs to use up all my RAM and swap and invoke the OOM killer.

After restarting Emacs, I tried binding print-circle like this:

(defun inspector--princ-to-string (object)
  "Print OBJECT to string using `princ'."
  (with-output-to-string
    (let ((print-circle t))
      (princ object))))

And when I inspected ement-syncs again, it seemed to perform better: it still allocated over 3 GB of memory, but that was a far cry from the ~32 GB it had before (although the Ement session was just-connected, so it didn't have as much data in it, so it's hard to compare the two attempts exactly).

More generally, it seems that the performance of the inspector doesn't quite make sense, given how the objects being printed are so shortly truncated. For example, inspecting the list of rooms in the ement-session object, I get a buffer showing 100 room structs, with the printed representation of each being truncated to 80 characters, yet Emacs takes over 30 seconds to prepare it and allocates 6.5 GB of RAM in the process.

I'm guessing that the problem is that each object is printed to a string and then the string is truncated; this must result in heavy consing and GC'ing, only to show 80 characters of text for an object potentially hundreds of MB in its full printed representation.

Unfortunately, this makes the inspector nearly useless for what ought to be its "star moment", inspecting objects that are too large to navigate easily by other means.

I have some vague ideas about how to improve on these issues, but I don't know how feasible they are, and I don't expect to have time to work on them anytime soon. But at least this issue will serve to note the problems, and maybe we can figure out good solutions. Emacs sorely needs a tool like this, but it needs to be able to handle large objects well.

Thanks for your work on this tool.

mmontone commented 1 year ago

Ok. Thanks for reporting. It is good to know this is an issue, and your explanation is very useful.

I'll have a look and see what can be done, and probably apply your patch in the meantime.

mmontone commented 1 year ago

30 seconds to prepare it and allocates 6.5 GB of RAM in the process.

wow, that's insane

mmontone commented 1 year ago

My tests didn't involve huge data sizes . If you have ideas on some big dataset I can use to test, that's welcomed too.

mmontone commented 1 year ago

I'm releasing a version with your patch, as it helps at least with this issue.

mmontone commented 1 year ago

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

alphapapa commented 1 year ago

My tests didn't involve huge data sizes . If you have ideas on some big dataset I can use to test, that's welcomed too.

Well, as I said, I discovered it when trying to inspect Ement's ement-session structure on a connected session. If you're willing to register a free account on matrix.org, you could easily try it for yourself. I can even help you join a few rooms and exchange a few messages to start.

I'm releasing a version with your patch, as it helps at least with this issue.

Thanks. Note, though, that I don't know how enabling print-circle will affect the results. That is, if a value is replaced with a circular reference, it would probably be helpful for that reference to be a link to the actual value.

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

Yes, I'm guessing that would lead to excessive GC. A simple and common mitigation technique is to use a single temp buffer for such a function and erase it before each use.

mmontone commented 1 year ago

inspector--print-truncated uses with-temp-buffer and I suspect that must be one of the possible causes of the problem.

Yes, I'm guessing that would lead to excessive GC. A simple and common mitigation technique is to use a single temp buffer for such a function and erase it before each use.

I'm going to try that.

mmontone commented 1 year ago

I've pushed an implementation. It feels fast, but I haven't measured with big data yet.

mmontone commented 1 year ago

I'm releasing a version with your patch, as it helps at least with this issue.

Thanks. Note, though, that I don't know how enabling print-circle will affect the results. That is, if a value is replaced with a circular reference, it would probably be helpful for that reference to be a link to the actual value.

It could be interesting to try a custom implementation of printing of circular structures for the inspector.