skeeto / elfeed

An Emacs web feeds client
The Unlicense

Don't look for empty content in the compressed archive #525

Open bram85 opened 3 months ago

bram85 commented 3 months ago

Some feeds never include any content in their entries, so every entry from such a feed gets the same SHA1 reference, da39a3ee5e6b4b0d3255bfef95601890afd80709, which is the hash of the empty string.
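As a quick check, the hash above can be reproduced with the built-in `secure-hash` function. This is only a sketch to confirm the constant; the variable name `demo-empty-sha1` is made up for illustration and is not part of elfeed.

```elisp
;; SHA-1 digest of the empty string, computed with Emacs' built-in
;; `secure-hash'.  `demo-empty-sha1' is a hypothetical name used only here.
(defconst demo-empty-sha1 (secure-hash 'sha1 "")
  "Hex-encoded SHA-1 of the empty string.")

demo-empty-sha1
;; => "da39a3ee5e6b4b0d3255bfef95601890afd80709"
```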

elfeed-deref first looks up this hash in the compressed archive, will likely find it, and then decompresses the archive only to return an empty string, because (car index) == (cdr index) for that entry. That is a lot of overhead just to return an empty string. I propose making elfeed-deref return early whenever it is given a hash that represents the empty string.

For what it's worth, I worked around it by advising elfeed-deref with the following code:

    (defun bram85-elfeed-deref-optimization (f &rest args)
      "Optimize for empty entry content.

    This is an advice meant for `elfeed-deref' (argument F) with a
    reference in ARGS.  This advice intercepts a reference to empty
    content and immediately returns the empty string.  `elfeed-deref'
    would likely find the empty content inside archive.gz, decompress
    it and only then return an empty string.

    Some feeds structurally have empty content, inducing a lot of
    overhead otherwise."
      (let ((sha1-empty-string "da39a3ee5e6b4b0d3255bfef95601890afd80709")
            (ref (car args)))
        ;; Only read the ID when REF really is an `elfeed-ref' struct;
        ;; any other argument is handed to the original function as-is.
        (if (and (elfeed-ref-p ref)
                 (equal (elfeed-ref-id ref) sha1-empty-string))
            ""
          (apply f args))))

    (advice-add 'elfeed-deref :around #'bram85-elfeed-deref-optimization)
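To see the short-circuit in isolation, without elfeed loaded, the same around-advice pattern can be tried on a stand-in function. Everything prefixed with `demo-` below is hypothetical and exists only for this sketch; the call counter is there to show that the original function is skipped for the empty-string hash.

```elisp
;; Hypothetical stand-in for an expensive lookup; not part of elfeed.
(defvar demo-lookup-calls 0
  "Number of times the body of `demo-lookup' actually ran.")

(defun demo-lookup (id)
  "Pretend to fetch the content stored under ID from a slow store."
  (setq demo-lookup-calls (1+ demo-lookup-calls))
  (format "content for %s" id))

(defconst demo-empty-hash (secure-hash 'sha1 "")
  "SHA-1 of the empty string, computed instead of hardcoded.")

(defun demo-lookup-skip-empty (orig id)
  "Around advice: return \"\" for the empty hash, else call ORIG with ID."
  (if (equal id demo-empty-hash)
      ""
    (funcall orig id)))

(advice-add 'demo-lookup :around #'demo-lookup-skip-empty)

(demo-lookup demo-empty-hash) ; returns "" without running the slow path
(demo-lookup "abc")           ; falls through to the original function
```

After these two calls `demo-lookup-calls` is 1, since only the second call reached the original function. The advice can be taken off again with `(advice-remove 'demo-lookup #'demo-lookup-skip-empty)`; the same call shape applies to the advice on elfeed-deref.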
