l3kn / org-el-cache

Persistent cache for data derived from org-elements
GNU General Public License v3.0

#+TITLE: Org Element Cache

Org-mode already includes a cache for parsed org-element objects, which is disabled by default.

However, when working with large numbers (> 1000) of files, populating and processing this cache for each file takes a long time.

This package tries to alleviate this problem by:

  1. Allowing users to register hooks that compute some data from an org element
  2. Persisting the results on disk
  3. Taking care of keeping the cached data in sync with the actual buffer contents

In the current version, only file-level hooks are implemented. Future versions might include a way to register hooks for element types.

A cache consists of:

  1. A hash table with a plist for each file managed by the cache
  2. A list of folders used to find files managed by the cache
  3. A list of hooks to generate cached data
  4. A file to persist the cache in

** Populating the Cache

A cache is populated with entries by opening each file in a temporary buffer, parsing it to an ~org-element~ and passing this element to each of the hooks.

Doing this for the first time can take a few seconds, depending on the number of files in the cache.
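The population step for a single file could look roughly like the sketch below. It is not the package's actual implementation: ~my/compute-cache-entry~ is a hypothetical helper, and representing the hooks as an alist of =(PROPERTY . FUNCTION)= pairs is an assumption made for illustration.

#+begin_src emacs-lisp
(require 'org)
(require 'org-element)

;; Hypothetical helper, not part of org-el-cache.
(defun my/compute-cache-entry (file hooks)
  "Return a plist of cached data for FILE.
HOOKS is assumed to be an alist of (PROPERTY . FUNCTION); each
FUNCTION is called with the file name and the parsed element."
  (with-temp-buffer
    (insert-file-contents file)
    ;; Enable org-mode for parsing, without running user mode hooks.
    (delay-mode-hooks (org-mode))
    (let ((element (org-element-parse-buffer))
          (entry nil))
      ;; Store the result of each hook under its property name.
      (dolist (hook hooks entry)
        (setq entry (plist-put entry (car hook)
                               (funcall (cdr hook) file element)))))))
#+end_src

The file is parsed once and the resulting element is shared by all hooks, which is why adding more hooks should cost much less than parsing the file again.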

Once the cache has been saved to disk for the first time, a combination of the ~find~ and ~sha1sum~ shell commands is used to determine which entries need to be updated.

Assuming you're only using one Emacs instance at a time, updating the cache on startup should take only a few milliseconds.
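As a rough illustration of the update check, the sketch below collects file hashes with ~find~, ~xargs~ and ~sha1sum~ and compares them with previously stored ones. Both helpers are hypothetical and not taken from the package; the sketch assumes the GNU versions of these commands (the ~-print0~, ~-0~ and ~-r~ options) and a hash table with an ~equal~ test that maps file names to SHA-1 strings.

#+begin_src emacs-lisp
;; Hypothetical helpers, not part of org-el-cache.
(defun my/org-file-hashes (directory)
  "Return an alist of (FILE . SHA1) for all .org files below DIRECTORY.
Assumes GNU `find', `xargs' and `sha1sum' are on the PATH."
  (let* ((dir (shell-quote-argument (expand-file-name directory)))
         (output (shell-command-to-string
                  (format "find %s -type f -name '*.org' -print0 | xargs -0 -r sha1sum"
                          dir)))
         (result nil))
    (dolist (line (split-string output "\n" t) (nreverse result))
      ;; Each useful line looks like "<40 hex digits>  <file name>".
      (when (string-match "\\`\\([0-9a-f]\\{40\\}\\) [ *]\\(.+\\)\\'" line)
        (push (cons (match-string 2 line) (match-string 1 line)) result)))))

(defun my/org-files-needing-update (directory known-hashes)
  "Return files below DIRECTORY that are new or changed.
KNOWN-HASHES is a hash table (test `equal') from file name to SHA-1."
  (let ((stale nil))
    (dolist (entry (my/org-file-hashes directory) (nreverse stale))
      (unless (equal (cdr entry) (gethash (car entry) known-hashes))
        (push (car entry) stale)))))
#+end_src

Only the files returned by the second function would need to be re-parsed. Detecting deleted files would require an additional pass over the stored entries, which is left out here.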

#+begin_src emacs-lisp
(def-org-el-cache my-cache  ;; name of the cache / variable to store it in
  (list "~/org")            ;; directories managed by this cache
  "~/org/.cache.el")        ;; file to persist the cache in
#+end_src

** Adding Hooks

Hooks can be added to a cache with the ~org-el-cache-add-hook~ function.

#+begin_src emacs-lisp
(org-el-cache-add-hook
 my-cache      ;; Cache to add the hook to
 :my-property  ;; Property name to use for this hook
 (lambda (filename element)
   ;; Do something with the element and return some value
   ))
#+end_src
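As a more concrete example, a hook could collect the titles of all headlines in a file. The sketch below assumes that the element passed to the hook is the parse tree produced by ~org-element-parse-buffer~; the function name ~my/headline-titles~ and the property name ~:headlines~ are made up for illustration.

#+begin_src emacs-lisp
(require 'org)
(require 'org-element)

;; Hypothetical hook function, not part of org-el-cache.
(defun my/headline-titles (_filename element)
  "Return the raw titles of all headlines in ELEMENT."
  (org-element-map element 'headline
    (lambda (headline)
      (org-element-property :raw-value headline))))

;; Registration, assuming `my-cache' was defined with
;; `def-org-el-cache' and that a named function is accepted
;; in place of a lambda:
;; (org-el-cache-add-hook my-cache :headlines #'my/headline-titles)
#+end_src

Returning plain strings and lists keeps the hook's result easy to write to disk and read back.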

** Updating Caches

If you changed the definition of a hook or added a new one, use ~org-el-cache-force-update~ to re-initialize the cache.

** Accessing Cached Data

The following functions work on all entries of a cache. For more information on them, refer to the functions' documentation (e.g. ~C-h f org-el-cache-map~).

Usually, there is no need to manually persist or load a cache. If you want to do so anyway, you can use the following functions:

For more information, you can check out the integration tests in [[file:org-el-cache-test.el]].

Because cached data is persisted to disk, there is no (elegant) way to cache markers in files, e.g. when using cached headline data for org-agenda views.

A possible workaround would be attaching IDs to every headline, then using this ID (instead of a marker) to jump to a headline e.g. to change its TODO state.
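A minimal sketch of this workaround is shown below, assuming that every headline of interest already has an =ID= property (for example created with ~org-id-get-create~). Both function names are hypothetical; ~org-id-find-id-in-file~ and ~org-todo~ are standard Org functions.

#+begin_src emacs-lisp
(require 'org)
(require 'org-id)
(require 'org-element)

;; Hypothetical hook function: cache (ID . TITLE) pairs instead of markers.
(defun my/headline-ids (_filename element)
  "Return (ID . TITLE) for every headline in ELEMENT that has an ID."
  (org-element-map element 'headline
    (lambda (headline)
      (let ((id (org-element-property :ID headline)))
        (when id
          (cons id (org-element-property :raw-value headline)))))))

;; Hypothetical helper: act on a headline located via its cached ID.
(defun my/set-todo-state-by-id (file id state)
  "In FILE, set the TODO keyword of the headline with ID to STATE."
  (let ((marker (org-id-find-id-in-file id file 'markerp)))
    (unless marker
      (error "No headline with ID %s in %s" id file))
    (with-current-buffer (marker-buffer marker)
      (save-excursion
        (goto-char marker)
        (org-todo state)))
    ;; Release the temporary marker once we are done with it.
    (set-marker marker nil)))
#+end_src

The second function leaves the visited buffer modified but unsaved, so saving it is up to the caller.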

The package relies on the following shell commands:

  1. ~find~
  2. ~xargs~
  3. ~sha1sum~

These should be installed by default on most Linux / Unix distros.

When updating a single file, ~(buffer-hash)~ is fast enough.

To speed up recursively searching directories for =.org= files and calculating their hashes, the ~find~ and ~sha1sum~ shell commands are used instead of ~directory-files-recursively~ and ~buffer-hash~.
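For comparison, the pure Emacs Lisp variant that the shell commands replace could be sketched as follows. The helper name is hypothetical, and the sketch is only meant to show which built-in functions are involved, not how the package is written.

#+begin_src emacs-lisp
;; Hypothetical helper, not part of org-el-cache.
(defun my/org-file-hashes-elisp (directory)
  "Return an alist of (FILE . HASH) for all .org files below DIRECTORY.
Uses only built-in functions instead of shell commands."
  (mapcar (lambda (file)
            (cons file
                  ;; Read the file into a temporary buffer and hash it.
                  (with-temp-buffer
                    (insert-file-contents file)
                    (buffer-hash))))
          (directory-files-recursively directory "\\.org\\'")))
#+end_src

Every file is read into an Emacs buffer here, which is likely where most of the extra time goes when there are many files.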

Updating the cache once it has been initialized / loaded from disk takes around 200ms.

In the long term, it would be nice to reuse as much of the existing code as possible and figure out where the bugs are.