Kungsgeten / org-brain

Org-mode wiki + concept-mapping
MIT License

caching keywords #350

Open swhalemwo opened 3 years ago

swhalemwo commented 3 years ago

Hi, thanks again for this project, it's been very helpful in managing my literature. However, as I've been adding more notes, I've noticed some slowdown. Digging into the code a bit, it seems to me that a substantial speed improvement can be gained by caching org-brain keywords. At the moment, org-element-map is used each time to get an entry's keywords. Especially if an entry links to many (hundreds of) other entries, getting all the keywords can take (me) several seconds. It is much faster (for me) to use org-element-map only once, caching the keywords the first time an entry's keywords are requested, and from then on to use the cache for quicker keyword access. I've hacked together an approach (on my fork) that works well for me (I've been using it for some time now); perhaps it can be integrated into org-brain in some form.

Cheers!

Kungsgeten commented 3 years ago

Seems like a good idea. I could probably do something similar to org-brain-headline-cache. I had a look at your fork, but couldn't quite understand it (not your fault, I'm just bad at reading lisp code).

swhalemwo commented 3 years ago

Hi, I'm not sure how my idea is similar to org-brain-headline-cache since I don't know that code too well (it didn't catch my attention when I was looking for places to shave off time, which probably means it's already fairly optimized), but I think it's similar in that it avoids repeatedly parsing the same file and hence improves speed when navigating org-brain and selecting entries for batch processing.

Specifically, I've renamed org-brain-keywords to org-brain-keywords-get so that no existing functions that call org-brain-keywords have to be changed. The new org-brain-keywords function checks if an entry is cached; if it is, it returns its (cached) keywords, otherwise it calls org-brain-keywords-get (the previous org-brain-keywords) to get the keywords (by parsing the entry file with org-element-map) and adds them to the cache (org-brain-keywords-cache is a simple alist with entries as keys and keywords as values; I guess it could be changed into e.g. a hash table like org-brain-headline-cache). The keywords of a cached entry are updated when its file is saved, by adding org-brain-keywords-cache-update to the after-save-hook (atm this assumes that there are no subdirectories in org-brain-path, which is the case for my setup but might need to be generalized). Hope that clarifies my code a bit :)
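
For readers who want to see the shape of the idea without reading the fork, here is a minimal, self-contained sketch of the pattern described above: parse once, serve from a cache afterwards, and refresh a cached entry when its file is saved. It is not the code from the fork or from org-brain. All `my/...` names are hypothetical, the cache is keyed by absolute file name rather than by org-brain entry, it uses a hash table instead of the alist mentioned above, and `my/keywords-parse` is only a stand-in for the slow, uncached lookup.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'org)
(require 'org-element)

;; Hypothetical cache: absolute file name -> alist of (KEY . VALUE).
(defvar my/keywords-cache (make-hash-table :test #'equal)
  "Cache of parsed keywords, keyed by absolute file name.")

(defun my/keywords-parse (file)
  "Parse FILE and return an alist of its #+KEYWORD: lines.
Stand-in for the slow, uncached keyword lookup."
  (with-temp-buffer
    (insert-file-contents file)
    ;; Parse in an Org buffer, but skip user mode hooks for speed.
    (delay-mode-hooks (org-mode))
    (org-element-map (org-element-parse-buffer) 'keyword
      (lambda (kw)
        (cons (org-element-property :key kw)
              (org-element-property :value kw))))))

(defun my/keywords (file)
  "Return the keywords of FILE, parsing only on a cache miss."
  (let* ((key (expand-file-name file))
         ;; Use a sentinel so a cached empty list still counts as a hit.
         (cached (gethash key my/keywords-cache 'miss)))
    (if (not (eq cached 'miss))
        cached
      ;; `puthash' returns the stored value.
      (puthash key (my/keywords-parse key) my/keywords-cache))))

(defun my/keywords-cache-update ()
  "Re-parse the current buffer's file if it is already cached."
  (let ((key (and buffer-file-name
                  (expand-file-name buffer-file-name))))
    (when (and key
               (not (eq (gethash key my/keywords-cache 'miss) 'miss)))
      (puthash key (my/keywords-parse key) my/keywords-cache))))

;; Keep cached entries fresh whenever a file is saved.
(add-hook 'after-save-hook #'my/keywords-cache-update)
```

Keying by absolute file name is one possible way around the no-subdirectories assumption mentioned above, though it would still need to be mapped back to org-brain entries in a real integration. Changes made to a file outside Emacs would not trigger the hook, so a manual `(clrhash my/keywords-cache)` may be needed in that case.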