emacs-citar / citar

Emacs package to quickly find and act on bibliographic references, and edit org, markdown, and latex academic documents.
GNU General Public License v3.0

add caching of bib candidates #69

Closed bdarcus closed 3 years ago

bdarcus commented 3 years ago

Thanks for this nice package! I gave it a try and it worked smoothly for me.

I have a relatively large BibTeX file with over 1,000 entries (and 10,000 lines), and I have been using ivy-bibtex. bibtex-actions takes a noticeable time to read/load the BibTeX file every time, while ivy-bibtex takes a noticeable time only the first time (assuming that the BibTeX file has not been changed). I guess this is because ivy-bibtex caches the BibTeX file after reading it the first time. Thus, IMHO, it would be great if bibtex-actions had a similar caching mechanism.

Originally posted by @wenjie2wang in https://github.com/bdarcus/bibtex-actions/discussions/68

bdarcus commented 3 years ago

@wenjie2wang can you please see here?

https://github.com/bdarcus/bibtex-actions/pull/70#issuecomment-817148742

It should be working here.

wenjie2wang commented 3 years ago

Thanks for your quick response! FYI, my bib file is available at https://gitlab.com/wenjie2wang/bibrary/-/raw/master/bib/index.bib.

bdarcus commented 3 years ago

OK, thanks.

I think I actually know what's going on. Will check in a bit.

bdarcus commented 3 years ago

I just loaded your file, and every time I load a command, there's a pause, but it's certainly less than a second.

What do you mean by "noticeable time"?

bdarcus commented 3 years ago

That said, I do think we've identified a performance bottleneck, that when fixed should address this.

wenjie2wang commented 3 years ago

> I just loaded your file, and every time I load a command, there's a pause, but it's certainly less than a second.
>
> What do you mean by "noticeable time"?

Thanks, @bdarcus ! This is what I mean by "noticeable time". There is always a noticeable pause after M-x bibtex-actions-open. In contrast, M-x ivy-bibtex is able to respond instantly after the first run. I made a gif for your reference.
[Screen recording: Peek 2021-04-10 12-23]

bdarcus commented 3 years ago

OK, thanks much! That's what I see.

Not terrible, but I'd like performance to be consistently very good to excellent.

If you see the linked PR, I've figured out what causes this, but not yet why, or what can be done about it.

bdarcus commented 3 years ago

I've been digging into this (doing benchmarking and profiling), and I can't see an obvious way to improve this on my end without unnecessary complexity. But I do see that a simple change in bibtex-completion could address this.

The situation

We have two functions that generate this UI:

  1. bibtex-completion-candidates: this one is cached, and is fast
  2. bibtex-actions--get-candidates: this one is not cached, and is slower, because it transforms the output of 1 to do the display formatting

I need 2, though, because completing-read, unlike helm and ivy, has no notion of display transformers. So completing-read makes no distinction between the string used for searching/filtering and the string that is displayed.

So part of the problem is intrinsic to completing-read.

The other part is that bibtex-completion-candidates is designed around the above feature that is in helm and ivy, but not in completing-read.
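To illustrate the constraint, here is a minimal sketch of the completing-read pattern described above: the display strings have to be built up front, and the selection is mapped back to the entry key afterwards. All `my/` names and the toy data are hypothetical and are not part of bibtex-completion or bibtex-actions.

```elisp
;; Sketch only: every `my/' name below is hypothetical.
(defvar my/entries
  '(("doe2020" . ((author . "Doe, J.") (title . "On Caching") (year . "2020")))
    ("roe2019" . ((author . "Roe, R.") (title . "Fast Lookups") (year . "2019"))))
  "Toy alist of (KEY . FIELDS) standing in for parsed BibTeX entries.")

(defun my/format-entry (entry)
  "Return a display string for ENTRY, a (KEY . FIELDS) pair."
  (let ((fields (cdr entry)))
    (format "%-12s %-4s %s"
            (alist-get 'author fields)
            (alist-get 'year fields)
            (alist-get 'title fields))))

(defun my/display-candidates (entries)
  "Return an alist of (DISPLAY-STRING . KEY) built from ENTRIES.
This is the formatting pass that runs on every call unless it is cached."
  (mapcar (lambda (entry) (cons (my/format-entry entry) (car entry)))
          entries))

(defun my/read-key ()
  "Prompt with `completing-read' and return the chosen entry's key."
  (let* ((candidates (my/display-candidates my/entries))
         (choice (completing-read "Reference: " candidates nil t)))
    ;; The string the user filtered on is the same string that was shown,
    ;; so it has to be looked up again to recover the key.
    (cdr (assoc choice candidates))))
```

Because the same string is both matched and shown, the formatting cost lands in the candidate-building step rather than in a separate display step.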

Proposed solution

Ideally, however, I'd be able to replace the candidate strings in bibtex-completion-candidates with the output of bibtex-actions--get-candidates. That would allow all of the display data to be cached, and so address the pause you note.

@tmalsburg - would it be possible to add a hook to bibtex-completion (a la the sort of thing you see in org-mode) that would allow me to specify a different function to generate that string? Something like:

(add-hook 'bibtex-completion-candidates-hook #'bibtex-actions--format-candidate)

If yes, that would be a simple and elegant solution to this issue.
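As a rough sketch of what such an extension point might look like, the code below uses a function-valued variable rather than a normal hook, since `run-hooks` discards the return values of hook functions. Every name here is hypothetical; nothing like this is confirmed to exist in bibtex-completion.

```elisp
;; Hypothetical sketch: none of these names exist in bibtex-completion.
(defun my/default-format (fields)
  "Join all field values of FIELDS into one searchable string."
  (mapconcat #'cdr fields " "))

(defun my/compact-format (fields)
  "Format FIELDS as \"AUTHOR (YEAR) TITLE\" for display."
  (format "%s (%s) %s"
          (cdr (assoc "author" fields))
          (cdr (assoc "year" fields))
          (cdr (assoc "title" fields))))

(defvar my/candidate-format-function #'my/default-format
  "Function called with an entry's field alist; returns its candidate string.")

(defun my/make-candidate (fields)
  "Return (STRING . FIELDS), with STRING built by the configured formatter."
  (cons (funcall my/candidate-format-function fields) fields))
```

A front end would then opt in with `(setq my/candidate-format-function #'my/compact-format)`, and whatever caching the candidate builder already does would store the formatted strings.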

Edit: per below, however, I decided to add my own cache.

Conclusion

In the end, I don't think the slight delay is a deal breaker, but I would ideally like to address it without having to write my own caching code.

PS - I have tried different ways to speed up bibtex-actions--get-candidates, but none made an appreciable difference, so I don't think that's a promising path. It's basically fast enough, even for very large libraries, so long as we can activate caching on it.

bdarcus commented 3 years ago

I've opened a linked PR that adds a cache. A hook could still be useful, but if this works, it's a small enough addition that I'm happy to merge it.
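For readers wondering what a cache of this kind can look like, here is a minimal sketch that rebuilds the formatted candidates only when the file's modification time changes. It is an illustration under assumptions, not the code from the linked PR; all `my/` names are hypothetical, and the "slow" build step is a stand-in that just splits the file into lines.

```elisp
;; Sketch only: every `my/' name below is hypothetical.
(defvar my/candidates-cache nil
  "Cons of ((FILE MODTIME) . CANDIDATES) from the last build, or nil.")

(defvar my/build-count 0
  "Number of times candidates were rebuilt (for demonstration only).")

(defun my/build-candidates (file)
  "Stand-in for the slow formatting pass: return the non-empty lines of FILE."
  (setq my/build-count (1+ my/build-count))
  (with-temp-buffer
    (insert-file-contents file)
    (split-string (buffer-string) "\n" t)))

(defun my/cached-candidates (file)
  "Return candidates for FILE, rebuilding only when FILE changed on disk."
  (let ((key (list file
                   (file-attribute-modification-time (file-attributes file)))))
    ;; Rebuild when there is no cache yet, or the file or its mtime differs.
    (unless (and my/candidates-cache
                 (equal (car my/candidates-cache) key))
      (setq my/candidates-cache
            (cons key (my/build-candidates file))))
    (cdr my/candidates-cache)))
```

This version remembers a single file; a setup with several bibliography files would more likely keep a hash table keyed by file name.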