tmalsburg / helm-bibtex

Search and manage bibliographies in Emacs
GNU General Public License v2.0

Control "do-not-find-pdf" argument with a variable #350

Open ashiklom opened 3 years ago

ashiklom commented 3 years ago

For moderately large libraries (mine is ~3 MB) with lots of PDF files, the call to bibtex-completion-find-pdf in bibtex-completion-prepare-entry becomes a significant bottleneck (see profile under "Details"). It would be nice to be able to disable it with a variable such as bibtex-completion-do-not-find-pdf. An easy fix would be to change these lines:

https://github.com/tmalsburg/helm-bibtex/blob/master/bibtex-completion.el#L824-L826

...to something like this:

           (entry (if (and (not do-not-find-pdf)
                           (not bibtex-completion-do-not-find-pdf)
                           (bibtex-completion-find-pdf entry))
                      (cons (cons "=has-pdf=" bibtex-completion-pdf-symbol) entry)
                    entry))

The default for bibtex-completion-do-not-find-pdf would be nil to preserve existing behavior.
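For illustration, the proposed option could be declared roughly as follows. This is only a sketch of the suggestion in this issue: bibtex-completion-do-not-find-pdf is the hypothetical variable named above and does not exist in bibtex-completion, and the docstring wording is an assumption.

```elisp
;; Hypothetical option proposed in this issue; it is NOT part of
;; bibtex-completion.
(defcustom bibtex-completion-do-not-find-pdf nil
  "If non-nil, skip the PDF lookup when preparing entries.
The default nil preserves the existing behavior."
  :group 'bibtex-completion
  :type 'boolean)

;; Users with large libraries could then opt out in their init file:
(setq bibtex-completion-do-not-find-pdf t)
```

With the variable set to t, the `and` form in the modified lines would short-circuit before bibtex-completion-find-pdf is ever called, so no entry would get the "=has-pdf=" marker.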

```
- bibtex-completion-candidates                        1713  67%
 - apply                                              1713  67%
  - #                                                 1713  67%
   - bibtex-completion-parse-bibliography              957  37%
    - bibtex-completion-prepare-entry                  786  30%
     - bibtex-completion-find-pdf                      689  27%
      - bibtex-completion-find-pdf-in-library          689  27%
       - -first                                        681  26%
          f-file?                                      681  26%
       + bibtex-completion-get-value                     3   0%
       + f-join                                          3   0%
       + s-concat                                        2   0%
     + mapcar                                           88   3%
     + bibtex-completion-remove-duplicated-fields        8   0%
       member-ignore-case                                1   0%
    + parsebib-read-entry                              168   6%
      parsebib-find-next-item                            1   0%
   + insert-file-contents                              576  22%
```

tmalsburg commented 3 years ago

Hey, thanks for sending the profiling data. Yes, the code for finding PDFs is in desperate need of a rewrite. Ideally there would be a plug-in infrastructure that people can use to mix and match various ways of locating PDFs according to their needs. This would address a whole bunch of issues here: if you're not referencing PDFs in some particular way, you'd just switch off that plug-in and things would become faster as a result. Currently, the code tries all methods for locating PDFs no matter what.

But some algorithms for locating PDFs could also be made more efficient. I just looked at the code for finding PDFs with the name BibTeX-key.pdf and see that it's touching the filesystem much more than necessary. The time spent in f-file? can be cut at least in half. This line is the worst offender, I think.

So in sum, I think there are better opportunities to shave off some processing time than introducing new customization variables as a stop-gap solution. I will try to make some improvements next week (but if you'd like to have a shot at it, you'd be most welcome to submit a PR).
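To make the two ideas above concrete, here is a rough sketch of how a switchable list of finder functions and a cheaper BibTeX-key.pdf lookup might fit together. It is not the maintainer's implementation: every `my/` name is hypothetical, the `~/papers/` directory is an assumption, and listing the directory once is just one possible way to touch the filesystem less.

```elisp
;; All `my/' names below are hypothetical; none of them exist in
;; bibtex-completion.

(defvar my/pdf-directory "~/papers/"
  "Hypothetical directory holding PDFs named BibTeX-key.pdf.")

(defvar my/pdf-index nil
  "Hash table mapping BibTeX keys to PDF paths, or nil if not built yet.")

(defun my/build-pdf-index (dir)
  "Return a hash table of KEY -> path for every KEY.pdf in DIR.
The directory is listed once instead of probing one file per entry."
  (let ((index (make-hash-table :test #'equal)))
    (when (file-directory-p dir)
      (dolist (path (directory-files dir t "\\.pdf\\'"))
        (puthash (file-name-base path) path index)))
    index))

(defun my/find-pdf-by-key (key)
  "Finder: return the path of KEY.pdf from the cached index, or nil."
  (unless my/pdf-index
    (setq my/pdf-index (my/build-pdf-index my/pdf-directory)))
  (gethash key my/pdf-index))

(defvar my/pdf-finder-functions '(my/find-pdf-by-key)
  "Finder functions tried in order; each takes a key and returns a path or nil.
Removing a function from this list switches that lookup method off.")

(defun my/find-pdf (key)
  "Return the first non-nil result of `my/pdf-finder-functions' for KEY."
  (run-hook-with-args-until-success 'my/pdf-finder-functions key))
```

Calling `(my/find-pdf "smith2020")` would return the path of `smith2020.pdf` if it exists in the directory, and nil otherwise. The trade-off of this sketch is that the index goes stale when PDFs are added or removed, so real code would need to reset `my/pdf-index` at some point.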

The 576 ms for insert-file-contents is a big chunk, but I'm afraid we won't get rid of it. That's just Emacs being slow when reading the bibliography file. (Although it may help to force this buffer to fundamental mode if it's not in that mode yet.)
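To check how much of that time is the raw read on a given machine, a small timing helper along these lines may be useful. It is a sketch only: `my/time-bibliography-read` is a hypothetical name, and it measures a plain read into a temporary buffer, which may differ from how bibtex-completion sets up its own buffer.

```elisp
(require 'benchmark)

(defun my/time-bibliography-read (file)
  "Return the seconds needed to read FILE into a temporary buffer.
`my/time-bibliography-read' is a hypothetical helper, not part of
helm-bibtex or bibtex-completion."
  (car (benchmark-run 1
         (with-temp-buffer
           ;; Temporary buffers start in fundamental mode, so no major-mode
           ;; setup runs as part of the read.
           (insert-file-contents file)))))
```

For example, `(my/time-bibliography-read "~/references.bib")` (the path is an assumption) gives the elapsed wall-clock time for one read, which can be compared against the profiler figure above.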

Other than that I highly recommend switching to Emacs' new native-comp branch. On my system, helm-bibtex is about 3 times faster with native compilation.
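Native compilation has since been merged and ships with Emacs 28 and later. A quick way to check whether a running Emacs supports it might look like this (a minimal sketch; `my/native-comp-usable-p` is a hypothetical helper name):

```elisp
(defun my/native-comp-usable-p ()
  "Return non-nil if this Emacs can compile Lisp to native code.
`my/native-comp-usable-p' is a hypothetical helper name."
  ;; `native-comp-available-p' only exists in Emacs 28 and later, so
  ;; guard the call to keep this safe on older versions.
  (and (fboundp 'native-comp-available-p)
       (native-comp-available-p)))
```

Evaluating `(my/native-comp-usable-p)` returns nil on builds without native compilation support, in which case the speed-up described above would not apply.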