org-roam / org-roam-bibtex

Org Roam integration with bibliography management software
GNU General Public License v3.0

Performance issue: org-roam database gets repeatedly queried during bibtex-completion parsing #156

Closed MichielCottaar closed 3 years ago

MichielCottaar commented 3 years ago

While parsing the bibliography (bibtex-completion-parse-bibliography), bibtex-completion checks for each entry whether it has a note file (in bibtex-completion-prepare-entry). As a result, orb-find-note-file is called once for every entry in the bibtex file, which can be very slow for large bibtex files because of the database queries in orb-find-note-file.

I managed to fix this in my setup by skipping the note lookup in bibtex-completion-prepare-entry:

(defun bibtex-completion-prepare-entry (entry &optional fields do-not-find-pdf)
  "Prepare ENTRY for display.
ENTRY is an alist representing an entry as returned by
`parsebib-read-entry'.  All the fields not in FIELDS are removed
from ENTRY, with the exception of the \"=type=\" and \"=key=\"
fields.  If FIELDS is empty, all fields are kept.  Also add a
=has-pdf= and/or =has-note= field, if they exist for ENTRY.  If
DO-NOT-FIND-PDF is non-nil, this function does not attempt to
find a PDF file."
  (when entry ; entry may be nil, in which case just return nil
    (let* ((fields (when fields (append fields (list "=type=" "=key=" "=has-pdf=" "=has-note="))))
           ; Check for PDF:
           (entry (if (and (not do-not-find-pdf) (bibtex-completion-find-pdf entry))
                      (cons (cons "=has-pdf=" bibtex-completion-pdf-symbol) entry)
                    entry))
           (entry-key (cdr (assoc "=key=" entry)))
           ; Check for notes:
           ;;(entry (if (cl-some #'identity
           ;;                    (mapcar (lambda (fn)
           ;;                              (funcall fn entry-key))
           ;;                            bibtex-completion-find-note-functions))
           ;;           (cons (cons "=has-note=" bibtex-completion-notes-symbol) entry)
           ;;         entry))
           ; Remove unwanted fields:
           (entry (if fields
                       (--filter (member-ignore-case (car it) fields) entry)
                    entry)))
      ;; Normalize case of entry type:
      (setcdr (assoc "=type=" entry) (downcase (cdr (assoc "=type=" entry))))
      ;; Remove duplicated fields:
      (bibtex-completion-remove-duplicated-fields entry))))

myshevchuk commented 3 years ago

Hi! Thanks a lot for the report! I have always wondered why parsing my master bib file with bibtex-completion is so slow when the raw parsing with parsebib takes a fraction of a second.

On ORB's side, I can try to optimize the orb-find-note-file function so that it builds a cache and queries that instead of the database.
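A minimal sketch of what such a cache could look like, not ORB's actual implementation. All `my/` names are hypothetical, and `my/orb-query-all-notes` is a stand-in that returns hard-coded data where a real version would issue a single bulk query against the org-roam database. The idea is to pay for one query up front and answer every per-entry lookup from a hash table.

```elisp
;; All names below are hypothetical; none of them are part of ORB.

(defvar my/orb-note-cache nil
  "Hash table mapping citekeys to note file paths, or nil if not yet built.")

(defun my/orb-query-all-notes ()
  "Stand-in for one bulk database query.
Return a list of (CITEKEY . FILE) pairs.  The data here is
hard-coded for illustration only."
  '(("doe2020" . "/notes/doe2020.org")
    ("roe2019" . "/notes/roe2019.org")))

(defun my/orb-build-note-cache ()
  "Fill `my/orb-note-cache' from a single bulk query."
  (let ((table (make-hash-table :test #'equal)))
    (dolist (pair (my/orb-query-all-notes))
      (puthash (car pair) (cdr pair) table))
    (setq my/orb-note-cache table)))

(defun my/orb-find-note-file-cached (citekey)
  "Return the note file for CITEKEY, or nil if there is none.
The cache is built lazily on the first lookup."
  (unless my/orb-note-cache
    (my/orb-build-note-cache))
  (gethash citekey my/orb-note-cache))

(defun my/orb-invalidate-note-cache ()
  "Drop the cache so that the next lookup rebuilds it."
  (setq my/orb-note-cache nil))
```

A cache like this has to be invalidated whenever notes are added, removed, or renamed; calling `my/orb-invalidate-note-cache` after the database is updated would be one way to do that.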

On your side, you can remove unused functions from bibtex-completion-find-note-functions to speed things up further. I could do it in ORB too, but that would not be very transparent to the user.
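As a rough illustration of trimming that list, the sketch below assumes that notes are managed only through ORB, so the two file-based finders that bibtex-completion ships with are never needed. The variable `my/unused-note-finders` is hypothetical, and the leading `defvar` is only a stand-in so that the snippet runs without bibtex-completion loaded.

```elisp
;; Stand-in value for illustration; `defvar' leaves the variable
;; untouched if bibtex-completion has already defined it.
(defvar bibtex-completion-find-note-functions
  '(bibtex-completion-find-note-multiple-files
    bibtex-completion-find-note-one-file
    orb-find-note-file))

;; Hypothetical variable: the finders this setup never uses.
(defvar my/unused-note-finders
  '(bibtex-completion-find-note-multiple-files
    bibtex-completion-find-note-one-file)
  "Note finder functions to drop from `bibtex-completion-find-note-functions'.")

;; Remove each unused finder; `remq' does not modify the original list.
(dolist (fn my/unused-note-finders)
  (setq bibtex-completion-find-note-functions
        (remq fn bibtex-completion-find-note-functions)))
```

In a real configuration the stand-in `defvar` would be dropped and the `dolist` form would probably need to run after bibtex-completion has loaded, for example inside `with-eval-after-load`, so that the package's own default does not replace the trimmed list.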