jkitchin / org-ref

org-mode modules for citations, cross-references, bibliographies in org-mode and useful bibtex tools to go with it.
GNU General Public License v3.0

Isolated bibliographies for different projects #1079

Open pwolper opened 1 year ago

pwolper commented 1 year ago

Hi jkitchin and community,

thanks for the amazing work you have done on org-ref. I have been using it for about half a year and like the way it works a lot.

As an undergraduate academic I have a few different bibliographies that I am working on simultaneously (by subject and/or project). Adding all of them to my bibtex-completion-bibliography and library paths works on my system, but I would love to be able to export them or share them with my colleagues. This might be something other people can relate to as well. One idea is sharing org-ref bibliographies (directories for PDFs, BibTeX files, etc.) over a version control system like git. I have that set up already for backup purposes. Using a git client even lets me use my bibliography on other devices, in case I need a quick glance at a PDF. Using git for org-ref bibliographies would let other people quickly clone my bibliography in case they need it for a shared project. Is there a way to include bibliographies from git in the org-ref system? Maybe we could develop a function that uses a relative path for bibtex-completion-library-path, such as to a ~/biblio directory included in git projects. Then maybe the [[cite:&name-yyyy-foo-bar]] links would become usable in READMEs. I don't know enough about org-ref for that. Being able to share bibliographies between configured Emacs setups could be helpful for collaborative research.

I would love to start with a Lisp function that can select the BibTeX entries cited in a longer org file. Similar to the way LaTeX is able to build a reference list from [[printbibliography:]] in org, is there a way to extract the bib entries for cited articles, such as a final list of references I want to include before publishing my report or repository? This function could match the names of the BibTeX entries from my BibTeX file and copy the PDFs (potentially notes too) into an accompanying directory. This could also be used to split bibliographies in case they get too large.

Sadly I don't know enough Lisp yet to write such a function; maybe it isn't even that difficult. If someone is interested in working on such a feature (or providing some helpful tips on how to attempt this), feel free to contact me at philip.wolper@gmail.com. I'd be very interested in working on some solutions.

Best, Philip

jkitchin commented 1 year ago

I don't think there is a way to integrate org-ref with git this way. The right thing to do is just clone the git repo locally, and use a relative path in your org file, or a directory local variable that points to it.

There are already functions to extract the entries cited in an org-file: one of these does some of what you want. I don't think it copies the pdfs anywhere though.

org-ref-extract-bibtex-to-file

org-ref-extract-bibtex-entries

org-ref-extract-bibtex-blocks

pwolper commented 1 year ago

Thanks, these functions are a start. I should be able to write a lisp function to extract the pdf filenames accordingly.

pwolper commented 1 year ago

Would you have any ideas for making cite:& citations double as links with relative paths to the PDFs in the git project, so that readers of the README, for example, can open the PDFs? I am trying to keep a copy of the bibliographies I use for any given project bundled with the project repos.

jkitchin commented 1 year ago

The way to make cite links work to pdfs is to have a file or directory local definition of bibtex-completion-library-path. This variable is where bibtex-completion looks for a pdf.
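A minimal sketch of the directory-local approach, assuming the project repository keeps its bibliography in biblio/references.bib and its PDFs in biblio/pdfs/ (both paths are hypothetical, adjust them to your layout). It goes in a .dir-locals.el file at the repository root, so anyone who clones the repo gets the same settings. Emacs will ask for confirmation before evaluating the `eval` form.

```emacs-lisp
;; .dir-locals.el at the root of the project repository.
;; Assumed (hypothetical) layout:
;;   <project>/biblio/references.bib
;;   <project>/biblio/pdfs/KEY.pdf
((org-mode
  . ((eval
      . (let ((root (locate-dominating-file default-directory
                                            ".dir-locals.el")))
          ;; Bibliography file(s) used for completion in this project.
          (setq-local bibtex-completion-bibliography
                      (list (expand-file-name "biblio/references.bib" root)))
          ;; Directory where bibtex-completion looks for KEY.pdf.
          (setq-local bibtex-completion-library-path
                      (list (expand-file-name "biblio/pdfs/" root))))))))
```

The paths are expanded relative to the directory that holds .dir-locals.el, so the settings do not depend on where the repository was cloned.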

pwolper commented 1 year ago

These kinds of links work for me. But there is probably no way of doubling the cite:& links as [[foobar]]-type links, as used by readme.org files, or as markdown-flavoured links. That way the links could be used on repo pages.

jkitchin commented 1 year ago

The way they are rendered on GH pages is independent of org-mode; I think it is done by some Ruby library.

If you are using emacs to generate the pages, you could write a special backend I think to convert the citations to urls that point to some repo.

pwolper commented 1 year ago

Yeah, I was thinking more of the GitHub pages, so that's that, I guess.

But I wrote a function that extracts the PDFs from bibtex-completion-library-path for the keys in a prompted biblio.bib file.

;; `extract-cited-pdfs' reads the BibTeX keys from a .bib file and copies the
;; PDFs with matching names from the first directory in
;; `bibtex-completion-library-path', as set in your org-ref config. It can be
;; used together with `org-ref-extract-bibtex-to-file', which creates a .bib
;; file from an org document.
;; Author: Philip L. Wolper, GitHub: github.com/pwolper

(defun extract-cited-pdfs ()
  "Copy the PDFs named after the keys of a BibTeX file into a new folder.
Prompt for the BibTeX file and an output directory.  PDFs are looked up
as KEY.pdf in the first directory of `bibtex-completion-library-path'."
  (interactive)
  (let* ((bib-file (read-file-name "BibTeX file: "))
         (output-dir (read-directory-name "Output directory for PDFs: "))
         ;; e.g. biblio.bib -> OUTPUT-DIR/biblio-pdfs/
         (output-folder (expand-file-name
                         (concat (file-name-base bib-file) "-pdfs")
                         output-dir))
         ;; The variable may hold a single directory or a list of them.
         (pdf-dir (if (listp bibtex-completion-library-path)
                      (car bibtex-completion-library-path)
                    bibtex-completion-library-path))
         ;; Matches "@type{key," at the start of a line; group 1 is the key.
         (key-regexp
          "^[ \t]*@[[:alnum:]]+[ \t]*[{(][ \t]*\\([^, \t\n]+\\)[ \t]*,")
         (references '()))

    ;; Step 1: collect the BibTeX keys.
    (with-temp-buffer
      (insert-file-contents bib-file)
      (while (re-search-forward key-regexp nil t)
        (push (match-string 1) references)))
    (setq references (nreverse references))
    (message "References: %s" references)

    ;; Step 2: create the output folder if it does not exist yet.
    (make-directory output-folder t)

    ;; Step 3: copy every KEY.pdf that exists in PDF-DIR.
    (dolist (key references)
      (let ((pdf-file (expand-file-name (concat key ".pdf") pdf-dir)))
        (when (file-exists-p pdf-file)
          (copy-file pdf-file
                     (expand-file-name (concat key ".pdf") output-folder)
                     t t)
          (message "Copied %s" pdf-file))))

    (message "PDF extraction complete.")))

Maybe it would be useful alongside the org-ref-extract-bibtex-to-file, org-ref-extract-bibtex-entries and org-ref-extract-bibtex-blocks functions.

jkitchin commented 1 year ago

what do you think of https://github.com/jkitchin/org-ref/commit/26c06912c7833104c7b4c7b96b8f200e98067a68#diff-7b61d0f80eb55fdf9a5f5e0ae44ec3f417d96bfab30073f9de771967691cff72R1262?