tmalsburg / helm-bibtex

Search and manage bibliographies in Emacs
GNU General Public License v2.0

allow for more formats (other than APA) when converting entries to strings #235

Open timmli opened 6 years ago

timmli commented 6 years ago

I guess this is a feature request.

Right now, helm-bibtex only lets you stringify an entry in APA style when choosing "Insert reference". It would be great to be able to choose from a larger set of styles, or even to easily define your own style (something that JabRef allows for).

I also noticed that helm-bibtex always treats fields as obligatory, so that I have to post-process the reference string now and then. Optional fields like editors in proceedings should be printed only when they exist.

Thanks for this excellent package!

tmalsburg commented 6 years ago

I'd be happy to include this capability but will probably not be able to work on this myself. If someone wants to give it a shot (should be easy actually) and prepare a PR, I'd be happy to provide guidance.

jagrg commented 6 years ago

If you use org-ref, you can customise org-ref-formatted-citation-formats to define your own style (or look there for inspiration). I'd have to think about point 2. Maybe add a few more replace-regexp-in-string functions at the end of the bibtex-completion-apa-format-reference function to clean the string (remove missing page numbers etc.). WDYT?

tmalsburg commented 6 years ago

Ah, I overlooked the second point. True, the current formatting code is not very flexible. However, the APA specification is ridiculously complex and it's actually not trivial to fully cover it. Easy improvements are certainly possible, though.

timmli commented 6 years ago

I just wanted to let you know that I'm playing around with a more general approach that can process something like the following specification:

(defvar bibs-styles-alist
  '(("langsci" .
     (("incollection" .
       (concat
        (bibs-field entry 'author "${author}." '((maxbibnames . 99)
                                                 (name-order . first-last)
                                                 (last-and . " & ")))
        (bibs-field entry 'year " ${year}.")
        (bibs-field entry 'title " ${title}." '((case . sentence)))
        (bibs-field entry 'booktitle " In")
        (bibs-field entry 'editor " ${editor}" '((maxbibnames . 99)
                                                 (name-order . first-last)
                                                 (last-and . " & ")))
        (bibs-field-ifplural entry 'editor " (eds.)" " (ed.)")
        (bibs-field entry 'editor ",")
        (bibs-field entry 'booktitle " ${booktitle}" '((case . sentence)))
        (bibs-field entry 'series " (${series})")
        (bibs-field entry 'pages ", ${pages}")
        (bibs-field entry 'booktitle ".")
        (bibs-field entry 'publisher " ${publisher}")
        (bibs-field entry 'address ": ${address}")
        )))))
  "A two-dimensional alist of the form (STYLE . (TYPE . FORMAT))")

bibs-field (entry field string &optional formatters) returns a formatted string of FIELD in ENTRY according to STRING using the formatting restrictions in FORMATTERS.
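
A minimal sketch of how such a bibs-field could handle optional fields. It assumes an entry is an alist of (NAME . VALUE) string pairs; the body is illustrative only, not the actual implementation, and it accepts FORMATTERS without applying them:

```elisp
;; Hypothetical sketch: `bibs-field' is not part of helm-bibtex.
(require 'subr-x) ; provides `string-empty-p' on older Emacs versions

(defun bibs-field (entry field template &optional _formatters)
  "Return TEMPLATE with ${FIELD} replaced by the value of FIELD in ENTRY.
ENTRY is assumed to be an alist of (NAME . VALUE) string pairs and
FIELD a symbol.  Return the empty string when FIELD is missing or
empty, so that optional fields leave no trace in the reference.
_FORMATTERS is accepted for compatibility but ignored in this sketch."
  (let ((value (cdr (assoc (symbol-name field) entry))))
    (if (or (null value) (string-empty-p value))
        ""
      ;; FIXEDCASE and LITERAL are both t: insert VALUE verbatim.
      (replace-regexp-in-string
       (regexp-quote (format "${%s}" field))
       value template t t))))

;; Usage: the entry has no editor, so the editor part simply vanishes.
(let ((entry '(("title" . "Decision theory") ("year" . "2020"))))
  (concat (bibs-field entry 'editor " ${editor} (ed.),")
          (bibs-field entry 'year " ${year}.")
          (bibs-field entry 'title " ${title}.")))
;; => " 2020. Decision theory."
```

Because a template without a placeholder is returned unchanged when the field exists, calls like (bibs-field entry 'booktitle " In") work as "print this only if the field is present".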

tmalsburg commented 6 years ago

Wow, this looks like it could be really powerful and convenient. Have you checked whether something similar already exists in related Emacs packages (ebib, reftex, bibtex.el, ...)?

timmli commented 6 years ago

I've looked into those packages to some extent (not reftex though). Nothing similar found so far.

tmalsburg commented 6 years ago

I'm not aware of anything, but it's always surprising how much already exists in the Emacs ecosystem. Org-ref may be another thing worth checking.

jagrg commented 6 years ago

Another idea, if you need specific and more complex styles, is to use citeproc. With citeproc-org installed, you can try the function below to create formatted bibliographies of one or more (marked) candidates. It uses chicago-author-date.csl by default (cc: @andras-simonyi).

(defun helm-bibtex-insert-citeproc-reference (_candidate)
  (with-temp-buffer
    (cl-loop for candidate in (helm-marked-candidates)
             do (insert (concat bibtex-completion-cite-default-command ":" candidate " ")))
    (insert "\n\n")
    (let ((org-export-before-parsing-hook '(citeproc-org-render-references))
          (citeproc-org-org-bib-header ""))
      (org-export-to-buffer 'ascii "*Org ASCII Export*" nil nil nil t nil)
      (save-excursion
        (re-search-forward "^$" nil t)
        (delete-region (point-min) (point))
        (delete-lines 1)))))

(helm-add-action-to-source "Insert citeproc reference" 'helm-bibtex-insert-citeproc-reference helm-source-bibtex N)

andras-simonyi commented 6 years ago

Hello and thanks to @jagrg for bringing this up and cc-ing me! Another alternative could be to use the citeproc-el library directly along the following lines:

(defvar helm-bibtex-csl-style "/path/to/csl-file")

(defvar helm-bibtex-csl-locale-dir "/path/to/csl-locales-dir")

(defun helm-bibtex-insert-citeproc-reference (_candidate)
  (let* ((locale-getter (citeproc-locale-getter-from-dir helm-bibtex-csl-locale-dir))
         (item-getter (citeproc-itemgetter-from-bibtex helm-bibtex-bibliography))
         (proc (citeproc-create helm-bibtex-csl-style item-getter locale-getter))
         (cites (mapcar (lambda (x) (citeproc-citation-create :cites `(((id . ,x)))))
                        (helm-marked-candidates))))
    (citeproc-append-citations cites proc)
    (insert (car (citeproc-render-bib proc 'plain)))))

(helm-add-action-to-source "Insert citeproc reference" 'helm-bibtex-insert-citeproc-reference helm-source-bibtex)

This would insert a plain text bibliography, but citeproc-el also supports org-mode, html and latex as output formats, so the format could be adapted to the mode of the current buffer.

jagrg commented 6 years ago

Thanks, this is going to be very useful. So I suppose with this we can use any citation style? Very nice. There's one problem with your function though: citeproc-itemgetter-from-bibtex fails if helm-bibtex-bibliography is a list of files.

andras-simonyi commented 6 years ago

Yes, any independent CSL style -- dependent styles, which are only aliases to other styles, are not (yet) supported. (All the styles in the root of the linked repo are independent ones.) As for the case when helm-bibtex-bibliography is a list of bibtex files, I've just committed some changes to the master of citeproc-el to deal with this; the new version should be available in MELPA when the next build is finished.

jagrg commented 6 years ago

Thanks, it's working. Could you also explain how to configure FORMATTING-PARAMETERS in citeproc-render-bib?

andras-simonyi commented 6 years ago

FORMATTING-PARAMETERS is an alist which is returned together with the formatted bibliography. It contains additional formatting information provided by the CSL style in use (line spacing etc.), which can be used by the integrating program, e.g. via style sheets. If you want to change some aspects of the returned bibliography (the car of the returned pair) and none of the existing formatters are suitable, then you can define and register a new one -- see citeproc-formatters.el for examples. I don't know what you want to change, but it might be enough to change the bib slot of an existing formatter. (Of course, I'm open to changes to the built-in formatters as well, I'm sure they can be improved upon.)

jagrg commented 6 years ago

Sorry, I'm still confused. Using your function above, I want to add a new line between entries. It's not clear where this alist of additional formatting information goes.

andras-simonyi commented 6 years ago

I see -- unfortunately, the easiest way to do that currently would be to define a new formatter, since the built-in plain formatter has a hard-coded single "\n" separator between references. Actually, when I wrote the formatter code I was torn between this solution and simply returning a list of references, so I can switch to that if it would be helpful for you. Then you could simply mapconcat the returned list of references using your preferred separator. (As for the alist, it is unfortunately not an argument of the function citeproc-render-bib; it is part (the cdr) of its return value. I'll try to make this clearer in the documentation, because I can see that it can be very confusing.)
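
As a stopgap that avoids defining a new formatter, the returned pair can be post-processed. The sketch below works on a stand-in value rather than calling citeproc-el, and assumes what is described above: the car is the bibliography string with a single "\n" between references, and the cdr is the parameter alist. The function name and the alist key in the example are hypothetical, and the approach breaks if a reference itself contains a newline:

```elisp
(defun my/bib-with-separator (rendered separator)
  "Re-join the references in RENDERED using SEPARATOR.
RENDERED is assumed to be a (BIBLIOGRAPHY . PARAMETERS) pair, such as
the one described for `citeproc-render-bib' with the plain formatter,
where BIBLIOGRAPHY separates references by a single newline.
The PARAMETERS alist (the cdr) is not needed here and is ignored."
  (mapconcat #'identity
             (split-string (car rendered) "\n" t) ; t drops empty strings
             separator))

;; Usage with a stand-in value (the alist key is made up):
(my/bib-with-separator
 '("Doe, J. 2019. First title.\nRoe, R. 2020. Second title."
   . ((hypothetical-line-spacing . 1)))
 "\n\n")
```

In the action above, this would replace (car (citeproc-render-bib proc 'plain)) with a call that passes the whole return value through my/bib-with-separator before inserting it.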

jagrg commented 6 years ago

I think it makes sense to add a new line between entries (that's the case for the org formatter, correct?), or at least give users the option to use something else.

andras-simonyi commented 6 years ago

Fair enough, thanks for the input. I've switched the default separator to "\n\n" for now (in master) and will give some thought to the problem of making this more flexible (without forcing the user into defining a new formatter). -- Update: I've done so, see commit https://github.com/andras-simonyi/citeproc-el/commit/af005f1c9b4cdda3b6f85f8d81207a4e0528a48d .

tmalsburg commented 6 years ago

Sorry for chiming in so late. Citeproc looks really impressive and the ability to produce citations in almost any format would be a fantastic addition. I'm not familiar with citeproc. Could someone please give me a quick summary of what we'd need to change in helm/ivy-bibtex to make use of citeproc's capabilities? Thanks!

jagrg commented 6 years ago

A lot of the helm-bibtex/org-ref code uses KEY and ENTRY quite heavily, so I'd say the first step is probably to be able to pass KEY and/or ENTRY from helm-bibtex to citeproc. As far as helm-bibtex goes, we could change the "Insert reference" action. Other than that, I see citeproc working with a bibliography:file.bib link to generate formatted bibliographies using the export dispatcher.

andras-simonyi commented 6 years ago

Unfortunately, I'm not familiar with the helm-bibtex codebase but now I've had a cursory look and indeed the "Insert reference" action could call citeproc-el to generate and insert a formatted bibliography from the selected items. It seems that the most efficient way of doing this would be to pass all (KEY, ENTRY) pairs for the selected items (in a citeproc itemgetter closure) since this information seems to be already available (no need for citeproc-el to parse the bibliography files). As for your remark:

"Other than that, I see citeproc working with a bibliography:file.bib link to generate formatted bibliographies using the export dispatcher."

Could you elaborate on that? Does helm-bibtex have functionality related to org-mode export? Or are you referring to some other type of export?

jagrg commented 6 years ago

I'm referring to the Org-mode exporter (C-c C-e ...). If we add a new link type (see org-link-set-parameters) that uses citeproc in the export parameter, could we manage to produce the bibliography using the export dispatcher?

"It seems that the most efficient way of doing this would be to pass all (KEY, ENTRY) pairs for the selected items (in a citeproc itemgetter closure) since this information seems to be already available (no need for citeproc-el to parse the bibliography files)."

Fantastic. I'd love to see this added to helm-bibtex.
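
For illustration, a rough sketch of the "item getter over already-parsed entries" idea. It assumes that an item getter is a function taking a list of keys and returning the matching (KEY . ENTRY) pairs; the exact contract citeproc-el expects, and the conversion from BibTeX fields to CSL items, are not spelled out in this thread and are left out here. All names are hypothetical:

```elisp
(defun my/lookup-items (pairs keys)
  "Return the (KEY . ENTRY) pairs from PAIRS whose key is in KEYS.
PAIRS is an alist keyed by citation key strings.  Unknown keys are
skipped and the result follows the order of KEYS."
  (delq nil (mapcar (lambda (key) (assoc key pairs)) keys)))

(defun my/make-itemgetter (pairs)
  "Return a function of one argument, KEYS, that looks items up in PAIRS.
No bibliography file is parsed; PAIRS is captured when the getter is made."
  (apply-partially #'my/lookup-items pairs))

;; Usage: build the getter once from the selected entries, call it later.
(let ((getter (my/make-itemgetter
               '(("doe2019" . (("title" . "First title")))
                 ("roe2020" . (("title" . "Second title")))))))
  (funcall getter '("roe2020" "unknown")))
;; => (("roe2020" ("title" . "Second title")))
```

apply-partially is used instead of a raw lambda so that the sketch behaves the same with or without lexical-binding.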

tmalsburg commented 6 years ago

Unfortunately, I do not currently have the time to work on this. Too much going on at work. However, if someone feels like preparing a PR, I'd be happy to give input/feedback on it. Otherwise, we'll leave this issue open until I find time to do it myself (not before the summer, I'm afraid).

andras-simonyi commented 6 years ago

Sorry for responding only now. As for producing bibliographies from bibliography:file.bib type org-mode links, this should certainly be doable with citeproc-org.

"However, if someone feels like preparing a PR, I'd be happy to give input/feedback on it."

Thanks, I plan to give it a try next month, after adding biblatex support to citeproc-el.

jagrg commented 6 years ago

Thanks, Andras. I think we could easily replace all bibtex-completion-apa-* functions with something like the above. We also need a function for adding new notes when all notes are stored in a single file (see bibtex-completion-notes-template-one-file and bibtex-completion-edit-notes).

publicimageltd commented 4 years ago

I'd like to refresh this discussion. I have the problem that I cannot add notes to reference entries which are not books, because they lack author and editor, yet these fields are mandatory for the creation of the title string of the note. Thus I get an error. If I try to create a note, Emacs complains about a nil value where a string was expected; the problem is that nil is passed as a value to split-string in bibtex-completion-apa-get-value.

A quick fix would be to simply allow empty author AND editor fields in the pcase of bibtex-completion-apa-get-value; a better solution would be to handle non-book cases through a generic mechanism, not unlike citation.

Thinking about it, another good idea would be to wrap the format string for the notes in a condition-case, so that the workflow is not inhibited in case there is some problem with the field definitions...
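
Roughly what I mean by the condition-case idea, as a standalone sketch. The function name and the fallback argument are made up; the formatting function stands in for whatever builds the note title:

```elisp
(defun my/safe-format (format-fn entry fallback)
  "Call FORMAT-FN on ENTRY; return FALLBACK if it signals an error.
The error is reported in the echo area instead of aborting the
command, so that note creation can continue."
  (condition-case err
      (funcall format-fn entry)
    (error
     (message "Could not format entry: %s" (error-message-string err))
     fallback)))

;; Usage: a formatter that fails like the case described above,
;; by handing nil to `split-string'.
(my/safe-format (lambda (entry)
                  (split-string (cdr (assoc "author" entry)) " and "))
                '(("title" . "Entscheidungstheorie"))
                "Untitled")
;; => "Untitled"
```

This only hides the symptom; the missing author/editor case would still need proper handling in the formatting code.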

tmalsburg commented 4 years ago

@publicimageltd I think what you're proposing makes sense, but it's a different issue from what is discussed in this thread, which is about plain-text citations, not note files. (Although there is some similarity.)

Just to better understand your request, what kind of non-book entries do you have in mind? I'm just wondering because I can't think of any document type that has neither author nor editor. In fact, author or editor information is mandatory for most common BibTeX entry types.

And are you aware that you can configure bibtex-completion-notes-template-one-file and bibtex-completion-notes-template-multiple-files so that they don't use authors or editors? Could that solve your problem?

publicimageltd commented 4 years ago

I use Zotero to import website references, which are then automatically exported to a global .bib file via betterbibtex. For Wikipedia entries, the result is something like this:

@inreference{Entscheidungstheorie2020,
  title = {Entscheidungstheorie},
  booktitle = {Wikipedia},
  date = {2020-03-25T02:09:59Z},
  url = {https://de.wikipedia.org/w/index.php?title=Entscheidungstheorie&oldid=198083504},
  urldate = {2020-05-15},
  abstract = {Die Entscheidungstheorie ist in der angewandten Wahrscheinlichkeitstheorie ein Zweig zur Evaluation der Konsequenzen von Entscheidungen. Die Entscheidungstheorie wird vielfach als betriebswirtschaftliches Instrument benutzt. Zwei bekannte Methoden sind die einfache Nutzwertanalyse (NWA) und der präzisere Analytic Hierarchy Process (AHP). In diesen Methoden werden Kriterien und Alternativen dargestellt, verglichen und bewertet, um die optimale Lösung einer Entscheidung oder Problemstellung finden zu können.},
  langid = {german},
  note = {Page Version ID: 198083504}
}

Nowadays, it seems quite common to use mere websites as sources, so I think the use case is not that unusual actually.

Regarding bibtex-completion-notes-: I am aware of these variables, and actually changed their values. But in normal cases, the default is quite sensible. So that's another possibility: to associate the template with the entry type, i.e.

((article . "#+TITLE: Notes on: ${author-or-editor} (${year}): ${title}\n\n")
 (inreference . "#+TITLE: Notes on ${title}\n\n")
 (default . "Whatever you like"))
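
A possible lookup for such a type-specific alist, as a sketch only. It assumes the entry is an alist that stores its type under the "=type=" key; the variable and function names are hypothetical and not part of bibtex-completion:

```elisp
(defvar my/notes-templates
  '((article . "#+TITLE: Notes on: ${author-or-editor} (${year}): ${title}\n\n")
    (inreference . "#+TITLE: Notes on ${title}\n\n")
    (default . "#+TITLE: Notes on ${title}\n\n"))
  "Hypothetical alist mapping entry types (symbols) to note templates.")

(defun my/notes-template-for (entry)
  "Return the note template matching the type of ENTRY.
ENTRY is assumed to be an alist with its entry type stored as a
string under \"=type=\".  Types are compared case-insensitively;
unknown or missing types fall back to the `default' template."
  (let ((type (cdr (assoc "=type=" entry))))
    (or (and type
             (cdr (assq (intern (downcase type)) my/notes-templates)))
        (cdr (assq 'default my/notes-templates)))))

;; Usage: a Wikipedia entry like the one above gets the short template.
(my/notes-template-for '(("=type=" . "inreference")
                         ("title" . "Entscheidungstheorie")))
;; => "#+TITLE: Notes on ${title}\n\n"
```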

tmalsburg commented 4 years ago

I see. Thanks for providing info on the context. I'll think about possible solutions.