andras-simonyi / citeproc-el

A CSL 1.0.2 Citation Processor for Emacs.
GNU General Public License v3.0

Add support for biblatex-like multi-bibs #17

Closed bdarcus closed 3 years ago

bdarcus commented 3 years ago

I don't know if you want to do this (I'm not sure how much work it is, for example), but it's something that came up in discussions of org-cite, and has come up in CSL discussions before.

From Nicolas, explaining the support for this in oc-biblatex:

Bibliography is printed using "\printbibliography" command. Additional options may be passed to it through a property list attached to the "print_bibliography" keyword. E.g.,

#+print_bibliography: :section 2 :heading subbibliography

Values including spaces must be surrounded with double quotes. If you need to use a key multiple times, you can separate its values with commas, but without any space in-between:

#+print_bibliography: :keyword abc,xyz :title "Primary Sources"
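
To make the quoting and comma rules above concrete, here is a minimal Python sketch of how such an option string could be split into keys and values. It is not Org's or oc-biblatex's actual parser; the function name `parse_print_bibliography` and the choice to return a list of values per key are assumptions made for illustration only.

```python
import shlex


def parse_print_bibliography(options):
    """Split ':key value' pairs into a dict mapping each key to a list of values.

    Hypothetical helper: double-quoted values may contain spaces and are kept
    whole; unquoted values are split on commas to allow multiple values.
    """
    props = {}
    key = None
    # posix=False keeps the surrounding quotes on quoted tokens, so quoted
    # and unquoted values can be told apart below.
    for token in shlex.split(options, posix=False):
        if token.startswith(":"):
            key = token[1:]
            props.setdefault(key, [])
        elif key is not None:
            if len(token) >= 2 and token[0] == token[-1] == '"':
                props[key].append(token[1:-1])
            else:
                props[key].extend(token.split(","))
    return props


section_opts = parse_print_bibliography(":section 2 :heading subbibliography")
filter_opts = parse_print_bibliography(':keyword abc,xyz :title "Primary Sources"')
```

With the two example lines above, `section_opts` maps `section` to `["2"]` and `heading` to `["subbibliography"]`, while `filter_opts` maps `keyword` to `["abc", "xyz"]` and `title` to `["Primary Sources"]`.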

So ideally, the same would be compatible in citeproc.

Primary use cases that I'm aware of:

  1. multiple sections in one bibliography (the example above, where you filter by keyword, type, etc.; primary sources vs everything else, legal documents vs everything else, discographies vs everything else)
  2. a bibliography per chapter, etc.

cc @denismaier

denismaier commented 3 years ago

This would be a great addition, but it may be tricky with regard to disambiguation and/or subsequent author treatment.

bdarcus commented 3 years ago

The related question I raised on list (see the "print_bibliography options" thread I link to above) is whether there's anything Nicolas can do to avoid the current problem, which is that documents including this feature will produce duplicate bibliographies with oc-csl.

Maybe citeproc-el should, if it doesn't support this feature, just have an option to not print duplicate bibliographies?

andras-simonyi commented 3 years ago

Thanks for raising this issue! It seems to me that the two use cases are very different from the point of view of citeproc-el.

The "separate bibliographies per chapter" use case, if I understand it correctly, would simply require generating separate and totally independent bibliographies for certain subsets of the references in a text. This can be done with citeproc-el as it is: just feed these subsets separately and retrieve the references and bibliographies. So I'd argue this could be implemented entirely by the application using citeproc-el, e.g., Org.

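The "multiple sections in one bibliography" use case, in contrast, would probably require some dedicated support inside citeproc-el, so it'd be nice if we could come up with a more detailed description what is needed exactly. Would it be way off to say that it requires

* providing a way of partitioning the referenced items into disjoint and ordered subsets (the simplest solution would be using the values of the CSL keyword variable), and

* generating the complete bibliography as usual, but, as an additional final step in sorting, also doing a stable sort according to which subset items belong, and

* returning the sections corresponding to the subsets separately, in the specified order.

Implementing something along these lines doesn't seem difficult.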

bdarcus commented 3 years ago

It seems to me that the two use cases are very different from the point of view of citeproc-el.

Yes.

The "separate bibliographies per chapter" use case, if I understand it correctly, would simply require generating separate and totally independent bibliographies for certain subsets of the references in a text.

Yes.

This can be done with citeproc-el as it is: just feed these subsets separately and retrieve the references and bibliographies. So I'd argue this could be implemented entirely by the application using citeproc-el, e.g., Org.

One of us (maybe better you?) should mention this on the list, along with a suggestion of what, if anything, to do with extra #+print_bibliography statements.

Edit: I don't see it as a big issue if this is not implemented initially. I just wanted to flag it as something to consider.

The "multiple sections in one bibliography" use case, in contrast, would probably require some dedicated support inside citeproc-el, so it'd be nice if we could come up with a more detailed description what is needed exactly.

@denismaier - you know much more about how this is implemented in biblatex. Thoughts?

denismaier commented 3 years ago

Would it be way off to say that it requires

* providing a way of partitioning the referenced items into disjoint and ordered subsets (the simplest solution would be using the values of the CSL keyword variable), and

* generating the complete bibliography as usual, but, as an additional final step in sorting, also doing a stable sort according to which subset items belong, and

* returning the sections corresponding to the subsets separately, in the specified order.
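
The three steps above can be sketched in a few lines of Python. This is only an illustration of the proposal, not citeproc-el code: the item dictionaries, the `section_of` callback, and the function name `sectioned_bibliography` are all hypothetical, and the input list is assumed to be already sorted by the style's own rules.

```python
from itertools import groupby


def sectioned_bibliography(sorted_items, section_order, section_of):
    """Split an already sorted bibliography into ordered sections.

    sorted_items  -- items in the order produced by the style's sort keys
    section_order -- section names in the order they should be returned
    section_of    -- function mapping an item to its section name
    """
    rank = {name: position for position, name in enumerate(section_order)}
    # Python's sort is stable, so items keep their style order within
    # each section; only the section rank is used as the key here.
    by_section = sorted(sorted_items, key=lambda item: rank[section_of(item)])
    # After the sort, items of one section are adjacent, so groupby
    # yields each section exactly once, in the requested order.
    return [(name, list(group)) for name, group in groupby(by_section, key=section_of)]


items = [
    {"id": "adams1990", "keyword": "secondary"},
    {"id": "baker1888", "keyword": "primary"},
    {"id": "clark2001", "keyword": "secondary"},
    {"id": "doe1888", "keyword": "primary"},
]
sections = sectioned_bibliography(
    items, ["primary", "secondary"], lambda item: item["keyword"]
)
section_ids = [(name, [item["id"] for item in group]) for name, group in sections]
```

Here `section_ids` comes out as `[("primary", ["baker1888", "doe1888"]), ("secondary", ["adams1990", "clark2001"])]`. In this sketch a section with no items is simply omitted, and an item whose section is not listed in `section_order` raises a `KeyError`.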

That sounds quite reasonable. Just a couple of minor remarks:

bdarcus commented 3 years ago

Rather than splitting one formatted list, then, aren't you first grouping and then sorting each group?

denismaier commented 3 years ago

Rather than splitting one formatted list, then, aren't you first grouping and then sorting each group?

Sort of. But each sub-bibliography must be in the same global context to get disambiguation right.

bdarcus commented 3 years ago

Which aspect of disambiguation? You mean 2019a and such?

So I guess as you add sections, you add another level for such disambiguation; global vs sectional?

denismaier commented 3 years ago

Yes, 2019a, 2019b, etc. Regarding levels of disambiguation: that could work, but I don't know about implementation details. You could also treat each as a global bibliography for disambiguation, then remove unwanted entries, and finally do the formatting.
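
A small Python sketch of the "disambiguate globally, then remove unwanted entries, then format" idea may help show why the order of the steps matters. It is a toy model, not how citeproc-el or biblatex work internally: the item fields, `year_suffixes`, and `render_section` are hypothetical names, and year suffixes stand in for the full CSL disambiguation rules.

```python
from collections import defaultdict
from string import ascii_lowercase


def year_suffixes(items):
    """Give 'a', 'b', ... to items that share both author and year.

    Items are assumed to be in bibliography order already; this toy
    version handles at most 26 clashing items per author and year.
    """
    groups = defaultdict(list)
    for item in items:
        groups[(item["author"], item["year"])].append(item["id"])
    suffixes = {}
    for ids in groups.values():
        if len(ids) > 1:
            for letter, item_id in zip(ascii_lowercase, ids):
                suffixes[item_id] = letter
    return suffixes


def render_section(items, keep, suffixes):
    """Format only the items accepted by keep, using precomputed suffixes."""
    return [
        f'{item["author"]} {item["year"]}{suffixes.get(item["id"], "")}'
        for item in items
        if keep(item)
    ]


items = [
    {"id": "doe-opus", "author": "Doe", "year": 2019, "keyword": "primary"},
    {"id": "doe-notes", "author": "Doe", "year": 2019, "keyword": "secondary"},
    {"id": "roe-study", "author": "Roe", "year": 2020, "keyword": "primary"},
]


def is_primary(item):
    return item["keyword"] == "primary"


# Global context: suffixes are computed over all items, then filtered.
global_view = render_section(items, is_primary, year_suffixes(items))
# Local context: suffixes are computed over the filtered subset only.
primary_only = [item for item in items if is_primary(item)]
local_view = render_section(primary_only, is_primary, year_suffixes(primary_only))
```

In the global variant the primary section lists `Doe 2019a` and `Roe 2020`, while the local variant lists `Doe 2019` and `Roe 2020`, because the second Doe item from 2019 is no longer visible when suffixes are assigned.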

denismaier commented 3 years ago

Thinking about it a bit more I have the impression that these two, apparently different, cases can be treated in a similar way:

  1. Allow multiple bibliographies
  2. provide a way to include or exclude items

So, the third requirement ("returning the sections corresponding to the subsets separately, in the specified order") is actually not needed after all. (Doing it that way may be slightly less performant than constructing one global bibliography first, then the individual sections in a second step, especially if there are many sections, but I doubt this will be relevant in practice. Don't know if there are other downsides to this approach.)

Edit: possible problem with that approach: citeproc-el currently also returns a list of citations. How is that related to multiple bibliographies?

bdarcus commented 3 years ago

What about this?

The related question I raised on list (see the "print_bibliography options" thread I link to above) is whether there's anything Nicolas can do to avoid the current problem, which is that documents including this feature will produce duplicate bibliographies with oc-csl (or any other export processor that does not have support for this feature).

denismaier commented 3 years ago

Anything new here?

andras-simonyi commented 3 years ago

Anything new here?

Currently I'm trying to stabilize the already implemented feature set, especially BibLaTeX support, but implementing some kind of query-based item retrieval mechanism is on my ToDo list, and would make it possible to implement both \nocite support and this functionality, I think.

As for solving the "double bibliography" problem, this is definitely an Org-mode issue, citeproc-el doesn't really have much to do with it.

denismaier commented 3 years ago

Currently I'm trying to stabilize the already implemented feature set, especially BibLaTeX support, but implementing some kind of query-based item retrieval mechanism is on my ToDo list, and would make it possible to implement both \nocite support and this functionality, I think.

Ok, sounds good!

andras-simonyi commented 3 years ago

BTW, I've looked into existing solutions (especially biblatex) and there are some important gotchas: I still think that the multibib scenario cannot be simply taken as rendering multiple filtered "views" of the same global bibliography independently because of the global disambiguation context: if John Doe's /Magnum Opus/ from 1888 is explicitly cited then the rendering of the corresponding bibliography item can be different depending on the presence or absence of other John Doe works from 1888 in the other rendered filtered bibliographies. (Assuming, of course, that it's not guaranteed that all items are rendered in one of the bibliographies, but AFAICS there is no such guarantee in biblatex.)

denismaier commented 3 years ago

BTW, I've looked into existing solutions (especially biblatex) and there are some important gotchas: I still think that the multibib scenario cannot be simply taken as rendering multiple filtered "views" of the same global bibliography independently because of the global disambiguation context: if John Doe's /Magnum Opus/ from 1888 is explicitly cited then the rendering of the corresponding bibliography item can be different depending on the presence or absence of other John Doe works from 1888 in the other rendered filtered bibliographies. (Assuming, of course, that it's not guaranteed that all items are rendered in one of the bibliographies, but AFAICS there is no such guarantee in biblatex.)

Yes, that's indeed the tricky part. There may be cases where disambiguation is strictly local, as with per-chapter bibliographies, while in other cases disambiguation needs to be global. I think the difference between biblatex's refsections and refsegments is particularly instructive here. And even with global disambiguation, there's still the "substitute recurring authors with three dashes" treatment to deal with.
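
One way to read the last point: even if year suffixes are computed globally, replacing a repeated author with dashes presumably has to be decided per printed section, since the entry that carried the full name may end up in a different section. A hedged Python sketch of that interaction, with hypothetical entry fields and a made-up function name `substitute_recurring_authors`:

```python
def substitute_recurring_authors(entries, substitute="---"):
    """Replace an author name with dashes when it repeats the previous entry's."""
    rendered = []
    previous = None
    for entry in entries:
        shown = substitute if entry["author"] == previous else entry["author"]
        rendered.append({**entry, "text": f'{shown}. {entry["title"]}'})
        previous = entry["author"]
    return rendered


entries = [
    {"author": "Doe, John", "title": "Magnum Opus. 1888", "section": "primary"},
    {"author": "Doe, John", "title": "Minor Work. 1888", "section": "secondary"},
]

# Substitution on the global list, then splitting: the secondary section
# starts with dashes although no full name precedes them in that section.
global_then_split = [
    entry["text"]
    for entry in substitute_recurring_authors(entries)
    if entry["section"] == "secondary"
]

# Splitting first, then substituting within the section only.
secondary = [entry for entry in entries if entry["section"] == "secondary"]
split_then_local = [entry["text"] for entry in substitute_recurring_authors(secondary)]
```

Here `global_then_split` is `["---. Minor Work. 1888"]`, whereas `split_then_local` is `["Doe, John. Minor Work. 1888"]`, which suggests that this step needs a per-section pass even when other disambiguation is global.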