jgm / pandoc-citeproc

Library and executable for using citeproc with pandoc
BSD 3-Clause "New" or "Revised" License

Parse from biblatex to CSL `references` #435

Closed: bwiernik closed this issue 4 years ago

bwiernik commented 4 years ago

I want to include a variable in a .bib file that gets mapped to the CSL variable `references` (used for the history of legal cases and also for things like retraction information). Is that possible?

jgm commented 4 years ago

Currently nothing maps to that field from bibtex. I'm open to suggestions about what would make sense.

bwiernik commented 4 years ago

For standard BibLaTeX fields, addendum is probably most appropriate.

Given the widespread use of BibLaTeX as a storage format for data being formatted with pandoc-citeproc, perhaps a reasonable approach would be to accept valid CSL variables that have no clear match in BibTeX/BibLaTeX when they are provided in the .bib, such as references, archive, archive-place, archive_collection, authority, director, etc.?

njbart commented 4 years ago

I don’t think pandoc-citeproc should accept undocumented BibLaTeX (“BL”) fields as input, nor should it introduce arbitrary novel mappings such as BL addendum to CSL references (this one in particular would also amount to a non-backward-compatible change).

Encouraging the use of unofficial and undocumented BL fields will often lead to confusion and disappointment: BL field names that are not officially documented cannot be used on the BL side anyway without customised BL styles, unless users rename or otherwise tweak non-standard fields into something the standard BL styles understand, e.g., via the \DeclareSourcemap mechanism.

If there is a feeling that additional BL fields are needed for the sole purpose of having pandoc-citeproc map these to specific CSL variables that do not have any official counterpart in BL, I think it would be best to label such BL fields explicitly, e.g. with a prefix:

E.g., what should result in a CSL references variable would have to be provided as a BL csl_references field. Other, similar variables should be prefixed with csl_, too.

This would clarify, at a glance, that (a) the BL field in question is not an official BL field (so no output should be expected with BL engines), and (b) when converted to CSL, an official CSL variable can be expected as the result of the conversion (plus, usually, sensible output with a CSL processor).
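To make the proposed convention concrete, here is a minimal sketch in Python. It is not pandoc-citeproc's code or behaviour: the function name `extract_csl_fields`, the set of accepted variables, and the sample entry (including its retraction note) are all made up for illustration. It only shows the idea that a field is passed through to CSL when it carries the explicit `csl_` prefix, while ordinary BL fields such as addendum are left to the normal mapping.

```python
# Hypothetical sketch of the csl_ prefix convention discussed above.
# Nothing here reflects pandoc-citeproc's actual implementation.
CSL_PREFIX = "csl_"

# Illustrative subset of CSL variable names mentioned in this thread.
KNOWN_CSL_VARIABLES = {
    "references",
    "archive",
    "archive-place",
    "archive_collection",
    "authority",
}


def extract_csl_fields(bib_fields):
    """Return CSL variables taken from explicitly prefixed BibLaTeX fields."""
    csl = {}
    for name, value in bib_fields.items():
        key = name.lower()  # field names are treated case-insensitively here
        if not key.startswith(CSL_PREFIX):
            continue  # ordinary BibLaTeX field: left to the normal mapping
        variable = key[len(CSL_PREFIX):]
        if variable in KNOWN_CSL_VARIABLES:
            csl[variable] = value  # unknown csl_* names are silently dropped
    return csl


# Made-up entry, already parsed into a field dictionary.
entry = {
    "title": "Some retracted article",
    "addendum": "Printed as an addendum by BibLaTeX styles",
    "csl_references": "Retracted in 2019",
    "csl_unknown": "ignored",
}
print(extract_csl_fields(entry))
```

With this sketch only `csl_references` survives, as `references`. The addendum field is untouched, so existing documents would not change behaviour.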

jgm commented 4 years ago

Agreed, if we do support anything here, it should be via explicit csl_* fields. Still, I'm a bit unclear why we should support conversions from nonstandard BibLaTeX. Why not keep your bibliography in CSL or pandoc YAML format if you need CSL-specific fields?

njbart commented 4 years ago

> Still, I'm a bit unclear why we should support conversions from nonstandard BibLaTeX. Why not keep your bibliography in CSL or pandoc YAML format if you need CSL-specific fields?

Absolutely. All I’m trying to point out is that if anything is going to be implemented here at all, certain mistakes should be avoided.

bwiernik commented 4 years ago

I agree that it would be best to store data in CSL JSON or YAML. That said, it's really common for users to default to storing reference information in .bib, even when unnecessary (because they are using pandoc-citeproc for formatting). For example, the very widely-used R packages RefManageR and citr both convert references to BibTeX, so a lot of RMarkdown users end up (unknowingly) taking a roundabout trip to BibTeX and back when working with pandoc via R.

njbart commented 4 years ago

Right, but that’s a very different question. The R crowd in particular seem to be, on average, relatively uninformed about anything but bibtex, and, at least in one case, rather ill-motivated to do anything about it. My PR for improving the inaccurate and incomplete citation section of the bookdown manual, e.g., has been left sitting idly for months.

> … the very widely-used R packages RefManageR and citr both convert references to bibtex …

Have you tried discussing this with the package authors?

crsh commented 4 years ago

With respect to citr, there are discussions in https://github.com/crsh/citr/issues/55 and https://github.com/crsh/citr/issues/59. I'll add support for JSON. I currently use BibTeX because it works with pandoc-citeproc but also with biblatex or natbib, which some users prefer. In this sense, it's a format that applies to a wider set of the use cases that I have come across.

ghost commented 4 years ago

Dear all,

I am exploring the use of pandoc to produce public documents for work. We need PDFs and ideally HTML (EPUB would be nice to have). As we cite many different legal documents, we use a footnote style with the biblatex style oscola. A reference looks like:

@legislation{gdpr,
  langid = {english},
  title = {{Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation)}},
  url = {http://data.europa.eu/eli/reg/2016/679/oj/eng},
  shorttitle = {GDPR},
  number = {2016/119},
  journaltitle = {OJ},
  urldate = {2018-01-07},
  date = {2016-05-04},
  pages = {1},
  series = {L},
  issue = 119,
  type = {regulation},
  pagination = {article},
  keywords = {eu},
}

@jurisdiction{ecj:c-311/18-agop,
  keywords     = {eu},
  title        = {{Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems}},
  date         = {2018},
  reporter     = {OJ},
  series       = {C},
  volume       = {249},
  pages        = {21},
  number       = {C-311/18},
  institution  = {ECJ},
  ecli = {EU:C:2019:1145},
  note = {Opinion of AG Saugmandsgaard Øe},
}

The challenge for me now is to produce similar (ideally identical) footnotes for HTML and EPUB. There is an oscola.csl style in CSL format, but it seems to require different field names.

The same footnote, generated with [@ecj:c-311/18-agop, paras 1 and 204], gives:

  1. in PDF with biblatex-oscola: Case C-311/18 Data Protection Commissioner v Facebook Ireland Limited, Maximillian Schrems EU:C:2019:1145, [2018] OJ C249/21 (Opinion of AG Saugmandsgaard Øe) paras 1 and 204.
  2. in HTML with oscola.csl: Data protection commissioner v facebook ireland limited, maximillian schrems (2018) 249 21 [1] and 204.

I would be interested in having more control over CSL fields from biblatex. If you know a quick workaround, let me know, too. :)

bwiernik commented 4 years ago

@rriemann-eu See here for how to enter data in CSL-JSON or CSL YAML to work with oscola.csl: https://www.zotero.org/groups/229950/oscola_samples

Given the complexities of OSCOLA (and legal citation generally), it’s not going to be possible to automate conversion from biblatex to CSL JSON in a reliable way, due to biblatex’s limited fields. I recommend storing your data in CSL JSON, following the example items in the library I linked to, and then using pandoc and oscola.csl to format your references for all output formats, even PDF.
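As a rough illustration of what storing the data in CSL JSON could look like, the Python sketch below writes out the Schrems case from the .bib entry above as one CSL JSON item. The choice of variables is an assumption made for illustration: they are standard CSL variable names, but which of them oscola.csl actually reads, and how it expects EU cases to be entered, is not confirmed here. The sample library linked above remains the reference for data entry.

```python
import json

# Hypothetical mapping of the @jurisdiction entry from this thread onto
# standard CSL variables. Whether oscola.csl uses each of these variables
# is an assumption, not something verified against the style.
item = {
    "id": "ecj:c-311/18-agop",
    "type": "legal_case",
    "title": "Data Protection Commissioner v Facebook Ireland Limited, "
             "Maximillian Schrems",
    "authority": "ECJ",
    "number": "C-311/18",
    "container-title": "OJ",
    "volume": "249",
    "page": "21",
    "issued": {"date-parts": [[2018]]},
    # The CSL variable this issue is about, filled from the BibLaTeX note.
    "references": "Opinion of AG Saugmandsgaard Øe",
}

# A CSL JSON bibliography is a JSON array of items.
csl_json = json.dumps([item], indent=2, ensure_ascii=False)
print(csl_json)
```

Saved to a file, this output could stand in for the .bib file as the bibliography, so the same data feeds every output format.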

ghost commented 4 years ago

Thanks for the info. I actually checked out this library of samples earlier. While the OSCOLA standard has a dedicated section on various EU documents (decisions, reports, court cases), there are no such examples in the library. I'll consider raising the issue with the Zotero/CSL community.

bwiernik commented 4 years ago

Sure, post on the Zotero forums and we can help figure out data entry conventions.