jgm / pandoc

Universal markup converter
https://pandoc.org

Non-dropping particles in authors' names apparently not properly handled when using .bib files, but properly handled if using a .json #9827

Closed: alebg closed this issue 3 months ago

alebg commented 3 months ago

Explanation

When using a .bib bibliography file with citeproc, non-dropping particles in family names of authors seem to be treated differently than when using a csljson bibliography.

In particular, with the JSON file our CSL style correctly applies the configuration we have set up for printing and sorting author names, but with the .bib file that configuration has no effect. I noticed this because of a discrepancy between a visual CSL editor (https://editor.citationstyles.org/visualEditor/), to which I upload a CSL JSON file, and my local PDFs compiled with pandoc from the corresponding .bib file.

We have double braces around family names with particles (e.g., author = {{van Inwagen}, Peter}), but removing them doesn't help.

Minimal example

Original md file, name-particles.md:

---
title: "Test"
bibliography: name-particles.bib
---

## Inline

- parencite: [@vaninwagen:1975]

- parencite: [@lewis_dk-lewis:1970]

# References {#sec:references}

bib file, name-particles.bib:

@article{vaninwagen:1975,author = {{van Inwagen}, Peter},date = {1975},title = {{The incompatibility of Free Will and Determinism}},journal = {{Philosophical Studies}},volume = {27},number = {3},pages = {185--199},doi = {10.1007/bf01624156},kw-level1 = {determinism;},kw-level2 = {free-will;},num-sort = {147057}}
@article{lewis_dk-lewis:1970,author = {Lewis, David and Lewis, Stephanie R.},date = {1970},title = {{Holes}},journal = {{Australasian Journal of Philosophy}},volume = {48},number = {2},pages = {206--212},note = {{reprinted in \citet[3--9]{lewis_dk:1983}}},doi = {10.1080/00048407012341181},kw-level1 = {determination;},kw-level2 = {extrinsic-entities;},kw-level3 = {holes;},note-perso = {in GE},num-sort = {84069}}

If we run pandoc -C -s -o out.tex name-particles.md (or pandoc -C -s -o out.pdf name-particles.md), van Inwagen appears last in the bibliography printed at the end.

However, if now we do pandoc --from bibtex --to csljson name-particles.bib and use the resulting json...

[
  {
    "DOI": "10.1007/bf01624156",
    "author": [
      {
        "family": "Inwagen",
        "given": "Peter",
        "non-dropping-particle": "van"
      }
    ],
    "container-title": "Philosophical Studies",
    "id": "vaninwagen:1975",
    "issue": "3",
    "issued": {
      "date-parts": [
        [
          1975
        ]
      ]
    },
    "page": "185-199",
    "title": "The incompatibility of Free Will and Determinism",
    "type": "article-journal",
    "volume": "27"
  },
  {
    "DOI": "10.1080/00048407012341181",
    "author": [
      {
        "family": "Lewis",
        "given": "David"
      },
      {
        "family": "Lewis",
        "given": "Stephanie R."
      }
    ],
    "container-title": "Australasian Journal of Philosophy",
    "id": "lewis_dk-lewis:1970",
    "issue": "2",
    "issued": {
      "date-parts": [
        [
          1970
        ]
      ]
    },
    "note": "reprinted in ",
    "page": "206-212",
    "title": "Holes",
    "type": "article-journal",
    "volume": "48"
  }
]

...as the bibliography for name-particles.md and repeat pandoc -C -s -o out.tex name-particles.md (or pandoc -C -s -o out.pdf name-particles.md), we get van Inwagen sorted in the bibliography before Lewis (what we want).

As we can see, the csljson produced correctly identifies "van" as a non-dropping-particle. This seems to be lost however when using the .bib file.
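To make that check repeatable, here is a minimal Python sketch that lists, for each entry in a CSL JSON bibliography, the family name and any non-dropping-particle of its authors. The function name `particle_report` and the inline `sample` string are hypothetical, introduced only for illustration; the field names are the ones shown in the JSON above.

```python
import json


def particle_report(csl_json_text):
    """Map each item id to a list of (family, non-dropping-particle) pairs.

    The particle is None when the name has no "non-dropping-particle" field.
    """
    report = {}
    for item in json.loads(csl_json_text):
        authors = item.get("author", [])
        report[item["id"]] = [
            (name.get("family"), name.get("non-dropping-particle"))
            for name in authors
        ]
    return report


# Trimmed-down copy of the two entries from the JSON above (illustrative only).
sample = """[
  {"id": "vaninwagen:1975",
   "author": [{"family": "Inwagen", "given": "Peter",
               "non-dropping-particle": "van"}]},
  {"id": "lewis_dk-lewis:1970",
   "author": [{"family": "Lewis", "given": "David"},
              {"family": "Lewis", "given": "Stephanie R."}]}
]"""

for item_id, names in particle_report(sample).items():
    print(item_id, names)
```

In practice one would read the text from the file written by pandoc --from bibtex --to csljson rather than from an inline string.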

What is wanted

What we ultimately want is that the bibliography produced in .tex and .pdf files when compiling the original markdown (the one using the .bib file) shows van Inwagen sorted before Lewis, as it does with the CSL JSON bibliography.

We haven't been able to work out how to do this with citeproc by adjusting the CSL configuration for non-dropping particles or by adding and removing the double braces.

Running...

$ pandoc --version
pandoc 3.1.8
Features: +server +lua
Scripting engine: Lua 5.4

from within an Ubuntu 22.04.3 LTS (Jammy Jellyfish) docker container.

jgm commented 3 months ago

Transferring to pandoc, which handles conversion from bib.

jgm commented 3 months ago

Pandoc converts the .bib into a native pandoc reference list, which is used with citeproc. You can see what it converts to by doing:

 % pandoc name-particles.bib -t markdown -s
---
nocite: "[@*]"
references:
- author:
  - family: van Inwagen
    given: Peter
  container-title: Philosophical Studies
  doi: 10.1007/bf01624156
  id: "vaninwagen:1975"
  issue: 3
  issued: 1975
  page: 185-199
  title: "[The incompatibility of Free Will and Determinism]{.nocase}"
  type: article-journal
  volume: 27
---

(I removed the Lewis citation.) If it were

  - family: Inwagen
    given: Peter
    non-dropping-particle: van

then we'd be good.
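The effect on ordering can be illustrated with a small Python sketch. It assumes, as a deliberate simplification, that sorting uses only the case-folded family field and ignores a separately stored particle; real CSL sorting is more involved. The helper names `sort_key` and `sorted_families` are hypothetical.

```python
def sort_key(name):
    # Simplifying assumption: sort on the family field alone, case-insensitively.
    # A particle stored in its own field does not take part in the key.
    return name["family"].casefold()


def sorted_families(names):
    """Return the family fields in sorted order."""
    return [name["family"] for name in sorted(names, key=sort_key)]


lewis = {"family": "Lewis", "given": "David"}

# What the braced .bib entry converts to: the particle is part of the family name.
braced = [{"family": "van Inwagen", "given": "Peter"}, lewis]

# What we would like: the particle in its own field.
split = [
    {"family": "Inwagen", "given": "Peter", "non-dropping-particle": "van"},
    lewis,
]

print(sorted_families(braced))  # "van Inwagen" sorts under "v", after Lewis
print(sorted_families(split))   # "Inwagen" sorts under "i", before Lewis
```

Under this assumption the braced form puts van Inwagen last, which matches what was reported for the .bib file.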

jgm commented 3 months ago

It works if you remove the {} around van Inwagen and add options={useprefix=true} (useprefix is described in the biblatex manual; bibtex itself has no concept of non-dropping particles, I think).

@article{vaninwagen:1975,author = {van Inwagen, Peter},options={useprefix=true},date = {1975},title = {{The incompatibility of Free Will and Determinism}},journal = {{Philosophical Studies}},volume = {27},number = {3},pages = {185--199},doi = {10.1007/bf01624156},kw-level1 = {determinism;},kw-level2 = {free-will;},num-sort = {147057}}
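As a rough illustration of why the unbraced form helps, the following Python sketch splits a "prefix Last, First" name the way BibTeX-style parsing treats leading lowercase words as a prefix. It is not pandoc's actual code. The function `split_name` is hypothetical, braces are not handled at all, and mapping the prefix to dropping-particle when useprefix is off is an assumption on my part.

```python
def split_name(bib_name, useprefix=False):
    """Split a 'prefix Last, First' name into CSL-style fields (simplified)."""
    last_part, _, given = bib_name.partition(",")
    words = last_part.split()
    # Leading lowercase words form the prefix, but always keep
    # at least one word for the family name.
    i = 0
    while i < len(words) - 1 and words[i][0].islower():
        i += 1
    prefix = " ".join(words[:i])
    name = {"family": " ".join(words[i:]), "given": given.strip()}
    if prefix:
        # Assumption: useprefix=true corresponds to a non-dropping particle,
        # otherwise the prefix is treated as a dropping particle.
        key = "non-dropping-particle" if useprefix else "dropping-particle"
        name[key] = prefix
    return name


print(split_name("van Inwagen, Peter", useprefix=True))
print(split_name("Lewis, David"))
```

With the outer braces in place, a parser of this kind never sees "van" as a separate word, which is consistent with the family: van Inwagen output shown earlier.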

alebg commented 3 months ago

@jgm thanks a lot for the quick feedback!

Using

@article{vaninwagen:1975,author = {van Inwagen, Peter},options={useprefix=true},date = {1975},title = {{The incompatibility of Free Will and Determinism}},journal = {{Philosophical Studies}},volume = {27},number = {3},pages = {185--199},doi = {10.1007/bf01624156},kw-level1 = {determinism;},kw-level2 = {free-will;},num-sort = {147057}}

does improve things. Only one detail is missing. With a CSL style using demote-non-dropping-particle="display-only" and a contributors macro (printed first in the bibliography) that looks like this:

  <macro name="contributors">
    <group delimiter=". ">
      <names variable="author">
        <name and="text" name-as-sort-order="all" sort-separator=", " delimiter=", " delimiter-precedes-last="never">
          <name-part name="family" font-variant="small-caps"/>
        </name>
        <label form="short" prefix=", "/>
        <substitute>
          <names variable="editor"/>
          <names variable="translator"/>
          <names variable="director"/>
          <text macro="substitute-title"/>
          <text macro="title"/>
        </substitute>
      </names>
      <text macro="recipient"/>
    </group>
  </macro>

in the visual CSL editor we get the sorting of the bibliography correct (van Inwagen under "I"), and it gets printed in the bibliography as "van Inwagen, Peter". However, with your suggestion we get the ordering right, but the name renders as "Inwagen, Peter van". Any clue on this one? I thought all the issues would go away if we could find a way to tell pandoc which one is a non-dropping particle, but apparently there's something more going on.
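For reference, as far as I can tell the CSL 1.0.2 specification lists three values for demote-non-dropping-particle: never, sort-only, and display-and-sort (the default). The "display-only" value quoted above is not one I can find there. The Python sketch below models only the display side of those three values for inverted names; it is a simplified reading of the specification, not citeproc's implementation, and `render_inverted` is a hypothetical helper.

```python
def render_inverted(name, demote="display-and-sort"):
    """Render a name in 'Family, Given' order under a demote setting (simplified)."""
    family = name["family"]
    given = name.get("given", "")
    particle = name.get("non-dropping-particle")
    if not particle:
        return f"{family}, {given}"
    if demote in ("never", "sort-only"):
        # The particle stays in front of the family name when displayed.
        return f"{particle} {family}, {given}"
    if demote == "display-and-sort":
        # The particle is demoted: it is displayed after the given name.
        return f"{family}, {given} {particle}"
    raise ValueError(f"unknown demote-non-dropping-particle value: {demote}")


van_inwagen = {
    "family": "Inwagen",
    "given": "Peter",
    "non-dropping-particle": "van",
}

print(render_inverted(van_inwagen, "sort-only"))
print(render_inverted(van_inwagen, "display-and-sort"))
```

On this reading, "van Inwagen, Peter" sorted under "I" corresponds to sort-only, while the "Inwagen, Peter van" rendering corresponds to the display-and-sort default. Whether that explains the difference between pandoc and the visual editor is only a guess.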

jgm commented 3 months ago

However, with your suggestion we get the ordering right, but the name renders as "Inwagen, Peter van".

Does this also happen with the CSLJSON bibliography with non-dropping-particle? If so, then it's a citeproc issue and not an issue of bibtex conversion.

alebg commented 3 months ago

@jgm I just checked, and yes, it also happens with the .json bibliography (ordering correct, but the name renders as "Inwagen, Peter van"). It is exactly the same CSL JSON that I uploaded to the visual CSL editor, and both my local pandoc and the visual editor use the same CSL style.

alebg commented 3 months ago

@jgm should I open another issue at citeproc?

jgm commented 3 months ago

Yes, let's close this one and open a new one at citeproc, with the CSL JSON bibliography and actual and expected output.