quarto-dev / quarto-cli

Open-source scientific and technical publishing system built on Pandoc.
https://quarto.org

Cite Method: biblatex cannot handle underscores in filename of .bib file #10027

Open Mavoort opened 5 months ago

Mavoort commented 5 months ago

Bug description

If you write an article that contains citations and use the biblatex engine instead of the default one (citeproc), Quarto cannot handle underscores in the filename of the .bib file.

Steps to reproduce

Content of document.qmd:

---
format:
  pdf:
    cite-method: biblatex
    keep-latex: true
bibliography: quarto_references.bib
---

Quarto [@Allaire_Quarto_2024] is really awesome!

Content of quarto_references.bib:

@software{Allaire_Quarto_2024,
        author = {Allaire, J.J. and Teague, Charles and Scheidegger, Carlos and Xie, Yihui and Dervieux, Christophe},
        doi = {10.5281/zenodo.5960048},
        month = feb,
        title = {{Quarto}},
        url = {https://github.com/quarto-dev/quarto-cli},
        version = {1.4},
        year = {2024}
}

Expected behavior

Quarto should render the document to a PDF file and use biber to create correct citations:

[Screenshot: 2024-06-15-173450_1501x356_scrot]

Actual behavior

Quarto apparently runs biber on quarto\_references.bib (notice the backslash), which doesn't exist. This causes biber to fail with an error message:

generating bibliography
  INFO - This is Biber 2.19
  INFO - Logfile is 'document.blg'
  INFO - Reading 'document.bcf'
  INFO - Found 1 citekeys in bib section 0
  INFO - Processing section 0
  INFO - Looking for bibtex file 'quarto\_references.bib' for section 0
  ERROR - Cannot find 'quarto\_references.bib'!
  INFO - ERRORS: 1

Your environment

Quarto check output

Quarto 1.4.554
[✓] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.11: OK
      Dart Sass version 1.69.5: OK
      Deno version 1.37.2: OK
[✓] Checking versions of quarto dependencies......OK
[✓] Checking Quarto installation......OK
      Version: 1.4.554
      Path: /opt/quarto/bin

[✓] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)

[✓] Checking LaTeX....................OK
      Using: Installation From Path
      Path: /usr/bin
      Version: 2023

[✓] Checking basic markdown render....OK

[✓] Checking Python 3 installation....OK
      Version: 3.11.6
      Path: /usr/bin/python3
      Jupyter: 5.3.1
      Kernels: julia-1.10, python3, sagemath

[✓] Checking Jupyter engine render....OK

[✓] Checking R installation...........OK
      Version: 4.3.1
      Path: /usr/lib/R
      LibPaths:
        - /home/marcel/R/x86_64-pc-linux-gnu-library/4.3
        - /usr/local/lib/R/site-library
        - /usr/lib/R/site-library
        - /usr/lib/R/library
      knitr: 1.43
      rmarkdown: 2.23

[✓] Checking Knitr engine render......OK
Mavoort commented 5 months ago

If you look at the generated LaTeX-file document.tex, the problem is in line 127:

126 \usepackage[]{biblatex}
127 \addbibresource{quarto\_references.bib}

If you remove the backslash there and then run xelatex+biber+xelatex again

xelatex --interaction=nonstopmode document.tex
biber document
xelatex --interaction=nonstopmode document.tex

everything works fine.

mcanouil commented 5 months ago

That's a LaTeX limitation, I'm afraid. LaTeX "does not like" underscores and needs them escaped, which in some situations leads to crashes or weird side effects. I'm not sure there is anything that can be done here.

Note that keep-latex does not exist, the correct option is keep-tex.

Mavoort commented 5 months ago

Hm, the option keep-latex works fine for me.

Thanks for your answer. However, I still think this has nothing to do with LaTeX.

LaTeX "doesn't like" underscores because it uses them to typeset subscripts: H_2 O --> H₂O. That's why Quarto has to escape them in normal text, meaning it converts an underscore (_) in the .qmd file to an escaped underscore (\_) in the generated LaTeX file.

But there are exceptions to this, which are filenames. For example, Quarto converts

![description](image_with_underscore.png)

to

\includegraphics{image_with_underscore.png}

in the generated .tex file (notice: no backslashes). Otherwise, it would not be possible to include images with an underscore in the filename. The same is true for hyperlinks with \href.

This means Quarto must have an internal filter somewhere: Escape all underscores with a backslash, except in filenames of hyperlinks and images.

What I am suggesting is to add BibTeX files to that exception. That way, everything would work. You can verify this as follows:

  1. run quarto render document.qmd --> generates a LaTeX-file, document.tex
  2. edit document.tex manually to remove the backslash at
    \addbibresource{quarto\_references.bib}
  3. run XeLaTeX: xelatex --interaction=nonstopmode document.tex
  4. run biber: biber document
  5. run XeLaTeX again

This works without problems, generating a pdf-file with the desired citations. There are no error messages in XeLaTeX or biber. For this reason I think the problem lies with Quarto and not with LaTeX.
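The manual steps above can be sketched as a small shell script. This is only an illustration under assumptions: `fix_addbibresource` and `rebuild_pdf` are hypothetical helper names (not part of Quarto or Pandoc), the `.tex` file is assumed to have been kept by a previous `quarto render`, and `xelatex` and `biber` are assumed to be on the PATH.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers for this thread.

# Print the given .tex file with "\_" turned back into "_",
# but only on lines that start with \addbibresource{.
fix_addbibresource() {
  sed '/^\\addbibresource{/ s/\\_/_/g' "$1"
}

# Patch the file in place, then run the xelatex + biber + xelatex
# sequence described in the steps above.
rebuild_pdf() {
  tex="$1"
  base="${tex%.tex}"
  fix_addbibresource "$tex" > "$tex.tmp" && mv "$tex.tmp" "$tex"
  xelatex --interaction=nonstopmode "$tex"
  biber "$base"
  xelatex --interaction=nonstopmode "$tex"
}
```

After `quarto render document.qmd`, calling `rebuild_pdf document.tex` would repeat steps 2 to 5. Escaped underscores elsewhere in the document (for example in normal text) are left untouched.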

mcanouil commented 5 months ago

> Hm, the option keep-latex works fine for me.

Try with a document that does not crash. It can't work because that option does not exist in the codebase.


The issue is Pandoc. Quarto does not handle citations, Pandoc does.

quarto pandoc -s --bibliography=refer_ences.bib --biblatex -o index.tex index.md

Will lead to \addbibresource{refer\_ences.bib}.

It's a problem with LaTeX not handling _ everywhere the same way. In some places, they need to be escaped and in others, they don't. Here Pandoc escaped it while it should not.
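To check whether a generated LaTeX file is affected, one could search it for an escaped underscore inside `\addbibresource`. The following is a minimal sketch; `has_escaped_bibresource` is a hypothetical helper name, and `index.tex` is assumed to come from the Pandoc command quoted above.

```shell
#!/bin/sh
# Sketch only: has_escaped_bibresource is a hypothetical helper name.
# Succeeds (exit status 0) if some \addbibresource{...} line in the
# given file contains an escaped underscore "\_".
has_escaped_bibresource() {
  grep -q '^\\addbibresource{.*\\_' "$1"
}

# Possible usage, assuming index.tex was produced by the command above:
#   has_escaped_bibresource index.tex && echo "affected"
```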

This needs to be reported upstream (https://github.com/jgm/pandoc).

FYI, I did not find an existing report about the bibliography file, only one about another part:

Mavoort commented 5 months ago

I think you're right. I did not consider that the problem might be with Pandoc. I'll open an issue there.

cderv commented 5 months ago

Related discussion

and historically Pandoc has this issue reported, but it seems it was not a Pandoc issue directly 🤔

It is possible they missed something and that \addbibresource does not allow an escaped underscore, which is what Pandoc's templating system adds.

njbart commented 5 months ago

bibliography: '`quarto_references.bib`{=latex}' (i.e., specifying that the string is raw latex, based on a suggestion made in https://github.com/jgm/pandoc/issues/9262#issuecomment-1859241645) seems to work for the example given in the OP.
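Applied to the document from the original post, the suggested workaround might look like the sketch below, which writes the `.qmd` file from the shell. It is an illustration rather than a verified fix: it assumes `quarto_references.bib` sits next to the document, and it uses `keep-tex` instead of `keep-latex`, following the note earlier in this thread. Note that it overwrites any existing `document.qmd` in the current directory.

```shell
#!/bin/sh
# Sketch only: writes the OP's document with the bibliography path
# marked as raw LaTeX so that the underscore is not escaped.
cat > document.qmd <<'EOF'
---
format:
  pdf:
    cite-method: biblatex
    keep-tex: true
bibliography: '`quarto_references.bib`{=latex}'
---

Quarto [@Allaire_Quarto_2024] is really awesome!
EOF
```

Running `quarto render document.qmd` afterwards and inspecting the kept `document.tex` should show whether `\addbibresource` now contains the unescaped filename.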

cderv commented 5 months ago

That is a good solution!

So I would say solutions should be

Thanks