opensafely-core / r-docker

Docker image for running R scripts in OpenSAFELY

pandoc upgrade #51

Open wjchulme opened 2 years ago

wjchulme commented 2 years ago

I'm using R Markdown to create HTML reports. These include citations to journal articles, using a bibliography. When I render the HTML locally using my own R installation (with pandoc version 2.11.4), everything works. When I try to render through the opensafely CLI (with pandoc version 1.19.2.4), it fails with the following error message:

/usr/bin/pandoc +RTS -K512m -RTS draft-manuscript.utf8.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output /workspace/output/report/draft-manuscript.html --email-obfuscation none --self-contained --standalone --section-divs --template /usr/local/lib/R/site-library/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /tmp/RtmpSd0tDk/rmarkdown-str852cf4586.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --filter pandoc-citeproc
     pandoc: Error running filter pandoc-citeproc:
     Could not find executable 'pandoc-citeproc'.
     Error: pandoc document conversion failed with error 83
     Backtrace:
     1: stop("pandoc document conversion failed with error ", result,
     2: pandoc_convert(input_files, pandoc_to, output_format$pandoc$from,
     3: convert(output_file, run_citeproc)
     4: rmarkdown::render("analysis/report/draft-manuscript.Rmd", knit_root_dir = "/workspace",

When I remove the citations, it works again.

Here's a similar issue: https://github.com/crsh/papaja/issues/427#issuecomment-728805035. In particular:

> I'm pretty sure this is a problem caused by the fact that the new pandoc version replaces pandoc-citeproc with citeproc

I'm hoping that an update of pandoc will therefore fix the problem, and not break other things 🤞🏻
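To make the version boundary concrete, here is a minimal sketch of a check that could be run inside the container. It assumes that pandoc 2.11 is the first release with citeproc built in (consistent with the linked issue), and that `sort -V` is available, as it is in GNU coreutils. The function name `citeproc_mode` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify a pandoc version string.
# Prints "builtin" for 2.11 or later (assumed to ship citeproc built in)
# and "filter" for anything older (needs the separate pandoc-citeproc executable).
citeproc_mode() {
    version="$1"
    # sort -V orders version strings numerically by component.
    # If 2.11 sorts first (or the two are equal), the given version is >= 2.11.
    lowest=$(printf '%s\n%s\n' "2.11" "$version" | sort -V | head -n 1)
    if [ "$lowest" = "2.11" ]; then
        echo "builtin"
    else
        echo "filter"
    fi
}

# Example call against the installed pandoc (left as a comment so that the
# block has no side effects when pandoc is missing):
#   citeproc_mode "$(pandoc --version | head -n 1 | awk '{print $2}')"
```

With the two versions mentioned above, the local 2.11.4 would be classified as `builtin` and the container's 1.19.2.4 as `filter`, which matches the missing `pandoc-citeproc` executable in the error.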

remlapmot commented 2 years ago

Small footnote: the pandoc executable is bundled within your RStudio installation (it's in RStudio/bin/pandoc/) and not your R installation.

I would also recommend that this container include the same version of pandoc that RStudio includes, because the main maintainer of the knitr package is an RStudio employee and so they'll be checking against that rather than the very latest version of pandoc.

StevenMaude commented 2 years ago

This Docker image is currently using Ubuntu 18.04. Even if it were using 20.04, that release doesn't have a new enough version of pandoc either, unsurprisingly.

So I think we would need to include an appropriate version of pandoc (2.11 or later) from pandoc's releases. I also think RMarkdown will need updating.

However, the build process for this Docker image does seem a little unconventional right now.
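As a rough sketch of what installing a pinned pandoc from its releases might look like in a Debian/Ubuntu-based image: the helper names `pandoc_deb_url` and `install_pandoc` are hypothetical, and the asset naming pattern (`pandoc-<version>-1-amd64.deb`) is an assumption based on how pandoc 2.x releases have been named on GitHub, so it should be checked against the releases page for the chosen version. This is not how the image's build currently works.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL for a pandoc .deb release asset.
# The "-1-amd64.deb" suffix is an assumption about the release naming scheme.
pandoc_deb_url() {
    version="$1"
    echo "https://github.com/jgm/pandoc/releases/download/${version}/pandoc-${version}-1-amd64.deb"
}

# Hypothetical install step: download the .deb, install it, and clean up.
# Needs root, network access, curl, and dpkg; it is only defined here, not run.
install_pandoc() {
    version="$1"
    deb="/tmp/pandoc-${version}.deb"
    curl -fsSL -o "$deb" "$(pandoc_deb_url "$version")" &&
        dpkg -i "$deb" &&
        rm -f "$deb"
}
```

In a Dockerfile this could be called from a `RUN` step as `install_pandoc 2.11.4`. Pinning an explicit version rather than "latest" would also make it easier to match whatever version RStudio bundles.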

remlapmot commented 2 years ago

It could be helpful to follow what the Rocker containers do. Their pandoc installation script is here.

StevenMaude commented 2 years ago

> It could be helpful to follow what the Rocker containers do. Their pandoc installation script is here.

They're pulling the latest release from pandoc on GitHub, by default :+1: