rstudio / rmarkdown

Dynamic Documents for R
https://rmarkdown.rstudio.com
GNU General Public License v3.0
2.86k stars 969 forks source link

Pandoc args conditional to version should be set in pre_processor #2513

Open cderv opened 1 year ago

cderv commented 1 year ago

What we describe in https://bookdown.org/yihui/rmarkdown-cookbook/install-pandoc.html to run rmarkdown::find_pandoc() in setup chunk does not work in some case.

My pandoc version is

rmarkdown::pandoc_version()
# [1] ‘3.1.7’

But now I am trying to use an old one.

---
title: test
output: 
  html_document:
    self_contained: true
---

```{r}
rmarkdown::find_pandoc(cache = FALSE, dir = dirname(pandoc::pandoc_bin("2.7.3")))

This will throw an error 

==> rmarkdown::render('C:/Users/chris/Documents/DEV_R/rmarkdown/test.Rmd', encoding = 'UTF-8'); ++ Activating rlang global_entrace

processing file: test.Rmd

"C:/Users/chris/AppData/Local/r-pandoc/r-pandoc/2.7.3/pandoc" +RTS -K512m -RTS test.knit.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output test.html --lua-filter "C:\Users\chris\AppData\Local\R\win-library\4.3\rmarkdown\rmarkdown\lua\pagebreak.lua" --lua-filter "C:\Users\chris\AppData\Local\R\win-library\4.3\rmarkdown\rmarkdown\lua\latex-div.lua" --embed-resources --standalone --variable bs3=TRUE --section-divs --template "C:\Users\chris\AppData\Local\R\win-library\4.3\rmarkdown\rmd\h\default.html" --no-highlight --variable highlightjs=1 --variable theme=bootstrap --mathjax --variable "mathjax-url=https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" --include-in-header "C:\Users\chris\AppData\Local\Temp\RtmpKmpkq0\rmarkdown-str7ab8779d3ac2.html" output file: test.knit.md

Unknown option --embed-resources. Try pandoc.exe --help for more information. Error: ! pandoc document conversion failed with error 2 Backtrace: ▆

  1. └─rmarkdown::render(...)
  2. └─rmarkdown (local) convert(output_file, run_citeproc)
  3. └─rmarkdown (local) convert_it(output)
  4. └─rmarkdown (local) convert_fun(...)
  5. └─rmarkdown:::stop2(...) Exécution arrêtée

As we can see above Pandoc 2.7.3 is correctly use but --embed-resources flag is still used. Though we have https://github.com/rstudio/rmarkdown/blob/8d2d9b844406cbefed31835815964c542f7ff1aa/R/pandoc.R#L846-L849

but this is evaluated when format function are evaluated https://github.com/rstudio/rmarkdown/blob/8d2d9b844406cbefed31835815964c542f7ff1aa/R/html_document_base.R#L42-L48

https://github.com/rstudio/rmarkdown/blob/8d2d9b844406cbefed31835815964c542f7ff1aa/R/render.R#L474-L484

So it is too soon for the change of pandoc version to happen in the setup chunk. That is why rmarkdown::pandoc_available("2.19") returns TRUE (on a system where 3.1.2 is available)

So we should

The second case would be breaking change. The first seems fine to me.

Weirdly, no one reported. So should not be that often. But I hit that when debugging pagedown and this when changing pandoc version is useful.

cderv commented 1 year ago

Doing this change could be breaking, if any other package is looking into pandoc_args value of the output format.

rmarkdown::html_document()$pandoc$args
#>  [1] "--embed-resources"                                                                     
#>  [2] "--standalone"                                                                          
#>  [3] "--variable"                                                                            
#>  [4] "bs3=TRUE"                                                                              
#>  [5] "--section-divs"                                                                        
#>  [6] "--template"                                                                            
#>  [7] "C:\\Users\\chris\\AppData\\Local\\R\\win-library\\4.3\\rmarkdown\\rmd\\h\\default.html"
#>  [8] "--no-highlight"                                                                        
#>  [9] "--variable"                                                                            
#> [10] "highlightjs=1"

If we move some of the args into pre_processor then this would not be here, because pre_processor is modifying args later.

It seems that it is common practice to do this: https://github.com/search?q=%22%24pandoc%24args%22+language%3AR&type=code&l=R

Even distill does it https://github.com/rstudio/distill/blob/ac5e3bf1dca2054a5bf61cfe81b59d7bdb5e3705/R/distill_website.R#L183-L204

But can't find on github any other occurrence of this on Github search.

A different approach could to revisit this: https://bookdown.org/yihui/rmarkdown-cookbook/install-pandoc.html

The problem is when changing the pandoc version from within knitr chunk itself. In a way, this is a bit too late to do that here. Doing it in R console, before rendering works as expected.

pandoc::with_pandoc_version("2.7.3", rmarkdown::render("test.Rmd"))

From within the document, following the logic of changing pandoc version before rmarkdown happen, we could also use custom knit function

---
title: test
output: 
  html_document:
    self_contained: true
knit: ( \(input, ...) pandoc::with_pandoc_version("2.7.3", rmarkdown::render(input)) )
---

# Header

I could even provide this type of function in the Pandoc package to be used that way.

knit: pandoc::render_with_pandoc_version

So we could also either document it differently, for one of those solution.

or propose a new mechanism to allow changing the pandoc version used for rmarkdown, like a specific YAML option (and Suggest usage of pandoc package behind the scene).

I believe we could also detect any change of Pandoc version during rmarkdown::render() to error or warn that this is not advice

Anyhow, there are several ways it seems, and not sure that moving every conditional args definition is possible - it could be too late for doing that in rmarkdown

yihui commented 1 year ago

I guess it should not be common to change the Pandoc version dynamically inside Rmd. Supporting this is indeed tricky. If the goal is to be able to test different versions of Pandoc, it seems pandoc::with_pandoc_version() has done a perfect job. I'm not sure if it's worth the effort to support the Knit button working with a particular version of Pandoc. If that really needs to be done, I tend to use a top-level YAML field like pandoc_version, and let render() read it. That might be a little easier for users than providing a custom knit field, but the latter is totally fine. As I said, this use case may not be common, so we don't really have to consider which implementation is easier for users.

atusy commented 1 year ago

I agree with yihui.

cderv commented 1 year ago

Thanks for you feedback. This confirms my intuitions. I'll handle that with a new function in pandoc package (https://github.com/cderv/pandoc/issues/37)

believe we could also detect any change of Pandoc version during rmarkdown::render() to error or warn that this is not advice

And in rmarkdown I will do that so that users are aware this is not adviced to change Pandoc version from a code chunk, maybe using xfun::do_once() to only warn once per session when this is detected.

lfmcmillan commented 6 months ago

Please would you update the Rmarkdown bookdown site (https://bookdown.org/yihui/rmarkdown-cookbook/install-pandoc.html) to reflect the latest advice on this. Just now on a work machine I tried setting the pandoc version and directory to use a different version of pandoc and found that it wasn't enough to fix the error; I had to also set RSTUDIO_PANDOC to the other version to get the .Rmd file to knit correctly. That is, when I ran find_pandoc from the console pointing at the other directory that worked, but find_pandoc inside the directory still kept failing even when I had the code chunk telling it to point elsewhere. This was my failing .Rmd that knitted after I changed RSTUDIO_PANDOC:

title: "Rmarkdown demo"
author: "Louise"
date: '2024-02-29'
output: html_document
---

```{r, setup, include=FALSE}
rmarkdown::find_pandoc(version="3.1.12.1",dir="C:/Users/mcmilllo/AppData/Local/r-pandoc/r-pandoc/3.1.12.1/")

A````

cderv commented 6 months ago

Thanks for the reminder @lfmcmillan - I added an update