rstudio / rmarkdown

Dynamic Documents for R
https://rmarkdown.rstudio.com
GNU General Public License v3.0

PDF rendering failure due to absolute short path name on Windows for figures #2162

Open simonschoe opened 3 years ago

simonschoe commented 3 years ago

Hi there, I encountered the abovementioned issue when running the following code:

---
title: Title Page
author: Author - Organization

output:
  pdf_document:
    latex_engine: pdflatex
    keep_md: yes

knit: source(here::here("src", "_render22.R"))$value
---

```{r setup, include=F, eval=T, warning=F}
# set global chunk options
knitr::opts_chunk$set(
  comment = ">",
  collapse = F,
  echo = F,

  fig.width = 6, fig.asp = 0.618, # golden ratio
  #out.width = "70%",
  #fig.align = "center",
  dev = "cairo_pdf"
)
```

```
par(mar = c(4, 4, .2, .1))
plot(cars)
par(mar = c(4, 4, .2, .1))
plot(cars)
```

And the contents of `_render22.R`:

```
function(report, ...) {
  rmarkdown::render(
    input = report,
    output_dir = "outputs",
    output_file = "test.pdf"
  )
}
```

And finally the pandoc error message:

```
! Undefined control sequence.

C:PATH/figure-latex/chapter_4_plot4-1} \caption{cap}\label{fig:c...
Fehler: LaTeX failed to compile C:PATH/outputs/test.tex. See https://yihui.org/tinytex/r/#debugging for debugging tips. See test.log for more info.
```

Strangely, the error does not occur if I use the `knit` button instead of `render()` and it does not occur if I refrain from using `out.width` in the plot's chunk options. Any idea what might be the cause and remedy?

Output from `xfun::session_info('rmarkdown')`:

```
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043), RStudio 1.4.1103

Locale:
  LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
  LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
  LC_TIME=German_Germany.1252

Package version:
  base64enc_0.1.3   digest_0.6.27     evaluate_0.14     glue_1.4.2
  graphics_4.0.2    grDevices_4.0.2   highr_0.9         htmltools_0.5.1.1
  jsonlite_1.7.2    knitr_1.33        magrittr_2.0.1    markdown_1.1
  methods_4.0.2     mime_0.10         rlang_0.4.11      rmarkdown_2.8
  stats_4.0.2       stringi_1.6.2     stringr_1.4.0     tinytex_0.32
  tools_4.0.2       utils_4.0.2       xfun_0.23         yaml_2.2.1

Pandoc version: 2.11.2
```

cderv commented 3 years ago

Unfortunately, I can't reproduce your issue using the document you provided. I am also on Windows, but using R 4.1.0. I'm not sure that changes anything.

Can you try simplifying it and see if it works?

For example:

---
title: Title Page
author: Author - Organization
output:
  pdf_document:
    latex_engine: pdflatex
    keep_md: yes
    keep_tex: yes
---

```{r chapter_4_plot3, fig.cap="cap", echo=F}
par(mar = c(4, 4, .2, .1))
plot(cars)
par(mar = c(4, 4, .2, .1))
plot(cars)
```


Does the above work?

Also, can you add `keep_tex: yes` to make sure the .tex file is kept? You can then open it and look for the character that causes the failure. This may be related to something in the path to your file (is `C:PATH/figure-latex/chapter_4_plot4-1` a real path?)
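
For that inspection step, a small base R sketch like the one below could list every `\includegraphics` path in the kept .tex file and flag the ones that contain backslashes or spaces. The helper name `list_graphics_paths()` and the sample lines are made up for illustration, and the regular expression assumes that neither the optional arguments nor the path contain nested braces. Raw strings (`r"(...)"`) need R 4.0 or later.

```r
# Hypothetical helper (not part of knitr or rmarkdown): extract the file
# argument of every \includegraphics call and flag suspicious characters.
list_graphics_paths <- function(tex_lines) {
  # \includegraphics, an optional [...] argument, then {path}
  pattern <- "\\\\includegraphics(?:\\[[^]]*\\])?\\{([^}]*)\\}"
  hits <- unlist(regmatches(tex_lines, gregexpr(pattern, tex_lines, perl = TRUE)))
  paths <- sub(pattern, "\\1", hits, perl = TRUE)
  data.frame(
    path = paths,
    has_backslash = grepl("\\", paths, fixed = TRUE),
    has_space = grepl(" ", paths, fixed = TRUE),
    stringsAsFactors = FALSE
  )
}

# Made-up sample lines; with a real document one would instead call, e.g.,
# list_graphics_paths(readLines("outputs/test.tex"))
sample_lines <- c(
  r"(\includegraphics{figs/plot-1.pdf})",
  r"(\includegraphics[width=0.7\linewidth]{C:\demo dir\figs\plot-2})"
)
print(list_graphics_paths(sample_lines))
```

Any row with `has_backslash` or `has_space` set to `TRUE` is a candidate for the character LaTeX trips over.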

Thanks
simonschoe commented 3 years ago

@cderv Thank you so much for the help. I think I figured it out. Apparently there was a white space somewhere in PATH (so no it was not a real path, I just wanted to abbreviate the example), could that have been the reason?

However, what I don't quite get: Why does it work seamlessly when hitting knit, but not when using the external render.R script? I would have assumed that both approaches dislike the white space character?! Is there a robust solution to this so that wherever my project lives on my machine it runs smoothly (irrespective of the path to the project)?

cderv commented 3 years ago

> Apparently there was a white space somewhere in PATH (so no it was not a real path, I just wanted to abbreviate the example), could that have been the reason?

Yes, whitespace in the file path could be the issue.

> However, what I don't quite get: Why does it work seamlessly when hitting knit, but not when using the external render.R script?

I don't know either. Usually, figure paths are relative to the document, but in your example error it seems to be an absolute path. You may have encountered this issue: #2024

To really understand the difference, it would be great to have an example that reproduces the issue. For some hints, you can also compare the command that is run when knitting with the button and the command you run yourself.

I would try not setting `output_dir` and see if this makes any difference. You can still move the resulting PDF file yourself. If this is related to #2024, this could help.

simonschoe commented 3 years ago

Hi, sorry for the delay, I attached a small reprex including the folder structure that I am currently using. In particular, the folder on the highest level contains a whitespace in its name which triggers the error. Alternatively, commenting out output_dir as well as output_file also does the trick (but none of these alone).

What I find interesting when looking at the generated .tex file is the difference between the path to the plot that has `out.width` specified in the chunk header and the plot that doesn't (the second is missing the file extension):

\begin{figure}
\centering
\includegraphics{C:/Users/.../test test/test-test/outputs/de_report_template_files/figure-latex/chapter_4_plot3-1.pdf}
\caption{cap}
\end{figure}

\begin{figure}
\includegraphics[width=0.7\linewidth]{C:\Users\...\test test\test-test\outputs\de_report_template_files/figure-latex/chapter_4_plot4-1} \caption{cap}\label{fig:chapter_4_plot4}
\end{figure}

Is this information sufficient? Thanks in advance!

Reprex.zip

cderv commented 3 years ago

Hi @simonschoe,

That helps a lot! Thank you very much.

I think this comes from the way knitr handles the figure environment in special cases, combined with the file path on Windows not being properly escaped, or having backslashes instead of forward slashes. It is a combination of several things being activated that leads to this unexpected behavior.

Let me explain:

Because of the above, you see differences in the file path. The main difference is not the extension but the use of / versus \. I believe the error is thrown because \ is used: if I replace it with / in the .tex file and compile to PDF, it works.
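
The manual replacement described above could be scripted roughly as follows. This is only a sketch: `fix_graphics_paths()` is a hypothetical helper, not something knitr or rmarkdown provides, and the sample path is invented. It rewrites backslashes to forward slashes only inside the braces of `\includegraphics{...}`, so macros such as `\linewidth` or `\caption` are left alone, and it assumes the optional argument and the path contain no nested braces.

```r
# Hypothetical helper: convert "\" to "/" inside \includegraphics{...} paths.
fix_graphics_paths <- function(tex_lines) {
  # Group 1 captures everything from \includegraphics up to (but excluding)
  # the first backslash inside the braces; that backslash becomes "/".
  pattern <- "(\\\\includegraphics(?:\\[[^]]*\\])?\\{[^}\\\\]*)\\\\"
  repeat {
    updated <- gsub(pattern, "\\1/", tex_lines, perl = TRUE)
    # Each pass removes one backslash per path; stop when nothing changes.
    if (identical(updated, tex_lines)) break
    tex_lines <- updated
  }
  tex_lines
}

# Made-up line modelled on the second figure in the generated .tex file
broken <- r"(\includegraphics[width=0.7\linewidth]{C:\demo\figure-latex\plot4-1} \caption{cap})"
cat(fix_graphics_paths(broken), "\n")
```

The patched lines could then be written back with `writeLines()` and the .tex file compiled again, for example with `tinytex::pdflatex()`, to check whether the backslashes were the only problem.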

This is also combined with another issue:

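I need to look deeper to see if we can safely fix this or not. Let's just be reminded that:

* Spaces in file paths are really not a good idea. They are always a source of pain, especially on Windows.
* When possible, it is better to move files around and not use `output_dir` and/or `output_file`. Path handling with all the external resources is tricky, and especially painful when combined with other edge cases (like spaces or special characters in the path).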

I'll look into this! Thanks for the report and the reproducible example, that helps a looooooottttt!! 😉

cderv commented 3 years ago

This is definitely related to #2024.

The backslashes are added by rmarkdown in `pandoc_path_arg()` to protect the path so that Pandoc knows how to handle it. This is where the short path name transformation happens, which introduces the backslashes.

It seems that LaTeX does not like this.

We need to see why we do this transformation for Pandoc, and how we can make both Pandoc and LaTeX happy in this case (if possible).

The current workaround is to not use `output_dir`: use the fs package or standard file manipulation functions from base R to move the file around before and after rendering.
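
As a rough illustration of that workaround, the sketch below renders next to the source document and only afterwards moves the result into the output folder with base R. The name `render_then_move()` and its `render_fun` argument are hypothetical; the only rmarkdown behaviour it relies on is that `rmarkdown::render()` returns the path of the output file.

```r
# Hypothetical wrapper: render without output_dir, then move the result.
# render_fun defaults to rmarkdown::render and is only evaluated when called,
# so a stand-in function can be supplied for testing.
render_then_move <- function(input, output_dir, render_fun = rmarkdown::render) {
  rendered <- render_fun(input = input)  # output lands next to the input
  dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
  target <- file.path(output_dir, basename(rendered))
  # Copy and delete instead of file.rename(), which can fail across drives
  if (!file.copy(rendered, target, overwrite = TRUE)) {
    stop("Could not copy ", rendered, " to ", target)
  }
  file.remove(rendered)
  invisible(target)
}

# Possible call (not run here):
# render_then_move("report.Rmd", "outputs")
```

Because LaTeX only ever sees the figure paths relative to the document, the location of the final output folder no longer matters for compilation.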

Hopefully we'll be able to fix this for the next version.

simonschoe commented 3 years ago

Hey @cderv thank you so much for the context!

> I need to look deeper to see if we can safely fix this or not. Let's just be reminded that:
>
> * Spaces in file paths are really not a good idea. They are always a source of pain, especially on Windows.
> * When possible, it is better to move files around and not use `output_dir` and/or `output_file`. Path handling with all the external resources is tricky, and especially painful when combined with other edge cases (like spaces or special characters in the path).

The reason I was using `output_dir` was to separate the output's location from the script's location, so that the user of my .Rmd file would only have to execute it (via knit) and then look for the output in a specific output folder. For the same reason, I was experimenting with unusual folder names (i.e. ones including white spaces), as the user of my file may store it anywhere on the local machine without necessarily being aware of best-practice naming conventions. Being able to use white spaces in directory names together with a separate output folder would therefore make the whole workflow slightly more robust for me and for the user. This is just to give you some context on why this specific setup was relevant for me in the first place 😉