rstudio / rmarkdown

Dynamic Documents for R
https://rmarkdown.rstudio.com
GNU General Public License v3.0

pandoc: openFile: does not exist (No such file or directory) #1268

Open harryprince opened 6 years ago

harryprince commented 6 years ago

I am trying to serve an Rmd document using Shiny Server. However, it seems rmarkdown or pandoc is buggy. It works well in RStudio but also fails to deploy on RStudio Connect.

yaml config:

output:
  flexdashboard::flex_dashboard:
    social: menu
    source: embed
    storyboard: yes

Rmarkdown version:

 packageVersion("rmarkdown")
[1] ‘1.8’
shiny-server --version
Shiny Server v1.5.3.838
Node.js v6.10.0

pandoc version

/opt/shiny-server/ext/pandoc/pandoc --version
pandoc 1.12.3
Compiled with texmath 0.6.6, highlighting-kate 0.5.6.

sessionInfo():

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

here is the shiny server log:

processing file: index.Rmd
output file: /tmp/rmarkdown/e01d3e5bbe48ef364e37eb0580046563/index.knit.md

pandoc: /tmp/RtmpWH829n/rmarkdown-str67a94810a85d.html: openFile: does not exist (No such file or directory)
Warning: Error in : pandoc document conversion failed with error 1
Stack trace (innermost first):
    115: pandoc_convert
    114: convert
    113: <Anonymous>
    112: do.call
    111: contextFunc
    110: .getReactiveEnvironment()$runWith
    109: shiny::maskReactiveContext
    108: <reactive>
     95: doc
     94: shiny::renderUI
     93: func
     92: force
     91: withVisible
     90: withCallingHandlers
     89: globals$domain$wrapSync
     88: promises::with_promise_domain
     87: captureStackTraces
     83: tryCatch
     82: do
     81: hybrid_chain
     80: origRenderFunc
     79: output$__reactivedoc__
      3: <Anonymous>
      2: do.call
      1: rmarkdown::run

cderv commented 3 years ago

@jmarshallnz @kiwifb Thanks a lot, that is very helpful! With your comments, I believe we have something to reproduce. It requires setting up an environment with a UNC path, so it won't be straightforward for me.

Also, we still don't know for sure if this is Pandoc or rmarkdown. It is probably both.

@jabranham the commit you linked to (jgm/pandoc@26ed7fb) has been in released versions since Pandoc 2.11. When you say

2.14.1 has fixed the ".Rmd file located in the user roaming profile (i.e. UNC path) issue"

do you have another fix in Pandoc in mind? FWIW, the next RStudio IDE version should ship with Pandoc 2.14.1.

@kiwifb The error you get is clearly not from Pandoc, and it is a different one. If you can reproduce it, can you also share the traceback() after the error? What abs_path() does is call file.exists(). With rmarkdown, everything happens by default relative to the input path, including intermediates. I believe this is mainly because those intermediates need to be found by Pandoc, and otherwise we would need to deal with absolute paths everywhere, including in the generated md file. Note that some resources, like special HTML headers to include, are created in a tmp folder.

We'll try to make quicker progress with your information, thanks for sharing! We definitely need to reproduce this in order to dig deeper into the issue with file management and computation on network drives.

Thanks again

jmarshallnz commented 3 years ago

@cderv Yes, you're right about that commit being in 2.11. Strange. There definitely is a difference in behaviour between pandoc 2.11 and 2.14.1 with RStudio 1.4. With v2.11 I can reproduce the fault every time: when the .Rmd is stored on a UNC path, it fails to build. With 2.14.1 this is gone, except for the case where rmarkdown is also installed on a UNC path.

I can't really see anything obvious in pandoc other than that commit, so not sure what fixed (part of) the problem.

cderv commented 3 years ago

Then it is possible that something in the RStudio IDE helped fix this issue. I believe there was also a non-Pandoc issue with the RStudio IDE and UNC paths.

From what you say, the next version of the RStudio IDE, which ships with Pandoc 2.14.1, should behave better.

We would then need to look more closely at the issue with the rmarkdown package being installed on a network drive. For this, I am thinking we may need to copy the resources included in the package to a local location so that Pandoc does not need to access them through the network drive (if that is the issue). Currently, some resources are passed to Pandoc using absolute paths into the rmarkdown installation folder.
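
A minimal sketch of the copy-locally idea described above, assuming a POSIX shell. The directory layout and file contents are placeholders made up for illustration, not rmarkdown's actual implementation; only the `--lua-filter` flag is taken from the Pandoc command line quoted elsewhere in this thread.

```shell
#!/bin/sh
# Stand-in for a package installed on a network drive (hypothetical layout).
pkg_dir=$(mktemp -d)
mkdir -p "$pkg_dir/lua"
printf '%s\n' '-- placeholder filter' > "$pkg_dir/lua/pagebreak.lua"

# Copy the resources to a local temp directory before handing any path to Pandoc.
local_res=$(mktemp -d)
cp -R "$pkg_dir/lua" "$local_res/"
filter_path="$local_res/lua/pagebreak.lua"

# Pandoc would then receive the local copy, e.g. --lua-filter "$filter_path",
# instead of a path that points into the network-mounted installation.
echo "filter path: $filter_path"
```

Whether this would actually avoid the failure depends on the network drive being the cause, which is still unconfirmed at this point in the thread.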

kiwifb commented 3 years ago

My setup is an RStudio Server instance running on Ubuntu 20.04 with some remote folders mounted via cifs. I think the previous error had some extra layers of indirection because the filename had a space in it, which led to extra normalization. The first time, it knits OK:

output file: simulating-the-Roy-Rubin-Modelfinal.knit.md

/usr/lib/rstudio-server/bin/pandoc/pandoc +RTS -K512m -RTS simulating-the-Roy-Rubin-Modelfinal.knit.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output simulating-the-Roy-Rubin-Modelfinal.html --lua-filter /usr/lib/R/library/rmarkdown/rmarkdown/lua/pagebreak.lua --lua-filter /usr/lib/R/library/rmarkdown/rmarkdown/lua/latex-div.lua --self-contained --variable bs3=TRUE --standalone --section-divs --template /usr/lib/R/library/rmarkdown/rmd/h/default.html --highlight-style tango --variable theme=united --include-in-header /tmp/RtmpaGRfIM/rmarkdown-str14a4f44e57f07c.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' 

Output created: simulating-the-Roy-Rubin-Modelfinal.html

But the second time (or sometimes after a few more tries; it is a bit finicky) I now get

label: unnamed-chunk-28 (with options) 
List of 1
 $ echo: logi TRUE

  |......................................................................| 100%
  ordinary text without R code

Error in file(con, "w") : cannot open the connection
Calls: <Anonymous> -> <Anonymous> -> write_utf8 -> writeLines -> file
In addition: Warning message:
In file(con, "w") :
  cannot open file 'simulating-the-Roy-Rubin-Modelfinal.knit.md': No such file or directory
Execution halted

I am doing everything from the RStudio interface at the moment. How do I run things to get the backtrace you want? I can run R directly from the command line if needed.

kiwifb commented 3 years ago

Figured it out by running rmarkdown::render from rstudio's console.

5: file(con, "w") 
4: writeLines(enc2utf8(text), con, ..., useBytes = TRUE)
3: write_utf8(res, output)
2: knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet) 
1: rmarkdown::render("/home/frb15/Pdrive/simulating-the-Roy-Rubin-Modelfinal.Rmd", "html_document")

cderv commented 3 years ago

It is a bit disturbing to me that this sometimes works and sometimes does not.

Thanks for sharing the setup. I'll see if I can get my hands on such a configuration.

kiwifb commented 3 years ago

I think there is a lot to be done around some remote file systems. What personally gets my attention, as a programmer in other languages and contexts, is that the intermediary file (the .knit.md) shouldn't be created there. It is transient and disappears once you are done. On most OSes there are special places for such files: /tmp (or the Windows equivalent), ~/.cache, or other special folders if you are more sophisticated. There is no absolute guarantee of locality, but it is way cleaner than leaving your stuff next to your input and output, unless you are debugging.

cderv commented 3 years ago

I understand your point, and I think what you describe applies to more standard temporary content. However, for the specific usage here, we need to consider the way Pandoc works. For example, it is quite possible that the .knit.md will contain references to resources relative to your Rmd document. If the resulting md file is elsewhere, those resources linked in the md doc won't be located relative to the source file anymore, and Pandoc won't find them during the conversion to HTML. Some processing would need to happen to handle absolute vs relative paths correctly. That is just one example. Also, rmarkdown already writes to tempdir() some files that do not have this type of concern.

We are a bit off topic here. You can open a new issue if you want to discuss ways to improve that (the knitr + Pandoc + LaTeX interaction needs to be considered here). Thanks!
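
The relative-path concern above can be illustrated with a small sketch, assuming a POSIX shell. It does not call Pandoc or knitr; the file names (`doc.knit.md`, `figures/plot.png`) are hypothetical, and the script only shows that a relative path which exists beside the source does not exist beside a copy placed in another directory.

```shell
#!/bin/sh
# Hypothetical layout: a source directory holding the intermediate file
# and an image that it links to with a relative path.
src_dir=$(mktemp -d)
tmp_dir=$(mktemp -d)
mkdir -p "$src_dir/figures"
: > "$src_dir/figures/plot.png"   # empty placeholder standing in for a real figure
printf '![plot](figures/plot.png)\n' > "$src_dir/doc.knit.md"

# Beside the source, the relative link points at an existing file.
if (cd "$src_dir" && [ -e figures/plot.png ]); then
  next_to_source=found
else
  next_to_source=missing
fi

# Beside a copy of the intermediate file in a temp directory, it does not.
cp "$src_dir/doc.knit.md" "$tmp_dir/"
if (cd "$tmp_dir" && [ -e figures/plot.png ]); then
  in_tmp=found
else
  in_tmp=missing
fi

echo "next to source: $next_to_source"
echo "in temp dir:    $in_tmp"
rm -rf "$src_dir" "$tmp_dir"
```

Moving intermediates to a temp directory would therefore need either the resources copied along with them or the links rewritten to absolute paths, which is the extra processing mentioned above.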

kiwifb commented 3 years ago

Ha, it just proves that I am completely ignorant of the context and that I don't know what I am talking about. I have been spending a few hours looking at the code earlier today and I couldn't really untangle the various abstractions. R is not the language I am the most familiar with.

I came to this conversation because I am supporting a lecturer for whom I have organised this particular setup. The cifs mount is there to access our university shared file system for convenience. As it turns out, it is not currently convenient to run a number of things directly on it, knitting included :)

cderv commented 3 years ago

Thanks for chiming in then!

We really appreciate having external contributors with different backgrounds share their views and thoughts on the ecosystem, because it brings a new perspective. If you still want to look deeper with your setup (and I think that would help us), you could try without R and rmarkdown, using Pandoc directly to convert a document in different scenarios. For example:

If this does not work well at this level, then that is why we have a hard time looking at it from our level 🤔
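
The kind of Pandoc-only test suggested above could be sketched as follows, assuming a POSIX shell with `pandoc` on the PATH. These are not the commands from the original comment; the helper name `try_pandoc` and the test file names are made up for illustration, and the network path in the last comment is a placeholder.

```shell
#!/bin/sh
# try_pandoc DIR: write a tiny Markdown file in DIR and convert it with Pandoc alone.
# Prints "ok", "failed", or "skipped" (when pandoc is not on the PATH).
try_pandoc() {
  dir=$1
  if ! command -v pandoc >/dev/null 2>&1; then
    echo skipped
    return 0
  fi
  if ! printf '# Test\n\nHello from %s\n' "$dir" > "$dir/pandoc-test.md"; then
    echo failed
    return 0
  fi
  if (cd "$dir" && pandoc pandoc-test.md --from markdown --to html --output pandoc-test.html); then
    echo ok
  else
    echo failed
  fi
}

# Baseline: a local temp directory.
local_dir=$(mktemp -d)
local_result=$(try_pandoc "$local_dir")
echo "local temp dir: $local_result"

# Then repeat against the network location, for example:
#   try_pandoc /path/to/cifs/mount
```

Running it several times against the mounted folder may matter, since the failure reported earlier in the thread only showed up on a second or later attempt.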

cacsfre commented 3 years ago

Just in case: I was getting this error with a learnr tutorial (shiny_prerendered). It was working locally, but it kept throwing this error when running through Docker. I kept trying to figure out what was wrong with the Dockerfile, but later realized the problem was that the index.html file was being copied into the Docker version of the tutorial. That index.html, copied from the locally run tutorial, referenced a third file (footer.html) using the wrong location.

Solution: don't track/copy index.html but let rmarkdown::run or rmarkdown::render generate it each time.

EPriske commented 3 years ago

Following @pauvasquezh, @NilsOle or @goldingn were you able to solve this?

I solved it by changing my RStudio global options and setting the R version to the one on the local C:\ drive rather than the R on the network drive.

This solved my problem with R Markdown. Thank you so much