Open Pablo-Leon opened 7 years ago
@Pablo-Leon I've just tried to reproduce this. My testing environment was macOS, not Cygwin on Windows. I saw the exact same output as you for the first and second runs. However, LostFile.csv is still present in the temp directory.
Could you perhaps upgrade your rmarkdown/rstudio to the latest and try this again?
Hi @rich-iannone , I have rmarkdown 1.10 and the same problem happens to me. When the .rmd document reads a file from my computer and the option clean = TRUE is used, then rmarkdown::render() ends up deleting the file I've read the data from.
Let me know if there is any diagnostic that I can send you to help fixing this bug. Thanks.
This is still an odd behavior with the latest version of the tools. I used this code to test:
# Set up a throwaway project: <tmp_dir>/src/Test.Rmd and <tmp_dir>/temp
dir.create(tmp_dir <- tempfile())
owd <- setwd(tmp_dir)
dir.create("src")
xfun::in_dir(
  "src",
  xfun::download_file(
    "https://github.com/rstudio/rmarkdown/files/1125122/Test_LostFile.Rmd.txt",
    "Test.Rmd")
)
dir.create("temp")
fs::dir_tree()
# FIRST RUN
rmarkdown::render('src/Test.Rmd',
                  output_file = 'Test_LostFile.html',
                  clean = TRUE,
                  run_pandoc = FALSE,
                  output_dir = 'reps',
                  intermediates_dir = 'tmp.xxx')
# File is there
fs::dir_tree(recurse = TRUE)
# SECOND RUN
rmarkdown::render('src/Test.Rmd',
                  output_file = 'Test_LostFile.html',
                  clean = TRUE,
                  run_pandoc = FALSE,
                  output_dir = 'reps',
                  intermediates_dir = 'tmp.xxx')
# File is deleted
fs::dir_tree(recurse = TRUE)
list.files(recursive = TRUE, include.dirs = TRUE)
# Restore the working directory and clean up
setwd(owd)
unlink(tmp_dir, recursive = TRUE)
I am not quite sure why this happens only on the second run. However, the file is removed because it is found among the intermediate files when this code runs: https://github.com/rstudio/rmarkdown/blob/8e2ea3ce0626bc9aa20a009d1e1c288da15af78a/R/render.R#L511-L518
This is triggered only when an intermediates dir is set, but it seems it finds resources outside and may not behave as expected. More details on the behavior:
The html_document_base intermediate generator will find the CSV file at some point, because find_external_resources() will find it. Tested with find_external_resources("src/Test.Rmd") after the first run.
What happens is that the Rmd file is purled to detect external resources: https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_resources.R#L362-L365
A static analysis of the quoted strings then checks whether they could be relative file paths: https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_resources.R#L386-L389
On the first pass, the CSV file does not exist before knitting, so it is not found: https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_resources.R#L75-L79
On the second pass it exists, so it is found and added to the intermediates.
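To illustrate why only the second run is affected, here is a rough sketch of the "quoted string that exists on disk" idea. It is not the rmarkdown implementation: `find_candidate_files()` and the folder layout are hypothetical, and the real detection is more involved.

```r
# Hypothetical, simplified stand-in for the static analysis described above:
# collect quoted strings from the code and keep those that exist on disk
# relative to the directory of the Rmd file.
find_candidate_files <- function(code, base_dir) {
  matches <- regmatches(code, gregexpr("\"[^\"]+\"|'[^']+'", code))
  strings <- unique(unlist(matches))
  strings <- substr(strings, 2, nchar(strings) - 1)  # drop the quotes
  strings[file.exists(file.path(base_dir, strings))]
}

# Assumed layout mirroring the example: <root>/src and <root>/temp
root <- tempfile("lostfile-scan-")
dir.create(file.path(root, "src"), recursive = TRUE)
dir.create(file.path(root, "temp"))

code <- c('file <- "../temp/LostFile.csv"',
          'write.csv(dfX, file, row.names = FALSE)')

# "First pass": the CSV has not been written yet, so nothing is detected
before <- find_candidate_files(code, file.path(root, "src"))

# Simulate what knitting does, then scan again ("second pass")
write.csv(data.frame(l = letters, n = seq_along(letters)),
          file.path(root, "temp", "LostFile.csv"), row.names = FALSE)
after <- find_candidate_files(code, file.path(root, "src"))

unlink(root, recursive = TRUE)
```

Here `before` is empty and `after` contains the relative CSV path, which matches the first run / second run difference.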
I believe the issue lies in the fact that the found resource should be copied, but it is not: https://github.com/rstudio/rmarkdown/blob/0af6b3556adf6e393b2da23c66c695724ea7bd2d/R/html_resources.R#L404-L414
But copy_file_with_dir() will run file.copy like this:
file.copy(
"C:/Users/chris/AppData/Local/Temp/RtmpiO8b6u/file49002e53caf/src/../temp/LostFile.csv",
"C:/Users/chris/AppData/Local/Temp/RtmpiO8b6u/file49002e53caf/tmp.xxx/../temp/LostFile.csv"
)
which is the same path considering the folder tree in the example. This dest file will be added to the intermediates and then removed when clean = TRUE.
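As a small check of the claim that both arguments point at the same file, the following sketch rebuilds the folder tree in a temporary directory (the `root` location is an assumption; the directory names are the ones from the example) and compares the normalized paths.

```r
# Assumed layout mirroring the example: <root>/src, <root>/temp, <root>/tmp.xxx
root <- tempfile("lostfile-demo-")
for (d in c("src", "temp", "tmp.xxx")) {
  dir.create(file.path(root, d), recursive = TRUE)
}
write.csv(data.frame(l = letters, n = seq_along(letters)),
          file.path(root, "temp", "LostFile.csv"), row.names = FALSE)

rel  <- "../temp/LostFile.csv"            # path as written in the Rmd
from <- file.path(root, "src", rel)       # resolved against the Rmd dir
to   <- file.path(root, "tmp.xxx", rel)   # resolved against the intermediates dir

# Both strings differ, but they normalize to the same file on disk
same_file <- identical(normalizePath(from), normalizePath(to))
same_file

unlink(root, recursive = TRUE)
```

Because `src` and `tmp.xxx` are siblings, the `..` step leads to the same parent in both cases, so the "copy" targets the original data file.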
It seems like a weird bug with how paths are handled, and it also depends on the folder structure of the example. If I put the CSV file at the same level as the Rmd file, this does not happen. So, using this in the Rmd:
file <- "LostFile.csv"
the file is correctly found and copied to the intermediates dir. The dest file will then be this one in the intermediates, which are removed:
C:/Users/chris/AppData/Local/Temp/RtmpiO8b6u/file49002e53caf/tmp.xxx/LostFile.csv
I believe this is an issue with relative file paths using ... where the generated path for the copy is not the right one.
And... another paths issue in the mix.
The intermediates cleaning mechanism is deleting a data file read by the program.
The Rmd: Test_LostFile.Rmd.txt
Rmd Code
````markdown
---
title: "Test_LostFile"
author: "plr"
date: "5 de julio de 2017"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(readr)
```

```{r }
file <- "../temp/LostFile.csv"
dfX <- data.frame( l=letters, n=1:length(letters))
write.csv(dfX, file, row.names = FALSE)
dfY <- read_delim( file
  ,delim=","
  ,col_names=TRUE
  ,col_type= cols(
     l = col_character()
    ,n = col_double()
  ))
```
````

With this command line:
Under cygwin on Win7.
First run
On the first run the program works fine and throws this output (because run_pandoc=FALSE):
The second time:
output file: C:/BitSync/INE/Ensayo2016/Adapt2Censo/tmp.xxx/Test_LostFile.knit.md
And the file gets lost:
$ ls -ltr temp/LostFile.csv
ls: cannot access 'temp/LostFile.csv': No such file or directory
The problem seems to be the combination of the ways:
The workaround was to use the rprojroot package to build an absolute path from the project base. This way the misidentification of the file is avoided.
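A minimal sketch of that workaround, assuming rprojroot is installed. The `.project-root` marker file and the temporary project layout are hypothetical; in a real project you would pick whatever root criterion fits (for example an RStudio project file).

```r
library(rprojroot)

# Hypothetical project: <root>/src holds the Rmd, <root>/temp holds the data,
# and an empty marker file identifies the project root.
root <- tempfile("proj-")
dir.create(file.path(root, "src"), recursive = TRUE)
dir.create(file.path(root, "temp"))
file.create(file.path(root, ".project-root"))

# Inside the Rmd (whose working directory is <root>/src), build the data path
# from the project root instead of writing "../temp/LostFile.csv".
csv <- find_root_file("temp", "LostFile.csv",
                      criterion = has_file(".project-root"),
                      path = file.path(root, "src"))

# The result is an absolute path with no ".." component
csv

unlink(root, recursive = TRUE)
```

Since the path no longer depends on which directory it is resolved against, the intermediates dir should not end up pointing back at the original file, although this sketch does not exercise rmarkdown::render() itself.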
regards