rstudio / rmarkdown

Dynamic Documents for R
https://rmarkdown.rstudio.com
GNU General Public License v3.0

rmarkdown::render within function gives error #934

Closed WHMan closed 7 years ago

WHMan commented 7 years ago

Hi,

I have a minor issue with a function I wrote that wraps rmarkdown::render. Specifically, calling that function on a document that processes phyloseq-formatted data in a for-loop (or apply) gives an error while rendering.

Minimal example

Function containing rmarkdown::render, that gives the error:

# Method 1
my_render <- function(input) {
    rmarkdown::render(input = input)
}
my_render(input = "test.Rmd")

rmarkdown::render on its own does not give the error:

# Method 2
input = "test.Rmd"
rmarkdown::render(input = input)

Test example that always works

If test.Rmd contains this for-loop, everything is fine:

test_df <- data.frame(taxa = rep(LETTERS[1:5], 2),
                      count = 1:10,
                      count2 = 11:20)
test_taxa <- unique(test_df$taxa)
test_list <- list()
for (z in test_taxa) {
  test_list[[z]] <- sum(test_df[which(test_df$taxa == z), c("count", "count2")]) / 
    sum(rowSums(test_df[, c("count", "count2")]))
}
test_list

Test example with phyloseq data

However, if test.Rmd loops over phyloseq data instead, rendering through the wrapper function (Method 1) fails:

library("phyloseq")
data(GlobalPatterns)

test2_taxa <- get_taxa_unique(GlobalPatterns, "Phylum")[1:5]
test2_list <- list()
for (x in test2_taxa) {
  test2_list[[x]] <- sum(sample_sums(subset_taxa(GlobalPatterns, Phylum == x))) / 
    sum(sample_sums(GlobalPatterns))
}
test2_list

sessionInfo()

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] phyloseq_1.13.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8         plyr_1.8.3          XVector_0.8.0       iterators_1.0.8     tools_3.3.2        
 [6] zlibbioc_1.14.0     digest_0.6.9        evaluate_0.10       gtable_0.1.2        nlme_3.1-128       
[11] lattice_0.20-33     mgcv_1.8-15         Matrix_1.2-7.1      foreach_1.4.3       igraph_1.0.1       
[16] yaml_2.1.13         parallel_3.3.2      stringr_1.0.0       knitr_1.15.1        cluster_2.0.3      
[21] Biostrings_2.36.4   S4Vectors_0.6.6     IRanges_2.2.9       multtest_2.24.0     stats4_3.3.2       
[26] rprojroot_1.1       ade4_1.7-3          grid_3.3.2          Biobase_2.28.0      data.table_1.9.6   
[31] survival_2.38-3     rmarkdown_1.3       RJSONIO_1.3-0       ggplot2_2.0.0       reshape2_1.4.1     
[36] magrittr_1.5        MASS_7.3-45         splines_3.3.2       backports_1.0.4     scales_0.3.0       
[41] codetools_0.2-15    htmltools_0.3.5     BiocGenerics_0.14.0 permute_0.9-0       colorspace_1.2-6   
[46] ape_3.4             stringi_1.0-1       munsell_0.4.2       biom_0.3.12         vegan_2.3-3        
[51] chron_2.3-47  

I am not really sure whether this is a rmarkdown or phyloseq issue, though. Any support is appreciated. Thanks!

kevinushey commented 7 years ago

I wasn't able to reproduce this issue, although perhaps I wasn't clear on your instructions. With an R Markdown document containing this:

---
title: "Untitled"
output: html_document
---

```{r}
test_df <- data.frame(taxa = rep(LETTERS[1:5], 2),
          count = 1:10,
          count2 = 11:20)
test_taxa <- unique(test_df$taxa)
test_list <- list()
for (z in test_taxa) {
  test_list[[z]] <- sum(test_df[which(test_df$taxa == z), c("count", "count2")]) / 
    sum(rowSums(test_df[, c("count", "count2")]))
}
test_list
```

```{r}
library("phyloseq")
data(GlobalPatterns)

test2_taxa <- get_taxa_unique(GlobalPatterns, "Phylum")[1:5]
test2_list <- list()
for (x in test2_taxa) {
  test2_list[[x]] <- sum(sample_sums(subset_taxa(GlobalPatterns, Phylum == x))) / 
    sum(sample_sums(GlobalPatterns))
}
test2_list
```

I was able to successfully render this document with this:

# Method 1
my_render <- function(input) {
  rmarkdown::render(input = input)
}
my_render(input = "~/scratch/test.Rmd")

Perhaps the issue is with phyloseq -- I tested with phyloseq_1.16.2.

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.2

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] phyloseq_1.16.2 BiocInstaller_1.22.3 tutor_0.1.0 testthat_1.0.2 rmarkdown_1.3
[6] knitr_1.15.1 roxygen2_5.0.1 devtools_1.12.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.8 ape_4.0 lattice_0.20-34 Biostrings_2.40.2 assertthat_0.1
 [6] rprojroot_1.1 digest_0.6.11 foreach_1.4.3 mime_0.5 R6_2.2.0
[11] plyr_1.8.4 backports_1.0.4 stats4_3.3.2 evaluate_0.10 httr_1.2.1
[16] ggplot2_2.2.1 zlibbioc_1.18.0 lazyeval_0.2.0.9000 curl_2.3 data.table_1.10.0
[21] vegan_2.4-1 S4Vectors_0.10.3 Matrix_1.2-7.1 splines_3.3.2 stringr_1.1.0.9000
[26] htmlwidgets_0.8 igraph_1.0.1 munsell_0.4.3 shiny_0.14.2.9001 httpuv_1.3.3
[31] BiocGenerics_0.18.0 multtest_2.28.0 mgcv_1.8-16 htmltools_0.3.5 biomformat_1.0.2
[36] tibble_1.2 IRanges_2.6.1 codetools_0.2-15 permute_0.9-4 crayon_1.3.2
[41] withr_1.0.2 MASS_7.3-45 grid_3.3.2 nlme_3.1-128 jsonlite_1.2
[46] xtable_1.8-2 gtable_0.2.0 git2r_0.18.0 magrittr_1.5 scales_0.4.1
[51] stringi_1.1.2 XVector_0.12.1 reshape2_1.4.2 iterators_1.0.8 tools_3.3.2
[56] ade4_1.7-5 Biobase_2.32.0 markdown_0.7.7 parallel_3.3.2 survival_2.40-1
[61] yaml_2.1.14 colorspace_1.3-2 rhdf5_2.16.0 cluster_2.0.5 memoise_1.0.0.9001

WHMan commented 7 years ago

@kevinushey: Thanks for your response! Sorry for my unclear explanation of the problem, but you interpreted it correctly.

I've updated my phyloseq package to version 1.19.1, but I still get the same error message when rendering the R Markdown document with:

# Method 1
my_render <- function(input) {
  rmarkdown::render(input = input)
}
my_render(input = "~/scratch/test.Rmd")

yihui commented 7 years ago

Sounds like an environment issue. How about passing envir = parent.frame() to render() in my_render()?
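
For reference, that suggestion would look roughly like this. It is a minimal sketch based on the my_render() wrapper from the original post; the guard around the call is an addition so that the snippet also runs when rmarkdown or test.Rmd is missing.

```r
# Wrapper around rmarkdown::render() that forwards the caller's environment.
my_render <- function(input) {
  # This parent.frame() is written in my_render(), so it is resolved relative
  # to my_render()'s frame and refers to whoever called my_render().
  rmarkdown::render(input = input, envir = parent.frame())
}

# Guarded call (an assumption for illustration): only runs when rmarkdown is
# installed and test.Rmd exists in the working directory.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists("test.Rmd")) {
  my_render(input = "test.Rmd")
}
```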

WHMan commented 7 years ago

Thanks @yihui! That solved my problem. However, I do not really understand why this solves the problem, as envir = parent.frame() seems to be the default in rmarkdown::render(). It probably comes down to my poor knowledge of environments and lexical scoping in R :laughing:.

yihui commented 7 years ago

I can definitely see why you feel confused. It is related to the delayed evaluation of function arguments. In a recent interview with @jcheng5, he said:

But, if we had a time machine and could go back, the one change that I would make – well the most important change I would make – to R would be to have delayed evaluation be a feature that you opt into rather than being the default. As I said before, delayed evaluation for function arguments is really awesome and it makes things easy in R that are quite unnatural to do in other languages. But, I feel it’s a tool that you usually don’t want to use. When you want it, it’s awesome to have, but it would be nicer to have all function arguments evaluated except for those which have been annotated for lazy evaluation.

Even though render() has a default envir = parent.frame(), this argument won't be evaluated until it is actually used inside render(). At that point parent.frame() is resolved from within render(), so it points to the frame (environment) that called render(), which is the internal environment of your my_render(). What you actually want is the environment outside my_render().

If you pass an explicit envir = parent.frame() to render(), R knows this parent frame refers to the parent of the current environment: the "current" environment is the internal environment of my_render(), and the parent will be the outside environment of my_render().
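
The difference can be checked without rmarkdown at all. In the sketch below, fake_render() is a hypothetical stand-in that only returns whatever envir resolves to; it is not how render() is implemented, it just has the same default argument.

```r
# Hypothetical stand-in for render(): returns the environment `envir` resolves to.
fake_render <- function(envir = parent.frame()) {
  envir
}

# Relies on the default: parent.frame() is resolved from inside fake_render().
wrap_default <- function() {
  list(wrapper = environment(), used = fake_render())
}

# Passes envir explicitly: parent.frame() is resolved relative to wrap_explicit().
wrap_explicit <- function() {
  list(wrapper = environment(), used = fake_render(envir = parent.frame()))
}

d <- wrap_default()
e <- wrap_explicit()

# Default: envir is the wrapper's own internal environment.
default_uses_wrapper_frame <- identical(d$used, d$wrapper)

# Explicit: envir is the environment the wrapper was called from.
explicit_uses_caller_frame <- identical(e$used, environment())

print(c(default = default_uses_wrapper_frame, explicit = explicit_uses_caller_frame))
```

Both comparisons should come out TRUE: with the default, the document would be evaluated inside the wrapper's frame, and with the explicit argument it would be evaluated where the wrapper was called.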

That said, whenever someone runs into a similar issue, I'm relatively sure it is a bug in the third-party package used in the R Markdown document (the package cannot work well with objects in parent frames, e.g. it might have assumed objects must be in the global environment). In this case, phyloseq might be the culprit, but I know nothing about it, so I'm not entirely sure.
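
To illustrate the kind of assumption I mean, here is a base R sketch. filter_rows() is a made-up helper, not phyloseq code, and whether phyloseq does anything like this is not confirmed here. It forwards a filtering expression to subset(), which looks up free variables starting from the helper's frame rather than from the frame of the code that wrote the expression.

```r
# Hypothetical helper (NOT phyloseq code) that forwards `...` to subset().
filter_rows <- function(dat, ...) {
  subset(dat, ...)
}

dat <- data.frame(taxa = c("A", "B", "A"), count = 1:3)

run_in_function <- function() {
  target <- "A"  # exists only in this function's frame
  filter_rows(dat, taxa == target)
}

# `target` is not visible from filter_rows()'s frame or its enclosing
# environments, so the lookup fails; capture the error message instead of stopping.
result_local <- tryCatch(run_in_function(), error = function(e) conditionMessage(e))
print(result_local)

# Once a variable of the same name exists at top level, the same call succeeds,
# because the lookup eventually reaches the environment where filter_rows() lives.
target <- "A"
result_global <- run_in_function()
print(result_global)
```

A document rendered in the global environment behaves like the second call, while one rendered inside a function's frame behaves like the first, which would match the symptoms in this thread.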

WHMan commented 7 years ago

Wow, thanks again @yihui! I did not expect (such) an (extensive) answer and really appreciate the effort you put into it. Your explanation is very clear and I will try to look into the assumptions of the phyloseq functions.

soleyjh commented 6 years ago

@yihui Thanks so much man! This answer just saved the day for me also!
