yihui / knitr

A general-purpose tool for dynamic report generation in R
https://yihui.org/knitr/
2.36k stars 873 forks source link

catch error=TRUE chunks produced by hook_purl() #2338

Open bastistician opened 2 months ago

bastistician commented 2 months ago

When a chunk is evaluated under error=TRUE, knitr catches errors as the code is declared to fail. The hook_purl()-produced R script, however, does not currently account for that.

Here is a minimal example (inspired by tests/testit/knit-handlers.Rmd):

setwd(tempdir())
writeLines(c('```{r test, error=TRUE}', '1 + ""', '```'), "knit-error.Rmd")
library(knitr)
knit_hooks$set(purl = hook_purl)
knit("knit-error.Rmd")  # does *not* fail

Using current knitr 1.46, this produces the following R script (knit-error.R):

## ----test, error=TRUE-----------------------------------------------------------------------------
1 + ""

Running that script fails, of course:

xfun::Rscript("knit-error.R")
#> Error in 1 + "" : non-numeric argument to binary operator
#> Execution halted

I think knitr could take advantage of the new R option "catch.script.errors" introduced in R 4.4.0. For example, for the above test chunk (with a local error=TRUE setting), I could imagine hook_purl() producing

## ----test, error = TRUE----------------------------------------------------------------------------
options(catch.script.errors = TRUE)

1 + ""

options(catch.script.errors = FALSE)

This would enable testing R scripts from vignettes that are shipped with packages (in inst/doc) even if they use error=TRUE chunks.

On a related note, eval=FALSE code chunks don't have this problem as hook_purl() already takes care of commenting such code: https://github.com/yihui/knitr/blob/44a7bee566afff3eb4c8b13fd6fa64487e12981f/R/hooks-extra.R#L150

jeroen commented 2 months ago

This bug makes all packages that have a vignette with an {r error=TRUE} chunk fail when running R CMD check with for example --no-build-vignettes (which invokes the purl functionality).

The problem is covered up by the fact that checking packages with _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_=TRUE or --as-cran avoids purling, and therefore is not observed on CRAN or most CI configurations.

yihui commented 2 months ago

Sorry for the trouble. This change was intentional (see changelog), and CRAN made the request for change, after they changed _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ to true in R.

It can be difficult to deal with these purl() problems case by case (eval=FALSE, error=TRUE, ...), since there can be many cases. For this particular case of error=TRUE, I think it's relatively simple for me to deal with, so I'll probably just do it (if we don't want to rely on the option catch.script.error in R 4.4.0, perhaps we can wrap the code in try()?).

BTW, you can also set the chunk option purl=FALSE to exclude specific code chunks from the purl() result, if the only goal is to avoid R CMD check errors. I understand that you may want to deliberately show the problematic code in the output script, though.

yihui commented 2 months ago

Correction: _R_CHECK_VIGNETTES_SKIP_RUN_MAYBE_ has been changed to true in R-devel (and will appear in R 4.4.0). Previously I thought the change was only for CRAN and I was wrong.

For R < 4.4.0, this env var has to be set (or use --as-cran for R CMD check).

bastistician commented 2 months ago

To clarify, I see this issue more as a wishlist item than a bug in knitr.

A simpler solution for error=TRUE chunks might be to comment the code in the generated R script just like for eval=FALSE. Still I'm not 100% convinced of my proposal. Not least because vanilla knitr actually defaults to error=TRUE, but it wouldn't make sense to comment everything in the R script "just in case". So this needs to differentiate between a locally set error=TRUE and the global error=TRUE, which makes me feel uncomfortable (I don't know if there is a precedent for such behaviour in knitr).

BTW, you can also set the chunk option purl=FALSE to exclude specific code chunks from the purl() result, if the only goal is to avoid R CMD check errors.

Yes, knitr has long offered several options for the vignette author to deal with these problems on a case-by-case basis. I've used purl=FALSE several times. Personally, I like that installed packages (in their "doc" folders) ship an R script for each vignette, and of course I'd like a comprehensive check to ensure this R script runs without errors.

yihui commented 1 month ago

So this needs to differentiate between a locally set error=TRUE and the global error=TRUE

Yes, that's the problem. The differentiation is only possible for tangle_block(): https://github.com/yihui/knitr/blob/f1788b81ffbe187be54a21fce32395fe0c77f2e5/R/block.R#L586-L587

but not possible for hook_purl() (which is used by the vignette engine to purl code): https://github.com/yihui/knitr/blob/f1788b81ffbe187be54a21fce32395fe0c77f2e5/R/hooks-extra.R#L135

so I'm afraid that purl=FALSE is the only possible way to exclude error=TRUE chunks...