yihui / knitr

A general-purpose tool for dynamic report generation in R
https://yihui.org/knitr/
2.36k stars 873 forks source link

Plot leaks into knitr document #2297

Closed dmurdoch closed 9 months ago

dmurdoch commented 9 months ago

By filing an issue to this repo, I promise that

I reported this as an R issue yesterday on Bugzilla: https://bugs.r-project.org/show_bug.cgi?id=18604, because it affects the "Run examples" link on help pages. It turns out that it's an issue in knitr.

Here are instructions to show the same problem in an R Markdown document:

Save this document as "Untitled.Rmd":

---
title: "Untitled"
author: "Duncan Murdoch"
date: "2023-09-29"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
dev.new()
dev.off()
plot.new()
plot(1:10, 1:10)

In an R session, run this code:

plot(1, main = "Not in the document") rmarkdown::render("Untitled.Rmd")


Examine the resulting `Untitled.html` output file.  The "Not in the document" plot will be shown in the document after the `plot.new()` call.

This doesn't happen if you leave out the opening and closing of a plot device on the first two lines of the code chunk.  (They don't need to be those exact lines; the original code wrote a more extensive plot to a file.)  It also doesn't happen if you just click on Knit in RStudio, because that runs `knitr` in a separate process.

xfun::session_info() R version 4.3.1 (2023-06-16) Platform: x86_64-apple-darwin20 (64-bit) Running under: macOS Monterey 12.7

Locale: en_CA.UTF-8 / en_CA.UTF-8 / en_CA.UTF-8 / C / en_CA.UTF-8 / en_CA.UTF-8

time zone: America/Toronto tzcode source: internal

Package version: askpass_1.2.0 assertthat_0.2.1 base64enc_0.1-3 binman_0.1.3 bit_4.0.5
bit64_4.0.5 bitops_1.0-7 blob_1.2.4 bslib_0.5.1 cachem_1.0.8
caTools_1.18.2 cli_3.6.1 commonmark_1.9.0 compiler_4.3.1 cpp11_0.4.6
crayon_1.5.2 curl_5.0.2 DBI_1.1.3 digest_0.6.33 dplyr_1.1.3
ellipsis_0.3.2 evaluate_0.22 extrafont_0.19 extrafontdb_1.0 fansi_1.0.4
fastmap_1.1.1 filelock_1.0.2 fontawesome_0.5.2 fs_1.6.3 generics_0.1.3
glue_1.6.2 graphics_4.3.1 grDevices_4.3.1 grid_4.3.1 highr_0.10
htmltools_0.5.6 htmlwidgets_1.6.2 httpuv_1.6.11 httr_1.4.7 jquerylib_0.1.4
jsonlite_1.8.7 keyring_1.3.1 knitr_1.44.1 later_1.3.1 lattice_0.21-8
lifecycle_1.0.3 lubridate_1.9.3 magrittr_2.0.3 memoise_2.0.1 methods_4.3.1
mime_0.12 miniUI_0.1.1.1 openssl_2.1.1 pillar_1.9.0 pkgconfig_2.0.3
plogr_0.2.0 processx_3.8.2 promises_1.2.1 ps_1.7.5 qtrade_1.4688
R6_2.5.1 ragg_1.2.5 rappdirs_0.3.3 Rcpp_1.0.11 RCurl_1.98-1.12
rgl_1.2.6 rlang_1.1.1 rmarkdown_2.25 RSelenium_1.7.9 RSQLite_2.3.1
Rttf2pt1_1.3.12 rvest_1.0.3 sass_0.4.7 selectr_0.4.2 semver_0.2.0
shiny_1.7.5 shinyjs_2.1.0 sodium_1.3.0 sourcetools_0.1.7.1 stats_4.3.1
stringi_1.7.12 stringr_1.5.0 sys_3.4.2 systemfonts_1.0.4 testpkg3_0.1.0
textshaping_0.3.6 tibble_3.2.1 tidyselect_1.2.0 timechange_0.2.0 tinytex_0.46
tools_4.3.1 utf8_1.2.3 utils_4.3.1 vctrs_0.6.3 wdman_0.2.6
withr_2.5.1 xfun_0.40 xml2_1.3.5 xtable_1.8-4 yaml_2.3.7
zoo_1.8-12



I can understand that some things leak from the session to the document (e.g. I'd expect variables and `par()` settings to leak if I run `knitr` in the same session), but it seems like a bug that a whole plot to leaks, and is displayed after the `plot.new()` call.
cderv commented 9 months ago

Thanks for the report. I can obviously reproduce.

This remind me of something similar we've dealt with for child documents already

This was fixed by https://github.com/yihui/knitr/commit/ff3d767fb7bb663b0279c3a97497571ad1d761ae and amended by https://github.com/yihui/knitr/commit/58fc726c8989f057b9877eeb78a477fc44517f29

So it is probably related to same part of the code https://github.com/yihui/knitr/blob/42f6b3ad75f8245d26c1de3d3e0b9fa018566af3/R/block.R#L205-L216

but it seems to be different here... I'll look into that.

cderv commented 9 months ago

Interestingly, more minimal reprex of this

---
title: "Untitled"
output: 
  html_document:
    keep_md: true
---

```{r}
dev.new()
dev.off()
# hello

![image](https://github.com/yihui/knitr/assets/6791940/d009f712-ef77-49c7-aea4-75bd174e58b1)

So I am thinking something in **evaluate** rather than **knitr** directly.  This is what we results in `evaluate::evaluate()` when debugging interactively
````r
> str(res, max.level = 2)
List of 6
 $ :List of 1
  ..$ src: chr "dev.new()\n"
  ..- attr(*, "class")= chr "source"
 $ : chr "debug: options$message\n"
 $ :List of 1
  ..$ src: chr "dev.off()\n"
  ..- attr(*, "class")= chr "source"
 $ : chr "RStudioGD \n        2 \n"
 $ :List of 1
  ..$ src: chr "# hello"
  ..- attr(*, "class")= chr "source"
 $ :List of 2
  ..$ :Dotted pair list of 8
  ..$ : raw [1:35992] 01 00 00 00 ...
  .. ..- attr(*, "pkgName")= chr "graphics"
  ..- attr(*, "engineVersion")= int 16
  ..- attr(*, "pid")= int 51232
  ..- attr(*, "Rversion")=Classes 'R_system_version', 'package_version', 'numeric_version'  hidden list of 1
  ..- attr(*, "load")= chr(0) 
  ..- attr(*, "attach")= chr(0) 
  ..- attr(*, "class")= chr "recordedplot"

So it seems that something evaluated after dev.off will trigger the recordedplot to be among value returned by evaluate::evaluate()

dmurdoch commented 9 months ago

Maybe there's a way for knitr to make that plot invisible to evaluate at the start of the run? I think the plot is being captured here: https://github.com/r-lib/evaluate/blob/00192335c338ac83af200dcee35c56d807b8b28b/R/eval.R#L165-L175

cderv commented 9 months ago

Thanks ! I was just looking into those hooks. It seems indeed directly related.

cderv commented 9 months ago

TL;DR;

This happens because dev.off() is called in a chunk, and it will set dev.cur() to the next device, which happens to be the one opened before knitting. evaluate will then capture plot in this device, which includes the existing one.

Not sure how to prevent this, without a dev.off() for any existing device before knitting.

Details about investigations

Definitely something at the evaluate level

> plot(1, main = "Not in the document")
> res = evaluate::evaluate("dev.new()\ndev.off()\nplot.new()\nplot(1:10, 1:10)", new_device = FALSE)
> str(res, max.level = 2)
List of 9
 $ :List of 1
  ..$ src: chr "dev.new()\n"
  ..- attr(*, "class")= chr "source"
 $ : chr "NULL\n"
 $ :List of 1
  ..$ src: chr "dev.off()\n"
  ..- attr(*, "class")= chr "source"
 $ : chr "RStudioGD \n        2 \n"
 $ :List of 1
  ..$ src: chr "plot.new()\n"
  ..- attr(*, "class")= chr "source"
 $ :List of 2
  ..$ :Dotted pair list of 8
  ..$ : raw [1:35992] 01 00 00 00 ...
  .. ..- attr(*, "pkgName")= chr "graphics"
  ..- attr(*, "engineVersion")= int 16
  ..- attr(*, "pid")= int 53184
  ..- attr(*, "Rversion")=Classes 'R_system_version', 'package_version', 'numeric_version'  hidden list of 1
  ..- attr(*, "load")= chr(0) 
  ..- attr(*, "attach")= chr(0) 
  ..- attr(*, "class")= chr "recordedplot"
 $ :List of 2
  ..$ :Dotted pair list of 2
  ..$ : raw [1:35992] 01 00 00 00 ...
  .. ..- attr(*, "pkgName")= chr "graphics"
  ..- attr(*, "engineVersion")= int 16
  ..- attr(*, "pid")= int 53184
  ..- attr(*, "Rversion")=Classes 'R_system_version', 'package_version', 'numeric_version'  hidden list of 1
  ..- attr(*, "load")= chr(0) 
  ..- attr(*, "attach")= chr(0) 
  ..- attr(*, "class")= chr "recordedplot"
 $ :List of 1
  ..$ src: chr "plot(1:10, 1:10)"
  ..- attr(*, "class")= chr "source"
 $ :List of 2
  ..$ :Dotted pair list of 8
  ..$ : raw [1:35992] 01 00 00 00 ...
  .. ..- attr(*, "pkgName")= chr "graphics"
  ..- attr(*, "engineVersion")= int 16
  ..- attr(*, "pid")= int 53184
  ..- attr(*, "Rversion")=Classes 'R_system_version', 'package_version', 'numeric_version'  hidden list of 1
  ..- attr(*, "load")= chr(0) 
  ..- attr(*, "attach")= chr(0) 
  ..- attr(*, "class")= chr "recordedplot"

It could even be simplified to this

> plot(1, main = "Not in the document")
> res = evaluate::evaluate("# hello", new_device = FALSE)
> str(res, max.level = 2)
List of 2
 $ :List of 1
  ..$ src: chr "# hello"
  ..- attr(*, "class")= chr "source"
 $ :List of 2
  ..$ :Dotted pair list of 8
  ..$ : raw [1:35992] 01 00 00 00 ...
  .. ..- attr(*, "pkgName")= chr "graphics"
  ..- attr(*, "engineVersion")= int 16
  ..- attr(*, "pid")= int 43180
  ..- attr(*, "Rversion")=Classes 'R_system_version', 'package_version', 'numeric_version'  hidden list of 1
  ..- attr(*, "load")= chr(0) 
  ..- attr(*, "attach")= chr(0) 
  ..- attr(*, "class")= chr "recordedplot"

it does not come from the hooks IMO but from handle output: https://github.com/r-lib/evaluate/blob/00192335c338ac83af200dcee35c56d807b8b28b/R/eval.R#L151-L158

This handle_output() is also called on last item (https://github.com/r-lib/evaluate/blob/00192335c338ac83af200dcee35c56d807b8b28b/R/eval.R#L264-L267). evaluate will always capture the content of the graphic device when evaluating the last element. I believe this is expected feature as doc says:

It stores the final result, whether or not it should be visible, and the contents of the current graphics device.

I believe this is what happens:

So we can simulate with

# no device at start
dev.list()
#> NULL
# creating a plot in console opens the device 
plot(1, main = "Not in the document")
dev.list()
#> RStudioGD       png 
#>        2         3 
dev.cur()
#> RStudioGD 
#>        2

# knitr save the opened device
dv0 = dev.cur(); dv0
#> RStudioGD 
#>        2 
# and creates a new one 
grDevices::png()
dv = dev.cur(); dv
#> png 
#>   4 

# then code is evaluated. 
dev.new() # creating a new device 
dev.cur() # this modifies the dev cur
#> windows 
#>      5 
dev.off() # calling from the chunk 
#> RStudioGD 
#>         2 
dev.cur()
#> RStudioGD 
#>       2 

And we see here that after dev.off(), the current device has been reset to the one from the created plot before knitting. And so evaluate will capture the plot already in that device, skipping the device opened by knitr.

dev.off() shutdown active device, and setup the current to be the next, where here we would have like dev.prev() instead. From ?dev.off():

dev.off shuts down the specified (by default the current) device. If the current device is shut down and any other devices are open, the next open device is made current.

Issue here is that, calling dev.off() inside a knitr chunk will set the device to the next one, and if one has been opened before knitting happens in will be that one.

I am not sure how we can prevent this for happening exactly, without closing any existing device before rendering. 🤔

dmurdoch commented 9 months ago

Thanks, I agree with your analysis. So it's really user error in the document: I shouldn't expect dev.off() to return me to the previous state. I should do something like

prev <- dev.prev()
dev.off()
dev.set(prev)

and indeed when I do that, things are fine. (I don't know why dev.off() doesn't already work that way. But I expect it's been this way for a very long time.)

yihui commented 9 months ago

Thanks @dmurdoch for the report and @cderv for the investigation! I just pushed a fix in the evaluate package, so it should be safe to use dev.off() in knitr code chunks now.

cderv commented 9 months ago

Oh good solution @yihui ! Thank you !

github-actions[bot] commented 3 months ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.