yihui / knitr

A general-purpose tool for dynamic report generation in R
https://yihui.org/knitr/

knitr cache does not recognize magrittr assignment pipe #2323

Closed stancari closed 4 months ago

stancari commented 5 months ago

Hi Yihui, thanks for your extraordinary work.

It looks like knitr does not recognize the magrittr assignment pipe operator `%<>%` when deciding which new objects created in a chunk need to be cached.

In the following minimal example, the number `x` and the tibble `Data` are calculated correctly, while `DF` (which should be identical to `Data`) is not.

The first run after invalidating the cache (for instance, by changing the initial value of `x` or the length of `u`) works as expected. When the same code is run a second time, `x` and `Data` are still calculated correctly, but `DF` is not, and the following error is raised:

Error in `mutate()`:
In argument: `w = 3 * v`.
Caused by error:
! object 'v' not found

Minimal example:

```{r reset, eval=TRUE, include=FALSE, cache=FALSE}
graphics.off()
rm(list=ls())
# R setup
library(knitr)
library(magrittr)
library(tidyverse)

# knitr setup
knitr::opts_chunk$set(echo = FALSE
                     ,error = TRUE
                     ,warning = FALSE
                     ,message = FALSE
                     ,progress = TRUE
                     ,verbose = TRUE
                     ,cache = FALSE
                     ,autodep = FALSE
                     ,cache.path = 'cache_knitr/'
                     ,cache.lazy = FALSE
                     ,cache.comments = FALSE
                     ,cache.globals = FALSE
)
x = 1
Data = DF = tibble(u = seq(1, 9, length.out=3))
x = x + 1
Data <- Data |> mutate(v = 2*u)
DF %<>% mutate(v = 2*u)
x = x + 1
Data <- Data |> mutate(w = 3*v)
DF %<>% mutate(w = 3*v)
cat('x = ', x, '\n')
kable(Data)
kable(DF)
```

Session info:

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.6.3, RStudio 2023.6.1.524

Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8

Package version:
  evaluate_0.23 graphics_4.3.1 grDevices_4.3.1 highr_0.10 knitr_1.45 methods_4.3.1
  stats_4.3.1 tools_4.3.1 utils_4.3.1 xfun_0.41 yaml_2.3.8


---

By filing an issue to this repo, I promise that

- [x] I have fully read the issue guide at https://yihui.org/issue/.
- [x] I have provided the necessary information about my issue.
    - If I'm asking a question, I have already asked it on Stack Overflow or RStudio Community, waited for at least 24 hours, and included a link to my question there.
    - If I'm filing a bug report, I have included a minimal, self-contained, and reproducible example, and have also included `xfun::session_info('knitr')`. I have upgraded all my packages to their latest versions (e.g., R, RStudio, and R packages), and also tried the development version: `remotes::install_github('yihui/knitr')`.
    - If I have posted the same issue elsewhere, I have also mentioned it in this issue.
- [x] I have learned the GitHub Markdown syntax, and formatted my issue correctly.

I understand that my issue may be closed if I don't fulfill my promises.

yihui commented 4 months ago

In this case, you have to manually specify the variables to be cached, using the `cache.vars` option, which is provided exactly for this type of scenario: https://yihui.org/knitr/options/#cache
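
As a rough sketch of what that could look like for the second step of the minimal example: the chunk label `step2` and the per-chunk `cache=TRUE` are hypothetical, since the chunk headers are not shown in the report. All three names are listed on the assumption that `cache.vars` replaces the automatic detection rather than adding to it.

```{r step2, cache=TRUE, cache.vars=c('x', 'Data', 'DF')}
# 'step2' is a hypothetical chunk label; cache.vars names every object
# this chunk should save, including DF, which is modified via %<>%
x = x + 1
Data <- Data |> mutate(v = 2*u)
DF %<>% mutate(v = 2*u)
```

The same option would presumably be needed on every cached chunk that modifies `DF` with `%<>%`.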

Technically, knitr uses codetools to detect local variables in a code chunk, and the automatic detection has no knowledge about the `%<>%` operator, so it can only detect variables created via normal assignments, i.e., `x` and `Data` but not `DF`.

```r
codetools::findLocalsList(parse(text = '
x = x + 1
Data <- Data |> mutate(v = 2*u)
DF %<>% mutate(v = 2*u)
'))
```

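
An alternative to `cache.vars` that follows from the explanation above may be to write the assignment out explicitly, so that codetools sees a normal `<-`. The sketch below only parses the code, so neither magrittr nor dplyr has to be loaded; `detect_locals` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper: list the local variables codetools detects in a snippet
detect_locals <- function(code) {
  codetools::findLocalsList(parse(text = code))
}

# Compound assignment pipe: no ordinary assignment for codetools to find
with_pipe_assign <- detect_locals('DF %<>% mutate(v = 2*u)')

# Same operation written as an explicit assignment
with_explicit <- detect_locals('DF <- DF %>% mutate(v = 2*u)')

print("DF" %in% with_pipe_assign)
print("DF" %in% with_explicit)
```

If `DF` shows up only in the second result, rewriting `DF %<>% ...` as `DF <- DF %>% ...` should let the automatic detection pick it up without any extra chunk options.
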
stancari commented 4 months ago

Makes sense. Thank you.