rstudio / packrat

Packrat is a dependency management system for R
http://rstudio.github.io/packrat/

[WIP] bundle renv sources in packrat #646

Closed kevinushey closed 2 years ago

kevinushey commented 2 years ago

This PR bundles renv's sources into Packrat, primarily so that Packrat can use renv's tooling for dependency discovery. A helper script, tools/tools-bundle-renv.R, clones renv and copies its R sources into inst/resources/renv.R, which is then sourced into an renv environment hosted in the Packrat package namespace.

Because all of renv's functions live in their own environment, we shouldn't need to worry about potential collisions between these functions and anything already defined in Packrat, even those of the same name.

Packrat can then use functions from renv with something like:

renv$dependencies(...)
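As a rough illustration of why same-named functions should not collide, here is a minimal sketch of sourcing a file into its own environment. The temporary file, the `helper` function, and the choice of `baseenv()` as the parent are assumptions made for this example; the actual setup in this PR may differ.

```r
# Hypothetical stand-in for inst/resources/renv.R
src <- tempfile(fileext = ".R")
writeLines(c(
  "helper <- function() 'bundled helper'",
  "dependencies <- function(path = '.') helper()"
), src)

# Host environment for the bundled code (parent chosen for illustration only)
renv <- new.env(parent = baseenv())
sys.source(src, envir = renv)

# A function of the same name defined elsewhere is unaffected
dependencies <- function(...) "packrat's own"

renv$dependencies()  # resolves helper() inside the hosted environment
dependencies()       # still the outer definition
```

Functions sourced this way keep the hosted environment as their enclosure, so they find each other without touching names defined outside it.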

TODO

aronatkins commented 2 years ago

There are calls to renv_zzz_run() and renv_license_generate() from within the bundled renv source.

Both of these calls appear to be code that we want to run during renv package-building, but not in this situation.

Those may be the only ones.

grep -E '^\w+\(\)$' inst/resources/renv.R
#> renv_license_generate()
#> renv_zzz_run()

The call to renv_zzz_run() fails because the R/bootstrap file does not exist within Packrat, which prevents the package from being built. Commenting out the call to renv_zzz_bootstrap() lets both the source and binary package builds succeed; the remainder of renv_zzz_run() was allowed to run.
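The grep above only matches bare zero-argument calls that start a line. A parse-based check is one possible complement; the sketch below lists every top-level expression that is not an assignment. The sample file contents and the `is_assignment` helper are made up for illustration and are not part of the PR.

```r
# Hypothetical sample standing in for inst/resources/renv.R
src <- tempfile(fileext = ".R")
writeLines(c(
  "f <- function() 1",
  "g = function() 2",
  "renv_zzz_run()",
  "if (FALSE) f()"
), src)

exprs <- parse(src, keep.source = FALSE)

# TRUE for top-level `<-`, `=`, and `<<-` assignments
is_assignment <- function(e) {
  is.call(e) && as.character(e[[1]])[1] %in% c("<-", "=", "<<-")
}

# Everything else runs when the file is sourced and may deserve review
top_level <- Filter(Negate(is_assignment), as.list(exprs))
found <- vapply(top_level, function(e) paste(deparse(e), collapse = " "), character(1))
print(found)
```

Unlike the grep, this also reports top-level code that takes arguments or is wrapped in control flow, at the cost of reporting any assignment made through other means (for example `assign()`).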

aronatkins commented 2 years ago

All top-level executable code for review:

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L1060-L1065

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L3435-L3444

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L11306

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L14920-L14926

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L21254-L21261

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L23555

https://github.com/rstudio/packrat/blob/62e05659a229cc8934e9ea638747e62ee2813dbd/inst/resources/renv.R#L25783

aronatkins commented 2 years ago

Top-level code found with:

grep -E '^\w+' inst/resources/renv.R | grep -v function

aronatkins commented 2 years ago

Dependency calls appear to be hitting the following environment variables and options:

call to Sys.getenv: RENV_CONFIG_FILEBACKED_CACHE
call to Sys.getenv: RENV_PROFILE
call to Sys.getenv: RENV_PROJECT
call to Sys.getenv: R_CMD
call to getOption: encoding
call to getOption: error
call to getOption: renv.config.filebacked.cache
call to getOption: renv.tests.running
call to getOption: warn

This list may be incomplete. It was compiled by running the following against the Shiny example applications, the Shiny gallery applications, and a variety of test apps that I have locally.

# code temporarily added to packrat/R/renv.R

renv$getOption <- function(x, default = NULL) {
  cat("call to getOption: ", x, "\n", sep = "")
  return(base::getOption(x, default))
}

renv$options <- function(...) {
  dots <- list(...)
  cat("call to options: ", paste0(names(dots), collapse = ", "), "\n", sep = "")
  return(do.call(base::options, dots))
}

renv$Sys.getenv <- function(x = NULL, unset = "", names = NA) {
  cat("call to Sys.getenv: ", x, "\n", sep = "")
  return(base::Sys.getenv(x, unset, names))
}

renv$Sys.setenv <- function(...) {
  dots <- list(...)
  cat("call to Sys.setenv: ", paste0(names(dots), collapse = ", "), "\n", sep = "")
  return(do.call(base::Sys.setenv, dots))
}

# scans for dependencies in each subdirectory of the working directory.
# we care about the "call to" output triggered by the code above.
dirs <- normalizePath(Filter(dir.exists, list.files(".", include.dirs = TRUE, no.. = TRUE)))

# something about options(error = call) was causing problems when the incoming 
# value was NULL; this seemed to resolve it.
errh <- function(...){}
options(error = as.call(list(errh)))

old <- options("packrat.dependency.discovery.renv" = TRUE)
for (dir in dirs) {
  cat("analyzing: ", dir, "\n", sep = "")
  packrat:::dirDependencies(dir)
}
options(old)

kevinushey commented 2 years ago

Going through all of these:

call to Sys.getenv: RENV_CONFIG_FILEBACKED_CACHE

By default, renv caches the results of dependency discovery based on file mtime, to ensure multiple calls to renv::dependencies() can be fast if files have not changed. This environment variable controls that; see https://rstudio.github.io/renv/reference/config.html for more details.
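To make the idea concrete, here is a minimal sketch of an mtime-keyed cache. It is not renv's implementation; `cached_scan`, `scan_fun`, and the in-memory `cache` environment are hypothetical names used only for this example.

```r
cache <- new.env(parent = emptyenv())

# Re-run scan_fun only when the file's modification time has changed
cached_scan <- function(path, scan_fun) {
  mtime <- file.mtime(path)
  entry <- cache[[path]]
  if (!is.null(entry) && identical(entry$mtime, mtime))
    return(entry$value)
  value <- scan_fun(path)
  cache[[path]] <- list(mtime = mtime, value = value)
  value
}

path <- tempfile(fileext = ".R")
writeLines("library(shiny)", path)

calls <- 0
scan_fun <- function(p) {
  calls <<- calls + 1  # count how often the file is really read
  readLines(p)
}

first  <- cached_scan(path, scan_fun)
second <- cached_scan(path, scan_fun)  # served from the cache
calls
```

The second call returns the stored value without reading the file again, which is the behaviour that matters when dependency discovery is run repeatedly over unchanged files.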

call to Sys.getenv: RENV_PROFILE

The RENV_PROJECT environment variable controls the active project; we may want to set this to the application root directory for the duration of the call.
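Setting the variable only for the duration of a call could look roughly like the sketch below. `with_envvar` is a hypothetical helper written for this example, and `tempdir()` stands in for the application root directory.

```r
# Set an environment variable while `code` is evaluated, then restore it
with_envvar <- function(name, value, code) {
  old <- Sys.getenv(name, unset = NA)
  on.exit({
    if (is.na(old)) Sys.unsetenv(name)
    else do.call(Sys.setenv, structure(list(old), names = name))
  }, add = TRUE)
  do.call(Sys.setenv, structure(list(value), names = name))
  force(code)  # `code` is lazy, so it is evaluated only after the variable is set
}

app_dir <- tempdir()  # placeholder for the application root
seen <- with_envvar("RENV_PROJECT", app_dir, Sys.getenv("RENV_PROJECT"))
seen
```

Because the restore happens in `on.exit()`, the previous value (or the unset state) comes back even if the wrapped call fails.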

call to Sys.getenv: R_CMD

This appears to be used during renv's license auto-generation; I can probably disable this for embedded use in Packrat.

call to getOption: encoding

This can be ignored.

call to getOption: error

I believe this can be ignored as well.

call to getOption: renv.config.filebacked.cache

See above RENV_CONFIG_FILEBACKED_CACHE.

call to getOption: renv.tests.running

This is used by renv when tests are running, mainly to make unit testing of some code pathways easier.

call to getOption: warn

I believe this can be ignored.


I'll take a look at the top-level executable code and see what might be worth changing.

aronatkins commented 2 years ago

I tried running the tests with renv dependency scanning enabled. We have failures because the directories listed in packrat::get_opt("ignored.directories") are not known to the renv code. Is there a way to add in-memory ignore rules in addition to a .renvignore?