Closed philipp-baumann closed 5 years ago
Now I just restarted R and reinstalled with renv::install("rstudio/renv@0.7.0-111"), but "xtable" is still not in the renv.lock. Sorry, I forgot to post this file in my last comment, but here it is.
library(renv)
#>
#> Attaching package: 'renv'
#> The following object is masked from 'package:stats':
#>
#> update
#> The following objects are masked from 'package:utils':
#>
#> history, upgrade
#> The following objects are masked from 'package:base':
#>
#> load, remove
isolate()
purge("xtable")
#> * The requested package is not installed in the cache -- nothing to do.
install("xtable")
#> Retrieving 'https://cloud.r-project.org/src/contrib/xtable_1.8-4.tar.gz' ...
#> OK [file is up to date]
#> Installing xtable [1.8-4] from CRAN ...
#> OK (built from source)
.libPaths()
#> [1] "/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv/library/R-3.6/x86_64-pc-linux-gnu"
#> [2] "/tmp/RtmpGYUwmf/renv-system-library"
#> [3] "/usr/lib/R/library"
Created on 2019-10-06 by the reprex package (v0.3.0)
renv shouldn't use the cache for this project (this command failed in reprex::reprex()):
> renv::settings$use.cache()
[1] FALSE
> renv::snapshot()
* The lockfile is already up to date.
However, using reprex::reprex(), I get:
library(renv)
#>
#> Attaching package: 'renv'
#> The following object is masked from 'package:stats':
#>
#> update
#> The following objects are masked from 'package:utils':
#>
#> history, upgrade
#> The following objects are masked from 'package:base':
#>
#> load, remove
snapshot()
#> The following required packages are not installed:
#>
#> reshape [required by GGally]
#> reshape2 [required by broom, caret, Cubist, and 2 others]
#> rio [required by car]
#> robCompositions [required by mvoutlier]
#> robustbase [required by cvTools, fpc, mvoutlier]
#> rprojroot [required by here]
#> scales [required by cowplot, ggplot2]
#> sgeostat [required by mvoutlier]
#> sp [required by maptools]
#> SparseM [required by quantreg]
#> SQUAREM [required by lava]
#> testthat [required by hyperSpec]
#> tibble [required by broom, cellranger, dbplyr, and 10 others]
#> tidyr [required by broom, modelr, recipes, and 2 others]
#> tidyselect [required by dbplyr, dplyr, recipes, and 2 others]
#> tidyverse [required by simplerspec]
#> timeDate [required by recipes]
#> vctrs [required by hms, pillar]
#> viridisLite [required by ggplot2]
#> XML [required by hyperSpec]
#> zoo [required by lmtest]
#>
#> Consider re-installing these packages before snapshotting the lockfile.
#> Error in snapshot(): aborting snapshot due to pre-flight validation failure
Created on 2019-10-06 by the reprex package (v0.3.0)
Am I missing anything?
Because some of the packages are shared via cache and not in the project library, they are not listed in renv.lock. As a consequence, they were not installed when building the Docker image.
I don't quite follow -- these concepts are somewhat independent. Packages enter the lockfile if they are (1) installed in the project library, and (2) used somewhere in the project. Packages installed in the library may either be 'real' package installs, or may be symlinks back into the renv package cache. So whether a package is used from the cache or not should not affect whether a package enters the lockfile.
The error you reported looks like a bug in renv, though -- the isolation code was assuming that we'd always be able to find a package in the cache to copy back to the library, but that may not always be true. I've pushed a candidate fix to master.
My best guess: the reprex() example is failing because the library paths differ between your "regular" R session and the reprex() session.
Note that here:
.libPaths()
#> [1] "/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv/library/R-3.6/x86_64-pc-linux-gnu"
#> [2] "/tmp/RtmpfJMfUR/renv-system-library"
#> [3] "/usr/lib/R/library"
I would not expect to see /usr/lib/R/library on the library paths. Could that be related?
I don't quite follow -- these concepts are somewhat independent. Packages enter the lockfile if they are (1) installed in the project library, and (2) used somewhere in the project. Packages installed in the library may either be 'real' package installs, or may be symlinks back into the renv package cache. So whether a package is used from the cache or not should not affect whether a package enters the lockfile.
Thanks for the clarification. Sorry for my wrong statement. I somehow accidentally confused things because renv was not behaving as expected, although key design principles (1) and (2) can be deduced from the renv introduction. Maybe an explicit "symlink" mention could be incorporated into this starting resource?
I just upgraded to renv v0.7.0-129 using
> renv::upgrade(version = "0.7.0-129")
A new version of the renv package will be installed:
[0.6.0-61] -> [0.7.0-129]
This project will use the newly-installed version of renv.
I'm not sure if this bug was actually related to the problem described here, but it's nice that you fixed it anyway! I checked again and "xtable" is located in the renv project library, but still not in renv.lock after another round of
> renv::install("xtable")
Retrieving 'https://cloud.r-project.org/src/contrib/xtable_1.8-4.tar.gz' ...
OK [file is up to date]
Installing xtable [1.8-4] ...
OK (built from source)
> renv::snapshot()
* The lockfile is already up to date.
Also, xtable is used in the R scripts in the project directory used for drake::code_to_plan(). This is the updated gist for renv.lock.
My best guess: the reprex() example is failing because the library paths differ between your "regular" R session and the reprex() session.
Note that here:
.libPaths()
#> [1] "/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv/library/R-3.6/x86_64-pc-linux-gnu"
#> [2] "/tmp/RtmpfJMfUR/renv-system-library"
#> [3] "/usr/lib/R/library"
I would not expect to see /usr/lib/R/library on the library paths. Could that be related?
Correct, in the project directory I get:
> .libPaths()
[1] "/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv/library/R-3.6/x86_64-pc-linux-gnu"
[2] "/tmp/RtmpQwcIHo/renv-system-library"
Is it really the intended behaviour of reprex::reprex() in an renv project context to show different .libPaths() compared to a "regular" R session? If yes, it's a bit confusing because that would compromise the core idea of reprex reproducibility.
BTW, isolate() is missing from the renv pkgdown reference. It would be a nice addition there because the function is exported.
Thanks a lot for your help!
In essence, packages will enter the lockfile if they're reported as part of the packages in:
renv::dependencies()
I suspect that some packages are being used in your project in a way that renv fails to discover. Can you share your project sources, so I can see exactly how xtable is declared / used in your project? We might need to add support for how packages might be referenced or used in drake pipelines.
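As a rough illustration of the check described above, one could compare the packages a project is known to need against what renv::dependencies() reports. This is only a sketch: missing_from_discovery() is a hypothetical helper (not part of renv), and the only thing assumed about renv::dependencies() is that it returns a data frame with a Package column.

```r
# Packages the project is known to need (illustrative subset).
expected <- c("here", "drake", "xtable")

# Hypothetical helper: which expected packages did dependency discovery miss?
# `discovered` defaults to what renv finds in the current project; the default
# is only evaluated when no value is supplied, so renv is optional here.
missing_from_discovery <- function(expected,
                                   discovered = renv::dependencies()$Package) {
  setdiff(expected, unique(discovered))
}

# Inside an renv project (requires renv, scans the working directory):
# missing_from_discovery(expected)
```

Anything returned by this helper is installed or needed but invisible to discovery, which would be consistent with it never entering renv.lock.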
The behavior with reprex is likely a bug -- most likely, I will have to write a PR to reprex to see if renv environments can be explicitly supported.
Ah I see, as you say, that might indeed be related to the way I load packages; here is the _setup-run-all.R that loads packages and functions prior to planning the drake workflow and building the targets:
## Load packages
pkgs <- c("here", "drake", "tidyverse", "data.table",
  "simplerspec", "caret", "Cubist", "rsample", "nls.multstart",
  "broom", # modeling
  "future", "future.apply", "doParallel", "doFuture", # asynchronous computation
  "gghighlight", "grid", "gridExtra", "cowplot", # graphics
  "xtable") # tables
purrr::walk(pkgs, library, character.only = TRUE)
Also, as an example of a script of the workflow, see 70_collect-swr-params.R. xtable is loaded in l. 534, for example.
Let me know if you want to see the rest of the project (cannot share the data publicly as I don't own, but could invite you to the private repo).
That would explain it! Unfortunately, renv's dependency discovery system is not nearly smart enough to understand this.
If you rewrite your package usages with another form, e.g. plain old
library(here)
library(drake)
< ... >
then renv will be able to pick it up.
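If rewriting the loader by hand is inconvenient, one possible workaround is to generate the plain library() calls from the existing character vector and keep them in a separate file that is never sourced. This is a sketch only: the file name _dependencies.R is an arbitrary choice, the package vector is shortened, and it assumes renv's discovery scans every R file in the project directory.

```r
# Package vector as in the setup script (shortened for illustration).
pkgs <- c("here", "drake", "xtable")

# One plain `library(<pkg>)` line per package, the form discovery understands.
dep_lines <- sprintf("library(%s)", pkgs)

# Written to a temporary directory here; in practice this would be the
# project root. "_dependencies.R" is an assumed, arbitrary file name.
dep_file <- file.path(tempdir(), "_dependencies.R")
writeLines(dep_lines, dep_file)

readLines(dep_file)
```

The existing purrr::walk() loader can then stay as it is, since the generated file only exists to make the dependencies visible to static analysis.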
The behavior with reprex is likely a bug -- most likely, I will have to write a PR to reprex to see if renv environments can be explicitly supported.
If I reprex::reprex() this code:
getwd()
.libPaths()
inside an renv-using project, I see:
getwd()
#> [1] "/private/var/folders/yx/3p5dt4jj1019st0x90vhm9rr0000gn/T/RtmpVGI5y1/reprexecfb5cc375e3"
.libPaths()
#> [1] "/Users/jenny/rrr/stat545/renv/library/R-3.6/x86_64-apple-darwin15.6.0"
#> [2] "/private/var/folders/yx/3p5dt4jj1019st0x90vhm9rr0000gn/T/RtmpVGI5y1/renv-system-library"
#> [3] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"
Created on 2019-10-07 by the reprex package (v0.3.0)
Which seems correct.
@philipp-baumann Can you say more about what you're seeing?
That would explain it! Unfortunately, renv's dependency discovery system is not nearly smart enough to understand this.
If you rewrite your package usages with another form, e.g. plain old
library(here)
library(drake)
< ... >
then renv will be able to pick it up.
Sure, that makes sense. Thanks! This now works :-) (maybe nice for a future version; I'd have to dig in more into the code base of renv to be able to contribute with code; maybe some time in future)
> renv::snapshot()
The following package(s) will be updated in the lockfile:
# CRAN ===============================
- xtable [* -> 1.8-4]
Do you want to proceed? [y/N]: y
* Lockfile written to '/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv.lock'.
The behavior with reprex is likely a bug -- most likely, I will have to write a PR to reprex to see if renv environments can be explicitly supported.
If I reprex::reprex() this code:
getwd()
.libPaths()
inside an renv-using project, I see:
getwd()
#> [1] "/private/var/folders/yx/3p5dt4jj1019st0x90vhm9rr0000gn/T/RtmpVGI5y1/reprexecfb5cc375e3"
.libPaths()
#> [1] "/Users/jenny/rrr/stat545/renv/library/R-3.6/x86_64-apple-darwin15.6.0"
#> [2] "/private/var/folders/yx/3p5dt4jj1019st0x90vhm9rr0000gn/T/RtmpVGI5y1/renv-system-library"
#> [3] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"
Created on 2019-10-07 by the reprex package (v0.3.0)
Which seems correct.
@philipp-baumann Can you say more about what you're seeing?
Thanks for jumping in @jennybc
Here is what I see:
getwd()
#> [1] "/tmp/Rtmphm0Ct4/reprexeed9548c51"
.libPaths()
#> [1] "/media/ssd/nas-ethz/doktorat/projects/01_spectroscopy/52_swr-spc/renv/library/R-3.6/x86_64-pc-linux-gnu"
#> [2] "/tmp/Rtmphm0Ct4/renv-system-library"
#> [3] "/usr/lib/R/library"
Created on 2019-10-07 by the reprex package (v0.3.0)
The point is that the default system library (.Library) should not be showing up on the library paths. That is:
#> [3] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"
should not be there when the renv sandbox is activated. Note that this magic is done by renv when the session is launched through the .Rprofile, so if reprex is launching a child process with e.g. --vanilla, that would probably explain why this is happening.
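A quick way to tell the two situations apart is to test whether R's real system library is among the library paths. The helper below is a hypothetical sketch in base R (system_library_visible() is not an renv or reprex function), and it assumes that R.home("library") still points at the real system library even when renv has sandboxed .Library.

```r
# Hypothetical helper: is R's real system library on the given library paths?
system_library_visible <- function(paths = .libPaths(),
                                   system_lib = R.home("library")) {
  # Normalize both sides so symlinks and trailing slashes do not matter.
  norm <- function(p) normalizePath(p, winslash = "/", mustWork = FALSE)
  norm(system_lib) %in% norm(paths)
}

# Check the current session.
system_library_visible()
```

Under these assumptions, the call would be expected to return FALSE in a session where the renv sandbox is active, and TRUE in a child process that bypassed the project .Rprofile.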
It looks like reprex launches its child processes with callr::r_safe():
https://github.com/tidyverse/reprex/blob/f888a72a90f39c0dcb9564b6d94fee094b6fd342/R/reprex.R#L403
which intentionally does not load the .Rprofile. So, I believe this is ultimately just renv doing something that reprex did not / could not anticipate, since so much of the renv startup magic happens in the project .Rprofile.
But the project's .Rprofile does seem to have been consulted? Otherwise, I don't understand how/why both @philipp-baumann and I have the first 2 lib paths that we have. I don't know why we both have the system library in the 3rd position 🤔
The docs for callr::r_safe() (which is now just an alias for callr::r(), but that wasn't true when I first started using it) talk about the system and user .Rprofile. But they're pretty silent about a project-level .Rprofile.
I believe this is because callr::r_safe() does pass along the current library paths, e.g.
> .libPaths(c(tempdir(), .libPaths()))
> callr::r_safe(function() { print(.libPaths()) })
[1] "/private/var/folders/b4/2422hswx71z8mgwtv4rhxchr0000gn/T/RtmpP8GKVV" "/Users/kevinushey/Library/R/3.6/library"
[3] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"
But renv does some black magic to mutate .Library and .Library.site (sandboxing them so user packages installed in these libraries are not visible to renv projects), and that magic does not propagate by default.
Perhaps more to the point:
> owd <- setwd(tempdir())
> writeLines("x <- 42", ".Rprofile")
> callr::r_safe(function() { print(x) })
Error: callr subprocess failed: object 'x' not found
> callr::r_copycat(function() { print(x) })
[1] 42
I can imagine how to create something worthy of the name reprex_renv(), meaning reprex this code HERE in this renv-using project. But I'm not sure if the demand is high enough to justify it? In any case, if that seems worth contemplating, we could track it in an issue on reprex.
I think the underlying issue here is now understood + resolved (renv's dependency discovery machinery failing to understand the way packages were loaded in this project).
Also worth stating:
Also, as an example of a script of the workflow 70_collect-swr-params.R . xtable is loaded in l. 534 for example.
In that example, the function xtable() is used, but the package itself is not referenced or loaded. In other words, renv doesn't really know (just from static analysis) that xtable is a function that is being provided by the xtable package. You could also qualify the usage; e.g.
xtable::xtable(...)
and in this case renv would detect that usage.
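To make the static-analysis limitation concrete, here is a deliberately naive scanner in base R. It is an illustration only and not renv's actual implementation: find_static_deps() is a hypothetical function that walks the parsed code and records library()/require() calls with a literal package name, plus pkg::fun references.

```r
# Naive static dependency scan (illustration only, NOT how renv is implemented).
find_static_deps <- function(code) {
  exprs <- parse(text = code, keep.source = FALSE)
  found <- character()

  visit <- function(node) {
    if (!is.call(node)) return(invisible())
    fn <- node[[1]]
    if (is.symbol(fn)) {
      name <- as.character(fn)
      # pkg::fun or pkg:::fun: the package is the first operand.
      if (name %in% c("::", ":::")) {
        found <<- c(found, as.character(node[[2]]))
        return(invisible())
      }
      # library(pkg) / require(pkg) with a literal name; calls that use
      # character.only are skipped because the name is only known at run time.
      if (name %in% c("library", "require") && length(node) >= 2 &&
          !("character.only" %in% names(node))) {
        arg <- node[[2]]
        if (is.symbol(arg) || (is.character(arg) && length(arg) == 1)) {
          found <<- c(found, as.character(arg))
        }
      }
    }
    # Recurse into nested calls, including the function position.
    for (i in seq_along(node)) {
      if (is.call(node[[i]])) visit(node[[i]])
    }
    invisible()
  }

  for (i in seq_along(exprs)) visit(exprs[[i]])
  sort(unique(found))
}

# Loading from a character vector: only the purrr reference is visible.
find_static_deps('pkgs <- c("here", "xtable")
purrr::walk(pkgs, library, character.only = TRUE)')

# Explicit forms: both packages are visible.
find_static_deps('library(here)
xtable::xtable(head(mtcars))')
```

A bare call such as xtable(x) contributes nothing in this sketch, because nothing in the syntax ties the function name to a package; the same reasoning applies to any purely static approach.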
Great, thanks @kevinushey and @jennybc for digging into it and giving detailed infos!
Dear Kevin, I'm a big fan of renv and I'm using it in combination with drake and Docker to ensure reproducibility for my scientific projects and foster collaboration. I'm at the moment experimenting with the Docker configuration option 1 you nicely describe in Using renv with Docker. Because some of the packages are shared via cache and not in the project library, they are not listed in renv.lock. As a consequence, they were not installed when building the Docker image.
For this purpose I tried using renv::isolate(), introduced in 4ca213faf29eabc1a38fc24f8f6d51f9a5d7ce27, to move these packages to the project library. I executed this on my local machine to prepare the new renv.lock for Docker.
Unfortunately, it failed and I couldn't figure out why it cannot move renv:
Created on 2019-10-06 by the reprex package (v0.3.0)
I'm using renv v0.7.0-111, which I also installed in the global library of my local machine. I have the following in my .Rprofile:
The Dockerfile is in this gist.
Thanks a lot for the great work done here and some hints to resolve the issue. Best, Philipp