rstudio / packrat

Packrat is a dependency management system for R
http://rstudio.github.io/packrat/

Managing multiple R instances? #342

Open ajmazurie opened 8 years ago

ajmazurie commented 8 years ago

Hi, I'm aware this is probably out of scope for this project but it is close enough I thought it would be the perfect place to ask the question. Is there a way for Packrat to manage not just packages but R distributions as well, in a way similar to Pyenv for Python?

jckane commented 8 years ago

I am also interested in the answer to this question. Packrat is a really neat tool for solving the horrible R package dependency problem. However, the version of base R used is also crucial for ensuring reproducibility. What recommendations do people have for managing the base R version with Packrat?

colearendt commented 7 years ago

If I understand correctly (probably an unsafe assumption), packrat stores packages in a library keyed by platform and by the installed version of R (e.g. ./packrat/lib/x86_64-pc-linux-gnu/3.4.1/). Further, that library comes first in .libPaths().
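As a rough illustration of that layout, the sketch below builds the library path packrat would be expected to use for a given platform and R version. The helper name `packrat_lib_dir` is hypothetical, and the `packrat/lib/<platform>/<R version>` layout is assumed from the example path above, not taken from packrat's documentation.

```shell
# Hypothetical helper: print the private library path for a project,
# assuming the packrat/lib/<platform>/<R version> layout described above.
packrat_lib_dir() {
    project_dir=$1
    platform=$2
    r_version=$3
    # Strip one trailing slash from the project directory before joining.
    printf '%s/packrat/lib/%s/%s\n' "${project_dir%/}" "$platform" "$r_version"
}

# Example: packrat_lib_dir . x86_64-pc-linux-gnu 3.4.1
```

Inside R, the two components should correspond to `R.version$platform` and `getRversion()`, so a different R version would resolve to a different directory.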

So the real trouble is ensuring that the correct version of R is executing, since that determines which package library you get. Some systems (e.g. Windows) install multiple versions of R side by side by default (in C:\Program Files\R\R-3.4.1, for instance), whereas on others (e.g. Linux) you have to work a bit harder, since an upgrade by default writes into the same directory (more info on how to do that in the link below).

If you can ensure that the R being executed is the version you expect, you should get consistent package versions. That can be as easy as storing a shell script in the local directory that points at a specific R or Rscript binary, or some method of ensuring that an RStudio RProject will choose the correct version (see this article for further reading and feature requests here and here).
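A minimal sketch of such a project-local script, assuming a POSIX shell. The variable `PINNED_RSCRIPT`, the function `run_pinned_rscript`, and the default path are all illustrative assumptions; the path would need to point at wherever the desired R version is actually installed.

```shell
#!/bin/sh
# Hypothetical project-local wrapper: always run Rscript from one pinned install.
# PINNED_RSCRIPT is an assumed variable; the default path is illustrative only.
PINNED_RSCRIPT=${PINNED_RSCRIPT:-/opt/R/3.4.1/bin/Rscript}

run_pinned_rscript() {
    # Fail loudly instead of silently falling back to whatever R is on PATH.
    if [ ! -x "$PINNED_RSCRIPT" ]; then
        echo "pinned Rscript not found: $PINNED_RSCRIPT" >&2
        return 127
    fi
    "$PINNED_RSCRIPT" "$@"
}

# Example: run_pinned_rscript analysis.R
```

Refusing to run when the pinned binary is missing is a deliberate choice here: falling back to the system R would quietly switch to a different package library.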

For instance, if I always execute C:\Program Files\R\R-3.4.1\bin\Rscript.exe in my local directory, I should always get a consistent execution environment when using packrat. I ultimately see this as a work-around, but as best I can tell, R does not have the same Pythonic "look for a binary locally first" behavior that allows Pyenv to easily handle Python versions. As a result, this might be an R feature request as much as a packrat one.

The latter would be the more elegant solution, and then the R version (or the path to the binary, or a copy of the binary) could perhaps be stored in the packrat.lock file, much like the Python equivalent. For those with more scripting knowledge than I have, I think there should be a way to devise a robust solution to this common issue.
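One way the lockfile idea could be sketched, assuming the lockfile carries a line such as `RVersion: 3.4.1`. Whether a given packrat version writes that field should be checked against an actual lockfile. The helper names `lockfile_r_version` and `check_r_version` are hypothetical.

```shell
# Hypothetical helper: print the value of the RVersion field from a lockfile.
# Assumes a line of the form "RVersion: 3.4.1"; prints nothing if it is absent.
lockfile_r_version() {
    sed -n 's/^RVersion:[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical check: succeed only when the recorded and running versions match.
# The running version is passed in so the check can be exercised without R.
check_r_version() {
    expected=$(lockfile_r_version "$1")
    actual=$2
    if [ -z "$expected" ]; then
        echo "no RVersion field in $1" >&2
        return 2
    fi
    if [ "$expected" != "$actual" ]; then
        echo "lockfile expects R $expected, but R $actual is running" >&2
        return 1
    fi
}
```

With R on the PATH, a call might look like `check_r_version packrat/packrat.lock "$(Rscript -e 'cat(as.character(getRversion()))')"`, run before any analysis script starts.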

NOTE - I still need to test my approach above. Thus far, I have mostly been in a planning phase.