Closed lmguzman closed 8 years ago
Hmmm - I see gapminder has been bumped to 0.2.0, but does that mean the 0.1.0 link dies? Weird. Anyway, no reason not to update to 0.2.0 for now, but maybe look into a more stable solution - suggestions welcome.
How about
R -e "install.packages('gapminder', repos = 'http://cran.us.r-project.org')"
?
Yeah, that works!
I am not sure what the best solution for installing R packages with a dockerfile is.
Essentially if you use:
RUN wget https://cran.r-project.org/src/contrib/gapminder_0.2.0.tar.gz
RUN R CMD INSTALL gapminder_0.2.0.tar.gz
you can specify versions, but you are HOOPED if the package has dependencies! They will not be installed with this method. If you use the other method shown above:
R -e "install.packages('gapminder', repos = 'http://cran.us.r-project.org')"
you cannot pick the version; it will just automatically grab the most recent one. This really goes against my philosophy for using Docker... But if there is not a better solution, I suggest we go with the R -e "install.packages('gapminder', repos = 'http://cran.us.r-project.org')" option, as at least dependencies are installed that way. @cboettig do you know a better solution?
You can install from the mran snapshots of cran instead, eg: set repo to "https://mran.revolutionanalytics.com/snapshot/2015-10-07" to install from that date.
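A minimal sketch of building a date-pinned install command from the snapshot URL pattern quoted above. The helper names snapshot_repo and install_cmd are hypothetical, and the sketch only prints the command rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: build the snapshot repo URL for a date (YYYY-MM-DD),
# following the URL pattern quoted in the comment above.
snapshot_repo() {
    printf 'https://mran.revolutionanalytics.com/snapshot/%s' "$1"
}

# Hypothetical helper: print (not run) the R command that installs a package
# from that snapshot, so the package and its dependencies share one date.
install_cmd() {
    pkg=$1
    snap_date=$2
    printf 'R -e "install.packages('\''%s'\'', repos = '\''%s'\'')"\n' \
        "$pkg" "$(snapshot_repo "$snap_date")"
}

install_cmd gapminder 2015-10-07
```

Keeping the date in one place makes it easier to install every package in a Dockerfile from the same snapshot.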
Also take a look at the 'checkpoint' and 'packrat' packages to lock versions of your library in a more fine-grained way (rather than pinning to a particular date), more like gemfile.lock in Ruby.
'packrat' is already installed on the hadleyverse image, so this would be a good option. I also like the snapshot solution. I don't know which is easier / more valuable for students.
As I just got pulled in here:
RUN wget https://cran.r-project.org/src/contrib/gapminder_0.2.0.tar.gz
RUN R CMD INSTALL gapminder_0.2.0.tar.gz
That is wrong on two or more counts: a) you want to combine RUN statements and b) we have install.r to actually fetch a package by name (and current version) from CRAN (see below) and c) you generally do NOT want to hardwire a version number as R resolves that for you.
Quick demo:
$ install.r gapminder
trying URL 'https://cran.rstudio.com/src/contrib/gapminder_0.2.0.tar.gz'
Content type 'application/x-gzip' length 243216 bytes (237 KB)
==================================================
downloaded 237 KB
* installing *source* package ‘gapminder’ ...
** package ‘gapminder’ successfully unpacked and MD5 sums checked
** R
** data
*** moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (gapminder)
The downloaded source packages are in
‘/tmp/downloaded_packages’
$
That idiom is used all over the Rocker project Dockerfiles. We do not aim to snapshot particular dates, versions, vintages, releases, but @gmbecker can advertise his snapshotting solution. As @cboettig mentions above, there are others too.
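The install.r idiom from the demo above can be sketched as a single RUN line. This is only an illustration: the helper name dockerfile_for is hypothetical, the r-base base image is an assumption, and the function prints a Dockerfile instead of building one:

```shell
#!/bin/sh
# Hypothetical helper: print a Dockerfile that installs the named CRAN
# packages in one RUN layer via install.r (as in the demo above).
# The base image r-base is an assumption for illustration.
dockerfile_for() {
    printf 'FROM r-base\n'
    # "$*" joins all package names with spaces into one install.r call.
    printf 'RUN install.r %s\n' "$*"
}

dockerfile_for gapminder
```

Passing several package names produces one install.r call, which keeps the installs in a single layer rather than one RUN statement per package.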
@cboettig I do like the idea of installing from the mran snapshots of cran for a dockerfile. That would cover both version specificity, as well as package dependencies. Great suggestion. @BillMills @HeidiSeibold what do you think? Should we go with that for installing R packages for the dockerfile lesson?
@eddelbuettel I agree that pinning versions is trouble for package management - but it's exactly what I do want when making a docker container - otherwise the same Dockerfile can generate two different containers depending on when you run it which means the Dockerfile is no longer a good document of what's in the container.
The ideal solution IMO resolves dependencies correctly, remains stable in time, and produces a clear document of what is in the container. @ttimbers @cboettig, your date-pinned solution seems to do the first two, but how do you recommend documenting the actual package versions that end up getting installed by this method? There's probably a convenient way to do this that R superheroes such as yourselves know about - I would want that included in the lesson, since I care way way more about what your version numbers are than what mran was doing on this date in history.
The official 'R on Docker' container (ie rocker/r-base as well as just r-base) pins as well.
There is a time and place for it. It just so happens that it is not the default use for Carl or myself. Our aim is not frozen-in-time configs. If you want to freeze a setup, keep the container.
This may sound flippant but doing otherwise engages a steep uphill battle against both R's distribution model (CRAN == always current) and the Linux distros (ditto).
@BillMills If you just want a list of the specific versions of everything that is installed, see the R function installed.packages() (from utils).
The checkpoint and packrat packages try to provide a more portable way to share this information, e.g. if you want collaborators in a non-dockerized environment to just replicate your package suite.
@cboettig ok, I'm sold - checkpoint/mran + installed.packages() is a good solution for reproducibility and clear documentation. So, just to be excruciatingly pedantic, the recommendation is:
RUN R -e "install.packages('gapminder', repos = 'https://mran.revolutionanalytics.com/snapshot/2015-10-07')"
or whatever date, in the Dockerfile, with the understanding that installed.packages() inside the container will prevent the need for too much dep spelunking. Looks good to me!
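To document what a date-pinned install actually produced, the installed.packages() call mentioned above can be run from the shell inside the container. A minimal sketch, assuming Rscript is on the PATH; the function name list_r_versions is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print one "package version" pair per line for
# everything in the current R library, using installed.packages().
list_r_versions() {
    Rscript -e 'ip <- installed.packages(); cat(sprintf("%s %s\n", ip[, "Package"], ip[, "Version"]), sep = "")'
}

# Only attempt the listing where R is available (e.g. inside the container).
if command -v Rscript >/dev/null 2>&1; then
    list_r_versions
else
    echo "Rscript not found; run this inside the container" >&2
fi
```

Redirecting the output to a file kept next to the Dockerfile gives a plain-text record of the versions that a given snapshot date resolved to.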
@BillMills sounds good to me.
Though for any given R script / Rmd file it is usually more practical to recommend that users report the output of a call to sessionInfo() at the end of their script, rather than installed.packages().
As you may know, sessionInfo() will not only name the versions of packages that were actually loaded in that analysis, but also list other relevant information for debugging, such as platform architecture and locale info.
I think we've settled on RUN R -e "install.packages... per 06 and #28.
I want to use the gapminder package, but it keeps showing me "there is no package called 'gapminder'". Please, what is the way out?
There is: https://cloud.r-project.org/web/packages/gapminder/index.html
But it has Depends: R (≥ 3.1.0). Is your R older than that?
Yes, I am using 3.4.4.
Well:
R> R.version.string
[1] "R version 3.4.4 (2018-03-15)"
R> install.packages("gapminder")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/gapminder_0.3.0.tar.gz'
Content type 'application/x-gzip' length 2110951 bytes (2.0 MB)
==================================================
downloaded 2.0 MB
* installing *source* package ‘gapminder’ ...
** package ‘gapminder’ successfully unpacked and MD5 sums checked
** R
** data
*** moving datasets to lazyload DB
** inst
** preparing package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** testing if installed package can be loaded
* DONE (gapminder)
The downloaded source packages are in
‘/tmp/RtmpvGUPAg/downloaded_packages’
R>
Thanks for the tips.
On the dockerfile lesson, the gapminder link https://cran.r-project.org/src/contrib/gapminder_0.1.0.tar.gz is not working for me :(. It gives me a "not found" error.