Closed drorata closed 4 years ago
Currently the versioned stack installs R from source. Installing binaries will install a different version of R with its own site library and packages; see https://github.com/rocker-org/rocker-versioned#notes.
(We are about to have a new release that will be compatible with either binary or source installs).
And yes, installing directly from CRAN on Linux means compiling from source, and this does take longer. However, since you only need to build your derivative image once (or even just let Docker Hub auto-build it for you), the compile time is rarely a major obstacle.
(minor note: you can use the shorthand:
RUN install2.r --repos https://cloud.r-project.org caret randomForest botor
in your Dockerfiles)
Thanks for the ideas; I'll check them. I'm afraid I disagree with you about the build time: when developing the image I find myself building over and over again...
No worries, we rebuild images a ton on our end as well during image development! I find strategic use of docker build caching to be pretty helpful for that.
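A rough sketch of the caching idea, not the Rocker team's own Dockerfile: Docker reuses a cached layer as long as that instruction and everything before it are unchanged, so slow, stable installs can go first and the packages still being experimented with last. The base tag and the way the packages are split across layers are assumptions made for illustration; rocker/tidyverse is used because the same installs are reported to succeed on it later in this thread.

```dockerfile
FROM rocker/tidyverse:3.6.3

# Slow, stable layer: compiled once, then served from the build cache
# for as long as this line and the FROM line above stay unchanged.
# --error makes install2.r halt instead of only warning on a failed install.
RUN install2.r --error --repos https://cloud.r-project.org caret randomForest

# Layer still being iterated on: editing this line only re-runs this step.
RUN install2.r --error --repos https://cloud.r-project.org botor
```

With this layout, changing only the last line should leave the caret and randomForest layer cached, so a rebuild pays only for the step that changed.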
You might try out our dev stack, which will become the new rstudio / versioned stack anyway, starting with R 4.0.0. E.g. try:
FROM rockerdev/rstudio:3.6.3
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
r-cran-caret \
r-cran-randomforest
bug reports welcome!
@cboettig Using the snippet you posted as a Dockerfile didn't work; here's what I got:
Sending build context to Docker daemon 5.632kB
Step 1/2 : FROM rockerdev/rstudio:3.6.3
3.6.3: Pulling from rockerdev/rstudio
5bed26d33875: Pull complete
f11b29a9c730: Pull complete
930bda195c84: Pull complete
78bf9a5ad49e: Pull complete
9092798a58a4: Pull complete
ca7288dc9e3d: Pull complete
908b0871d834: Pull complete
d90213265bfc: Pull complete
836a3efa052b: Pull complete
d354db935183: Pull complete
e9fb4a9db355: Pull complete
Digest: sha256:46252a09e49b9bc52ffaef5c1324c91c5d3c652ba4b47985a1e3b493c1ace845
Status: Downloaded newer image for rockerdev/rstudio:3.6.3
---> 8719902a01cd
Step 2/2 : RUN apt-get update && apt-get install -y --no-install-recommends r-cran-caret r-cran-randomforest
---> Running in 3b644f5d5b83
Hit:1 http://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease
Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:3 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Err:1 http://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease
At least one invalid signature was encountered.
Get:5 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Err:2 http://archive.ubuntu.com/ubuntu bionic InRelease
At least one invalid signature was encountered.
Err:3 http://security.ubuntu.com/ubuntu bionic-security InRelease
At least one invalid signature was encountered.
Err:4 http://archive.ubuntu.com/ubuntu bionic-updates InRelease
At least one invalid signature was encountered.
Err:5 http://archive.ubuntu.com/ubuntu bionic-backports InRelease
At least one invalid signature was encountered.
Fetched 252 kB in 1s (437 kB/s)
Reading package lists...
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://archive.ubuntu.com/ubuntu bionic InRelease: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://security.ubuntu.com/ubuntu bionic-security InRelease: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://archive.ubuntu.com/ubuntu bionic-updates InRelease: At least one invalid signature was encountered.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://archive.ubuntu.com/ubuntu bionic-backports InRelease: At least one invalid signature was encountered.
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic/InRelease At least one invalid signature was encountered.
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-updates/InRelease At least one invalid signature was encountered.
W: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/bionic-backports/InRelease At least one invalid signature was encountered.
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/bionic-security/InRelease At least one invalid signature was encountered.
W: Failed to fetch http://cloud.r-project.org/bin/linux/ubuntu/bionic-cran35/InRelease At least one invalid signature was encountered.
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
r-base-core : Breaks: r-cran-caret (< 6.0-84-2~) but 6.0-78+dfsg1-1 is to be installed
r-cran-caret : Depends: r-api-3.4
Depends: r-cran-lattice (>= 0.20) but it is not going to be installed
Depends: r-cran-ggplot2 but it is not going to be installed
Depends: r-cran-foreach but it is not going to be installed
Depends: r-cran-plyr but it is not going to be installed
Depends: r-cran-modelmetrics (>= 1.1.0) but it is not going to be installed
Depends: r-cran-nlme but it is not going to be installed
Depends: r-cran-reshape2 but it is not going to be installed
Depends: r-cran-recipes (>= 0.0.1) but it is not going to be installed
Depends: r-cran-withr (>= 2.0.0) but it is not going to be installed
r-cran-randomforest : Depends: r-api-3.4
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.
The command '/bin/sh -c apt-get update && apt-get install -y --no-install-recommends r-cran-caret r-cran-randomforest' returned a non-zero code: 100
After 20 minutes of building, this minimal Dockerfile:
FROM rocker/rstudio:3.6.2
RUN Rscript -e 'install.packages(c("botor", "randomForest", "caret"), repos="https://cloud.r-project.org")'
was unable to install some dependencies :(
Warning messages:
1: In install.packages(c("botor", "randomForest", "caret"), repos = "https://cloud.r-project.org") :
installation of package ‘data.table’ had non-zero exit status
2: In install.packages(c("botor", "randomForest", "caret"), repos = "https://cloud.r-project.org") :
installation of package ‘ModelMetrics’ had non-zero exit status
3: In install.packages(c("botor", "randomForest", "caret"), repos = "https://cloud.r-project.org") :
installation of package ‘caret’ had non-zero exit status
Plus, it is not great that the build is reported as successful even though the installation inside it failed.
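For context on why the build passes: install.packages() only emits a warning when a package fails to install, so Rscript still exits with status 0 and docker build carries on. One possible way to make the build fail loudly is to check the result afterwards and call stop(). This is a sketch; the explicit check is an addition for illustration and not part of the original Dockerfile.

```dockerfile
FROM rocker/rstudio:3.6.2

# install.packages() only warns on failure, so verify the result explicitly;
# stop() makes Rscript exit non-zero, which aborts the docker build.
RUN Rscript -e 'pkgs <- c("botor", "randomForest", "caret")' \
            -e 'install.packages(pkgs, repos = "https://cloud.r-project.org")' \
            -e 'missing <- setdiff(pkgs, rownames(installed.packages()))' \
            -e 'if (length(missing) > 0) stop("not installed: ", paste(missing, collapse = ", "))'
```

On this base image the build would then be expected to stop at this step with the names of the packages that did not install, rather than finishing as if nothing had gone wrong. The install2.r helper mentioned earlier in the thread has an --error option that serves a similar purpose.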
(I am just throwing this out here: if you want prebuilt binaries for ease of install, you can start from another container, likely get an easy installation of caret et al., and then add RStudio to it. See my initial blogpost (and video) here as well as a few follow-up posts on my blog. They demonstrate installation of the (whole) Tidyverse and RStan, respectively, in a single command with no surprises. In short, we can offer 'plug and play' but from a different starting point. The 'versioned' images are different, and we document why/how. If builds (from source) then fail, you need to look into why and not just quote the highest-level error (which is uninformative) back at us.)
I tried that route as well (starting from plain Debian and from miniconda), and in both cases I eventually failed to install RStudio... Either way, I get one but not the other.
I tried to build the following:
FROM rocker/rstudio:3.6.1
RUN Rscript -e 'install.packages(c("caret"), repos="https://cloud.r-project.org")'
and the most relevant error I can see is: ERROR: dependency ‘ModelMetrics’ is not available for package ‘caret’. The log is very long and I cannot scroll back that far.
@drorata I would recommend that you slow down a little. Nobody ever suggested conda packages would work with Rocker containers.
And randomly slapping commands together doesn't work either. Caret, as we all know, is a META package pulling in dozens of dependencies. You may need to check those individually. So fire up that RStudio 3.6.1 container by hand, get a shell or R prompt, and try debugging why ModelMetrics may not install. I have written multiple blog posts on this over the years, but I don't have time to repeat all this now; so maybe glance at the posts below this link.
We have a helpful (and very low-volume) mailing list r-sig-debian focussed on R on .deb based systems. You could ask there. The Rocker team does this work as volunteers, on top of lots of other activities. So please do not use the issue forum for more general R help.
@drorata Apologies for the untested binary install; I had forgotten that the binaries in bionic were built before the R 3.5.0 release and are not compatible with more recent R versions. You could add Michael Rutter's PPAs if you want that to work:
apt install -y software-properties-common
add-apt-repository --enable-source --yes "ppa:marutter/c2d4u3.5"
add-apt-repository --enable-source --yes "ppa:marutter/rrutter3.5"
I'm a bit hesitant to build those into the versioned stack since I believe the PPA is periodically updated to match CRAN versions, while the versioned stack tries to promise repeatable builds.
You may be missing zlib libs for data.table or some other system dependency. Like Dirk says, it'll be listed deeper in your error logs. However, you'll also find the images higher up the versioned stack have more of this stuff built in, for instance:
docker run --rm -ti rocker/tidyverse Rscript -e 'install.packages(c("botor", "randomForest", "caret"), repos="https://cloud.r-project.org")'
takes about a minute and a half and succeeds :-)
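If missing zlib headers do turn out to be the cause (an assumption here; only the full build log can confirm it), a sketch of the fix on a Debian-based image such as rocker/rstudio could look like the following. zlib1g-dev is the Debian/Ubuntu package that ships the zlib headers; other system libraries may be needed as well.

```dockerfile
FROM rocker/rstudio:3.6.2

# Assumption: data.table fails to compile because the zlib headers are absent.
# zlib1g-dev is the Debian/Ubuntu package that provides them.
RUN apt-get update \
 && apt-get install -y --no-install-recommends zlib1g-dev \
 && rm -rf /var/lib/apt/lists/*

# Same install command as in the failing Dockerfile above.
RUN Rscript -e 'install.packages(c("botor", "randomForest", "caret"), repos = "https://cloud.r-project.org")'
```

Keeping the system libraries in their own layer means they stay cached while the R package list changes.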
@eddelbuettel I am sorry if my tone suggested high expectations of the Rocker team; that's not the case, and trust me, I fully understand what "on top" means.
After spending a few hours (mostly waiting for installs to complete) I became a little frustrated. Even the line suggested by @cboettig took 4 minutes for me on a MacBook with a 2.2 GHz Quad-Core Intel Core i7. I am hardly more than an R newbie, so I didn't know, for instance, that caret is a meta package, and I don't even know exactly what that means.
Lastly, I tried to use the PPAs proposed by @cboettig using the following Dockerfile:
FROM rockerdev/rstudio:3.6.3
RUN apt-get update \
&& apt install -y software-properties-common \
&& add-apt-repository --enable-source --yes "ppa:marutter/c2d4u3.5" \
&& add-apt-repository --enable-source --yes "ppa:marutter/rrutter3.5" \
&& apt-get install -y --no-install-recommends \
r-cran-caret \
r-cran-randomforest
RUN Rscript -e 'install.packages("botor", repos="https://cloud.r-project.org")'
# RUN Rscript -e 'reticulate::install_miniconda()'
This build was indeed successful, but when trying to use something from botor, I was prompted to install miniconda... It feels like a catch-22.
@drorata Well, botor depends on reticulate, which unless told otherwise defaults to miniconda. A default design I can understand but don't agree with, and (it so happens) both @cboettig and I let them know.
In short, you are asking for the entire candy store, and you are getting something designed that way by its author. And you are shooting the messenger again. (As for S3, I happen to be going the other way via their C++ API in another project, but that is a different story.)
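One possible way to "tell reticulate otherwise" is to point it at a system Python through the RETICULATE_PYTHON environment variable. The sketch below is untested and rests on several assumptions: that the distribution's python3-boto3 package is recent enough for botor, that python3-dev brings in the shared Python library reticulate loads, and that the pinned base tag suits your needs.

```dockerfile
FROM rocker/tidyverse:3.6.3

# Assumption: a system Python with boto3 from the distribution's packages
# is enough for botor; python3-boto3 is the Debian/Ubuntu package name.
RUN apt-get update \
 && apt-get install -y --no-install-recommends python3 python3-dev python3-boto3 \
 && rm -rf /var/lib/apt/lists/*

# Point reticulate at that interpreter for every R session by writing
# the variable into the site-wide Renviron file under R's home directory.
RUN echo "RETICULATE_PYTHON=/usr/bin/python3" >> "$(R RHOME)/etc/Renviron.site"

RUN Rscript -e 'install.packages("botor", repos = "https://cloud.r-project.org")'
```

Writing the variable to Renviron.site rather than using a Dockerfile ENV line is a deliberate choice here, on the assumption that R sessions started by RStudio Server may not inherit the container's environment.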
I'm really sorry if anyone got the feeling I'm shooting at someone. The only reason for that that I can imagine is my lack of understanding of how the R ecosystem works :(
I admire your energy and uninhibited appetite. It's good. Don't let me stop you.
It's just that ... sometimes we need to decompose problems. And, if I may, shooting at the messenger does not help. We at Rocker have nothing to do with botor or other packages you may like; if their installation has a large footprint (hello, caret) and fails, then I find it helps me to wrestle problems and bugs down to size. In doing so, you may well find a bug or suboptimal setting on our side, and if so, by all means report it here. In the meantime, I hope you enjoy Rocker and its containers. They do help me in various settings.
I won't let @eddelbuettel stop me 😎
Probably I misunderstood Rocker's agenda. For me, when I see a docker image, it is (almost by definition) a starting point for further customization. In this case, I had difficulty customizing the image because of the terribly long build time of the customization, not to mention the failures. I was under the impression that caret is a rather central package and was surprised it is not "supported" by the image I picked from Rocker. If this understanding is wrong or doesn't align with Rocker's agenda, then this whole thread is misplaced. I apologize for the sense of a shooting range; please believe me, it was not my intention in any way. Lastly, I hope I'll be able to follow up on this thread sometime soon with a working (as per my needs) Dockerfile. Cheers!
You continue to gloss over context, and make unwarranted assumptions. So no, not every container is meant first and foremost as a dev platform and springboard. Read our R Journal paper on that.
If I start e.g. from rocker/r-ubuntu:18.04, a container explicitly designed to take advantage of the 4000+ r-cran-* binaries curated by Michael Rutter for Ubuntu LTS releases, then apt update followed by apt install r-cran-caret works in one shot and requires no compilation because I chose a path with binaries:
root@72f2a076d3aa:/work# apt install r-cran-caret
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
r-cran-assertthat r-cran-backports r-cran-bh r-cran-callr r-cran-cli
r-cran-colorspace r-cran-crayon r-cran-data.table r-cran-desc
r-cran-digest r-cran-dplyr r-cran-ellipsis r-cran-evaluate
r-cran-fansi r-cran-farver r-cran-foreach r-cran-gdtools
r-cran-generics r-cran-ggplot2 r-cran-glue r-cran-gower
r-cran-gtable r-cran-ipred r-cran-isoband r-cran-iterators
r-cran-labeling r-cran-lava r-cran-lifecycle r-cran-lubridate
r-cran-magrittr r-cran-modelmetrics r-cran-munsell r-cran-numderiv
r-cran-pillar r-cran-pkgbuild r-cran-pkgconfig r-cran-pkgload
r-cran-plogr r-cran-plyr r-cran-praise r-cran-prettyunits
r-cran-proc r-cran-processx r-cran-prodlim r-cran-ps r-cran-purrr
r-cran-r6 r-cran-rcolorbrewer r-cran-rcpp r-cran-recipes
r-cran-reshape2 r-cran-rlang r-cran-rprojroot r-cran-rstudioapi
r-cran-scales r-cran-squarem r-cran-stringi r-cran-stringr
r-cran-svglite r-cran-systemfonts r-cran-testthat r-cran-tibble
r-cran-tidyr r-cran-tidyselect r-cran-timedate r-cran-utf8
r-cran-vctrs r-cran-viridislite r-cran-withr
Suggested packages:
r-cran-knitr
The following NEW packages will be installed:
r-cran-assertthat r-cran-backports r-cran-bh r-cran-callr
r-cran-caret r-cran-cli r-cran-colorspace r-cran-crayon
r-cran-data.table r-cran-desc r-cran-digest r-cran-dplyr
r-cran-ellipsis r-cran-evaluate r-cran-fansi r-cran-farver
r-cran-foreach r-cran-gdtools r-cran-generics r-cran-ggplot2
r-cran-glue r-cran-gower r-cran-gtable r-cran-ipred r-cran-isoband
r-cran-iterators r-cran-labeling r-cran-lava r-cran-lifecycle
r-cran-lubridate r-cran-magrittr r-cran-modelmetrics r-cran-munsell
r-cran-numderiv r-cran-pillar r-cran-pkgbuild r-cran-pkgconfig
r-cran-pkgload r-cran-plogr r-cran-plyr r-cran-praise
r-cran-prettyunits r-cran-proc r-cran-processx r-cran-prodlim
r-cran-ps r-cran-purrr r-cran-r6 r-cran-rcolorbrewer r-cran-rcpp
r-cran-recipes r-cran-reshape2 r-cran-rlang r-cran-rprojroot
r-cran-rstudioapi r-cran-scales r-cran-squarem r-cran-stringi
r-cran-stringr r-cran-svglite r-cran-systemfonts r-cran-testthat
r-cran-tibble r-cran-tidyr r-cran-tidyselect r-cran-timedate
r-cran-utf8 r-cran-vctrs r-cran-viridislite r-cran-withr
0 upgraded, 70 newly installed, 0 to remove and 91 not upgraded.
Need to get 49.6 MB of archives.
After this operation, 189 MB of additional disk space will be used.
Do you want to continue? [Y/n]
You can then add the rstudio deb 'by hand' in a Dockerfile.
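As a Dockerfile, the binary route described above might look roughly like this. It is a sketch under the assumption that r-cran-randomforest is available from the same repositories as r-cran-caret; the RStudio Server step is left as a placeholder because no download URL or version is given in this thread.

```dockerfile
FROM rocker/r-ubuntu:18.04

# Binary r-cran-* packages: no compilation needed.
# Note that the Debian/Ubuntu package names are all lowercase.
RUN apt-get update \
 && apt-get install -y --no-install-recommends r-cran-caret r-cran-randomforest \
 && rm -rf /var/lib/apt/lists/*

# RStudio Server would be added here from its .deb package;
# the download location and version are intentionally not filled in.
```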
Can @eddelbuettel link to the mentioned paper?
I'll try to use rocker/r-ubuntu:18.04 as my starting point. Thanks for the pointer.
@drorata As I mentioned above, you could also have used rocker/tidyverse as a starting point:
docker run --rm -ti rocker/tidyverse Rscript -e 'install.packages(c("botor", "randomForest", "caret"), repos="https://cloud.r-project.org")'
instead of rocker/rstudio. This takes only a minute or so to build and will have RStudio installed if you need that. If you want binary installs on Ubuntu, Dirk's suggestion is best. The paper Dirk mentioned discusses the pros and cons of both of these approaches: https://journal.r-project.org/archive/2017/RJ-2017-065/index.html
I'm trying to add some packages to an image built on top of rocker/rstudio:
This seems to work nicely except that it takes forever! So I tried this:
However, with this approach botor is installed, but the other two are not found when I try to load them. I guess I'm missing something about the way R loads its packages.