rocker-org / rocker

R configurations for Docker
https://rocker-project.org
GNU General Public License v2.0

Docker image r-base:4.2.1 not reproducible #540

Closed JackCaster closed 6 months ago

JackCaster commented 6 months ago

I am building a Docker image based on the old r-base:4.2.1, but when running

RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"

I get the error:

 => => # /usr/bin/R: line 193: /bin/sed: No such file or directory
 => => # ERROR: option '-e' requires a non-empty argument

The problem is that I used r-base:4.2.1 in the past without this issue. I have one container running that I built in the past that works just fine.

Now, the issue can be solved by adding a symlink:

RUN ln -s /usr/bin/sed /bin/sed

but this is not something I had to do in the past, which suggests that the Docker image is not reproducible (I thought Docker images were!). Could it be that r-base did not pin its own base Debian image?
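A more defensive variant of this workaround, as a minimal sketch: create the link only when /bin/sed is actually absent, so the step is harmless on images where the file still exists. The helper name ensure_sed_link and its optional root-directory argument are hypothetical, added only so the logic can be tried against a scratch directory. The sketch also uses a relative link target, which is an assumption and not what the original RUN line does.

```shell
#!/bin/sh
# Hypothetical helper: make sure <root>/bin/sed exists, linking it to
# <root>/usr/bin/sed when it is missing. With no argument, root is "/".
ensure_sed_link() {
    root="${1:-}"
    if [ -e "$root/bin/sed" ] || [ -L "$root/bin/sed" ]; then
        # Something is already there (file or link): leave it alone.
        echo "present"
    elif [ -x "$root/usr/bin/sed" ]; then
        mkdir -p "$root/bin"
        # Relative target so the link also resolves inside a scratch root.
        ln -s ../usr/bin/sed "$root/bin/sed"
        echo "linked"
    else
        # No sed in either location: nothing sensible to link to.
        echo "missing"
        return 1
    fi
}
```

In a Dockerfile the same idea reduces to a single guarded step such as `RUN [ -e /bin/sed ] || ln -s /usr/bin/sed /bin/sed`.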

eddelbuettel commented 6 months ago

Could it be that r-base did not pin its own base Debian image?

These are not pinned. If you want versioned Docker archives, look at rocker-versioned2.

R itself works off 'head', and CRAN is an always-rolling, always-current repo that expects and works with the current version of R. If you want to build a container off rocker/r-base, it is normal to use 'latest'. That works (using just r-base here):

edd@rob:~$ docker run --rm -ti r-base R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"

R version 4.3.2 (2023-10-31) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://cloud.r-project.org/src/contrib/remotes_2.4.2.1.tar.gz'
Content type 'application/x-gzip' length 152560 bytes (148 KB)
==================================================
downloaded 148 KB

* installing *source* package ‘remotes’ ...
** package ‘remotes’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (remotes)

The downloaded source packages are in
        ‘/tmp/RtmpSzr1KS/downloaded_packages’
> 
> 
edd@rob:~$ 

Moreover, just calling install.r remotes is what we do in other containers and is easier, and the cloud CDN is already set.

edd@rob:~$ docker run --rm -ti r-base install.r remotes
trying URL 'https://cloud.r-project.org/src/contrib/remotes_2.4.2.1.tar.gz'
Content type 'application/x-gzip' length 152560 bytes (148 KB)
==================================================
downloaded 148 KB

* installing *source* package ‘remotes’ ...
** package ‘remotes’ successfully unpacked and MD5 sums checked
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (remotes)

The downloaded source packages are in
        ‘/tmp/downloaded_packages’
edd@rob:~$

JackCaster commented 6 months ago

Argh, I just realized that I was installing sed again during some apt installations. That may have overwritten the one already installed. Thank you for the additional info though!

eddelbuettel commented 6 months ago

That /bin <-> /usr/bin transition has bitten me too and can create havoc here.

My personal $0.02 is that the whole notion of pretending 'today' is '1 1/2 years ago', and working as if 4.2.1 were current, is very fraught because the world around it does not stand still. Rocker's versioned2 stack also freezes the CRAN repo (via date snapshots at p3m, before that MRAN), so that may help ...
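To illustrate the date-snapshot idea, here is a rough sketch that builds a dated repository URL and passes it to install.packages. The helper name snapshot_repo, the base URL https://p3m.dev/cran, the `<base>/<YYYY-MM-DD>` layout, and the date in the example are all assumptions for illustration. The snapshot that the versioned images actually use should be checked in rocker-versioned2 itself.

```shell
#!/bin/sh
# Hypothetical helper: print a dated CRAN snapshot URL.
# Assumed layout: <base>/<YYYY-MM-DD>; the base defaults to p3m.dev.
snapshot_repo() {
    date="$1"
    base="${2:-https://p3m.dev/cran}"
    # Reject anything that is not shaped like YYYY-MM-DD.
    case "$date" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "expected YYYY-MM-DD, got: $date" >&2; return 1 ;;
    esac
    printf '%s/%s\n' "$base" "$date"
}

# Example with an illustrative date (an assumption, not taken from the thread):
# R -e "install.packages('remotes', repos = c(CRAN = '$(snapshot_repo 2022-10-28)'))"
```

Freezing the repository this way only pins the R packages. System packages installed through apt still move with the underlying Debian image.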

eddelbuettel commented 6 months ago

And FYI, you can prevent upgrades of packages under apt and dpkg via "hold" commands. That is sometimes useful when a 'bad' or 'new' library accidentally slips in, though Debian does a very good job with library transitions, preventing that sort of thing.
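A minimal sketch of such a hold, assuming a Debian-based image with apt-mark available and root privileges. The wrapper name hold_package and its DRY_RUN switch are hypothetical conveniences; only `apt-mark hold` and `apt-mark showhold` are the real commands.

```shell
#!/bin/sh
# Hypothetical wrapper: mark a package as held so that later apt upgrades
# do not replace the installed version.
hold_package() {
    pkg="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print what would be run; useful for checking a build script.
        echo "apt-mark hold $pkg"
        return 0
    fi
    apt-mark hold "$pkg"
}

# Typical use in an image build (requires root):
#   hold_package sed
#   apt-mark showhold    # lists the packages currently on hold
```

With sed on hold, an upgrade should leave the installed version alone, and a later step that tries to change it explicitly should fail visibly instead of silently moving the binary.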