rocker-org / rocker

R configurations for Docker
https://rocker-project.org
GNU General Public License v2.0

piggy-back on RSPM's system dependency data base #404

Closed maxheld83 closed 3 years ago

maxheld83 commented 4 years ago

For every supported distribution (say, bionic), RSPM provides a list of (in this case) apt-get install commands to run, such as:

# rkafka requirements:
apt-get install -y default-jdk
R CMD javareconf

# rriskDistributions requirements:
apt-get install -y tcl
apt-get install -y tk
apt-get install -y tk-table

... (it's a pretty long list).

This list is based on the open sourced https://github.com/rstudio/r-system-requirements.

The proprietary RSPM CLI also has a command to filter out the install commands necessary for any given set of packages (via the DESCRIPTION's SystemRequirements field). I am guessing that it shouldn't be too hard to reimplement this query using the open-sourced .json files in https://github.com/rstudio/r-system-requirements. (I was considering putting this into a small, thin helper package, as an alternative to Gabor's http://github.com/r-hub/sysreqs).
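A rough sketch of what such a query could look like, written in Python for brevity. The rule entries below are made up for illustration, and the field names (`patterns`, `dependencies`, `packages`, `constraints`) are an assumption about how the .json files in r-system-requirements are laid out, so they should be checked against the repository. `packages_for` is a hypothetical helper, not part of RSPM or any existing package.

```python
import re

# Illustrative rules only. The field names are an ASSUMPTION about the
# layout of the .json files in rstudio/r-system-requirements.
RULES = {
    "libxml2": {
        "patterns": [r"\blibxml2\b"],
        "dependencies": [
            {"packages": ["libxml2-dev"],
             "constraints": [{"os": "linux", "distribution": "ubuntu"}]},
            {"packages": ["libxml2-devel"],
             "constraints": [{"os": "linux", "distribution": "centos"}]},
        ],
    },
    "tcl": {
        "patterns": [r"\btcl\b"],
        "dependencies": [
            {"packages": ["tcl"],
             "constraints": [{"os": "linux", "distribution": "ubuntu"}]},
        ],
    },
    "tk": {
        "patterns": [r"\btk\b"],
        "dependencies": [
            {"packages": ["tk"],
             "constraints": [{"os": "linux", "distribution": "ubuntu"}]},
        ],
    },
}


def packages_for(system_requirements, distribution, rules=RULES):
    """Return system packages whose rule matches a SystemRequirements string.

    Hypothetical helper: matches each rule's regex patterns against the free
    text of a DESCRIPTION's SystemRequirements field, then keeps the packages
    whose constraints name the requested distribution.
    """
    found = []
    for rule in rules.values():
        matched = any(
            re.search(pattern, system_requirements, re.IGNORECASE)
            for pattern in rule["patterns"]
        )
        if not matched:
            continue
        for dep in rule["dependencies"]:
            applies = any(
                c.get("distribution") == distribution
                for c in dep["constraints"]
            )
            if not applies:
                continue
            for pkg in dep["packages"]:
                if pkg not in found:  # keep first-seen order, no duplicates
                    found.append(pkg)
    return found


print(packages_for("libxml2 (>= 2.6.3)", "ubuntu"))
print(packages_for("Tcl/Tk", "ubuntu"))
```

A real version would load the rules from the repository's .json files instead of an inline dict, and would likely also need to handle version constraints and any pre- or post-install steps, which this sketch ignores.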

I'm not sure what the philosophy for system dependencies on rocker images is, but was wondering:

System dependencies are frustrating enough, and it's been driving me a little insane the last couple of years that there were (at least) three overlapping approaches to this problem between https://github.com/rstudio/r-system-requirements, http://github.com/r-hub/sysreqs and https://github.com/rstudio/shinyapps-package-dependencies.

eddelbuettel commented 4 years ago

That needs thinking. OTOH apt-get install r-cran-rjava is still better as it is a) in the distro and hence tested and b) automagically pulls in the depends.

But the sysreqs db is good, and we had meant to use it in the ongoing cran2deb and other endeavours. It just means ... someone needs to rewrite a lot of logic and implement it.

I am not sure yet the pain / gain ratio works for Rocker. But we can probably "easily enough" do some lookups.

maxheld83 commented 4 years ago

> It just means ... someone needs to rewrite a lot of logic and implement it.

Do you mean the logic to parse/wrangle/subset the *.jsons (~ reimplement rspm list requirements), or something else?

(Sorry, I updated my comment above for clarity.)

eddelbuettel commented 4 years ago

I meant "everything". It fundamentally changes how these scripts work because it changes things at the core.

Projects like c2d4u (and related) have their own dbs for that; you generally can't just go into a fully fleshed-out system (that has grown organically over a decade, albeit slowly) and rip one piece out, just as it is hard to change a plane engine mid-flight...

That said, this is a useful backend (and e.g. Michael, Gabor, Don and I mentioned it as something to use years ago in the grant application to the ISC when we got funding to build this, which was later withdrawn, just as one earlier application got "lost"). But all that is for another time...

So in short we probably want to create a lookup tool to complement/replace our db. We could then try to inject that lookup. (And the current sequence of individual apt-get calls is poor, but then again they also have a solution here that is cross-distro which is actually novel and good.)
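To illustrate the point about the sequence of individual apt-get calls, here is a minimal sketch (in Python) of folding such a listing into a single install command. `collapse_apt_installs` is a hypothetical helper, and running the remaining steps (such as R CMD javareconf) after the combined install is an assumption about what ordering is safe.

```python
def collapse_apt_installs(script):
    """Fold per-package 'apt-get install -y' lines into one command.

    Hypothetical helper: comments and blank lines are dropped, duplicate
    packages are removed, and any other command is kept and ASSUMED to be
    safe to run after the combined install.
    """
    prefix = "apt-get install -y "
    packages = []
    other = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(prefix):
            for pkg in line[len(prefix):].split():
                if pkg not in packages:
                    packages.append(pkg)
        else:
            other.append(line)
    commands = []
    if packages:
        commands.append(prefix + " ".join(packages))
    commands.extend(other)
    return commands


listing = """
# rkafka requirements:
apt-get install -y default-jdk
R CMD javareconf

# rriskDistributions requirements:
apt-get install -y tcl
apt-get install -y tk
apt-get install -y tk-table
"""

for command in collapse_apt_installs(listing):
    print(command)
# apt-get install -y default-jdk tcl tk tk-table
# R CMD javareconf
```

A single call lets apt resolve all dependencies at once and, in a Dockerfile, keeps the installs in one layer.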

cboettig commented 4 years ago

I believe the CRAN team maintains a debian meta package which depends on all (debian system requirements) packages that are installed on the CRAN check servers, http://statmath.wu.ac.at/AASC/debian/. I think installing that metapackage will pull in all system dependencies for all packages on CRAN, which is the approach taken in https://hub.docker.com/r/cran/debian

eddelbuettel commented 4 years ago

I am not sure how alive AASC is. I just had emails with KH this week about him bemoaning the outdated pandoc binary in Debian. I brought up AASC in passing, and I was not left with the impression it was super active. I could of course be wrong... @jeroen may know better.

cboettig commented 4 years ago

Good call, wondered about that.

Also, I believe both https://github.com/o2r-project/containerit and https://github.com/karthik/holepunch already use the r-hub sysreqs approach to automatically add system dependencies to Rocker images as well (@karthik and @nuest can comment more).

eddelbuettel commented 4 years ago

GitHub Actions uses it too.

But I fear we are getting sidetracked. What, if any, is the Rocker question? Besides everybody wishing for a pony and a magically rebuilt system?

eddelbuettel commented 4 years ago

@maxheld83 Timely. I put a first container together, not too poetically named rocker/r-rspm, with a first tag 18.04. You can pull it. It is plain Ubuntu 18.04, plus 'our' R package, plus the two settings for RSPM.

It faithfully and quickly installs packages from binaries. It does not install required system dependencies:

edd@rob:~$ docker run --rm -ti rocker/r-rspm:18.04 bash
root@515294f794a2:/# Rscript -e 'install.packages("xml2"); library(xml2)'
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.rstudio.com/all/__linux__/bionic/latest/src/contrib/xml2_1.3.2.tar.gz'
Content type 'binary/octet-stream' length 521938 bytes (509 KB)
==================================================
downloaded 509 KB

* installing *binary* package ‘xml2’ ...
* DONE (xml2)

The downloaded source packages are in
        ‘/tmp/RtmpQGkD1Q/downloaded_packages’
Error: package or namespace load failed for ‘xml2’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/usr/local/lib/R/site-library/xml2/libs/xml2.so':
  libxml2.so.2: cannot open shared object file: No such file or directory
Execution halted
root@515294f794a2:/# 

So yes, let's hack that tool :)

cboettig commented 3 years ago

:broom: