ministryofjustice / analytics-platform

Parent repository for the MOJ Analytics Platform
MIT License

Improve speed of packrat installs #3

Closed RobinL closed 4 years ago

RobinL commented 7 years ago

When cloning a project with a large number of dependencies, the platform can freeze up for quite some time whilst everything is compiled.

RobinL commented 7 years ago

I've been looking at various resources to try and understand how to improve this:

https://github.com/rstudio/packrat/issues/212

https://github.com/RobinL/cheatsheets_etc/blob/master/packrat_efficiency.md

https://github.com/rstudio/packrat/blob/master/R/cache.R

https://groups.google.com/forum/#!forum/packrat-discuss

The best solution I've come up with for R Shiny Docker images is something like this:

# Set up a separate folder, create a packrat project there,
# and install a set of default packages. Installing these common packages
# here avoids reinstalling them every time packrat.lock changes.
WORKDIR /usr/my_project

# "libraries" contains one or more R files with library(dplyr)-type statements
ADD libraries .

RUN R -e "install.packages('packrat'); packrat::init(options = list(use.cache = TRUE))"

# Switch to the app directory and restore the project's own packrat library
# with the cache enabled
WORKDIR /srv/shiny-server

ADD packrat packrat

RUN R -e "packrat::set_opts(use.cache = TRUE); packrat::restore()"

xoen commented 7 years ago

Linux CRAN: https://cran.r-project.org/bin/linux/debian/

There are i386 and amd64 binaries for jessie and wheezy.

rocker/shiny is based on r-base, which is based on debian:testing.

Debian testing codename should be buster: https://wiki.debian.org/DebianReleases#Current_Releases.2FRepositories
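
As a rough illustration of what using precompiled binaries could look like, here is a minimal Dockerfile sketch. It assumes the rocker/r-base image and that the listed r-cran-* packages are available from the image's configured Debian repositories; the package selection is purely an example, not something taken from this project.

```dockerfile
# Sketch only: the base image and package names are assumptions for illustration
FROM rocker/r-base

# Install precompiled Debian builds of R packages instead of compiling from source
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        r-cran-dplyr \
        r-cran-ggplot2 \
    && rm -rf /var/lib/apt/lists/*
```

Packages installed this way land in the system R library. packrat keeps a private library per project, so whether a packrat project would pick them up depends on how packrat is configured.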

RobinL commented 7 years ago

There's a chap called Dirk who's been maintaining precompiled Debian packages since 2001.

I found a Stack Overflow question he had answered back in 2012 and asked a follow-up question, which he responded to:

Yes, you can. There used to be a service for that (google "cran2deb"), and we are trying to rebuild one. It won't be ready "soon" though. But you can very much proxy it locally. – Dirk Eddelbuettel 4 mins ago

This confirms that we can set up a local CRAN proxy. This is almost certainly the best/recommended approach (Dirk seems to be the expert here), so it looks like we need to investigate how cran2deb works.
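
For illustration, pointing an image at a locally hosted CRAN might look roughly like the sketch below. The CRAN_PROXY build argument and its URL are hypothetical placeholders, and the base image is an assumption; nothing here reflects how cran2deb itself works.

```dockerfile
# Sketch only: the base image is an assumption
FROM rocker/r-base

# Hypothetical URL; replace with wherever the local CRAN is actually served
ARG CRAN_PROXY=http://cran-proxy.example.internal

# Install from the local CRAN rather than a public mirror
RUN R -e "install.packages('packrat', repos = '${CRAN_PROXY}')"
```

The URL can be overridden at build time with `docker build --build-arg CRAN_PROXY=...`. Note that packrat::restore() takes its repositories from the lockfile, so a packrat project would likely need the same URL recorded there as well.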

(Note that Debian is the base for rocker/shiny, specifically debian:testing.) Also note this line from the rocker/r-base Dockerfile:

MAINTAINER "Carl Boettiger and Dirk Eddelbuettel" rocker-maintainers@eddelbuettel.com

What a dude!

r4vi commented 5 years ago

A potential fix for this is: https://github.com/ministryofjustice/analytics-platform-cran-proxy

davidread commented 4 years ago

We changed from packrat to conda, so this is no longer an issue.