rstudio / renv

renv: Project environments for R.
https://rstudio.github.io/renv/
MIT License

Separate RENV Library Between Host and Docker -- Empty Docker RENV Library? #1602

Closed: emstruong closed this issue 1 year ago

emstruong commented 1 year ago

Hello,

I was looking through the renv documentation and other guides, and I've hit a hurdle that I'm having trouble getting through.

I've put my Dockerfile here and attached my renv.lock.

The context is that I have an RStudio project locally with its own renv library that I'd like to transition to a Docker image. However, I'd still like the local renv library to be accessible, and I want it to be separate from the Docker image's renv library to ensure computational reproducibility.

However, although my current Dockerfile seems to install the packages, it installs them into /usr/local/lib/R/site-library and /usr/local/lib/R/library instead of ${PROJDIRECTORY}/renv/library, and I can't figure out why. I would greatly appreciate any help.

# Source by Peter Solymos: https://hosting.analythium.io/best-practices-for-r-with-docker/
# Source by Nathaniel Haines: http://haines-lab.com/post/2022-01-23-automating-computational-reproducibility-with-r-using-renv-docker-and-github-actions/
# Source by Bill Mills: https://github.com/BillMills/Rocker-tutorial
# Source by Posit: https://solutions.posit.co/envs-pkgs/environments/docker/
# Source by renv vignette: https://rstudio.github.io/renv/articles/docker.html
FROM rocker/rstudio:4.3.1

RUN apt-get update -qq && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    apt-transport-https \
    build-essential \
    curl \
    gfortran \
    libatlas-base-dev \
    libbz2-dev \
    libcairo2 \
    libcurl4-openssl-dev \
    libicu-dev \
    liblzma-dev \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libpcre3-dev \
    libtcl8.6 \
    libtiff5 \
    libtk8.6 \
    libx11-6 \
    libxt6 \
    locales \
    tzdata \
    zlib1g-dev

RUN apt-get install -y \
    cmake \
    make \
    libcurl4-openssl-dev \
    libssl-dev \
    pandoc \
    libxml2-dev 

ENV RENV_VERSION 'v1.0.0'

RUN Rscript -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN Rscript -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"

## Doing renv restore
ENV PROJDIRECTORY "/home/rstudio/Documents/RStudio/PROJECT"
WORKDIR ${PROJDIRECTORY}
COPY renv.lock renv.lock

RUN mkdir -p  renv/library
ENV RENV_PATHS_LIBRARY '${PROJDIRECTORY}/renv/library'

RUN Rscript -e "renv::restore()"

This is what I run to build the image at the root of the RStudio project:

sudo docker build -t project . 

This is what I run to start the container at the root of the RStudio project:

# Source by Peter Solymos: https://hosting.analythium.io/best-practices-for-r-with-docker/
# Source by Nathaniel Haines: http://haines-lab.com/post/2022-01-23-automating-computational-reproducibility-with-r-using-renv-docker-and-github-actions/
# Source by Bill Mills: https://github.com/BillMills/Rocker-tutorial 
# Source by James Walker: https://www.howtogeek.com/devops/how-to-mount-a-docker-volume-while-excluding-a-subdirectory/

sudo docker run -dp 8787:8787 -e PASSWORD=password \
  -v .:/home/rstudio/Documents/RStudio/PROJECT \
  -v /home/rstudio/Documents/RStudio/PROJECT/renv \
  project

kevinushey commented 1 year ago

If I'm reading https://docs.docker.com/engine/reference/builder/#env correctly, environment variables should be declared as:

ENV key=value

That is, with an = to indicate the value you want to assign. Could that be the reason why?

emstruong commented 1 year ago

> That is, with an = to indicate the value you want to assign. Could that be the reason why?

Still the same even with = unfortunately...

kevinushey commented 1 year ago

Ah, it's because you're using single quotes here:

ENV RENV_PATHS_LIBRARY '${PROJDIRECTORY}/renv/library'

Docker doesn't expand variables within single quotes. You need to use double quotes.
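
As a rough illustration of the quoting difference described above, here is a minimal shell sketch. It assumes that POSIX shell quoting behaves the same way as the Dockerfile ENV instruction in this respect; the variable names single and double are made up for the example, and PROJDIRECTORY simply mirrors the value in the Dockerfile.

```shell
#!/bin/sh
# Stand-in for the PROJDIRECTORY variable from the Dockerfile.
PROJDIRECTORY="/home/rstudio/Documents/RStudio/PROJECT"

# Single quotes: the text is kept literally, so ${PROJDIRECTORY} is NOT expanded.
single='${PROJDIRECTORY}/renv/library'

# Double quotes: ${PROJDIRECTORY} is replaced by its value.
double="${PROJDIRECTORY}/renv/library"

printf 'single: %s\n' "$single"   # prints: single: ${PROJDIRECTORY}/renv/library
printf 'double: %s\n' "$double"   # prints: double: /home/rstudio/Documents/RStudio/PROJECT/renv/library
```

With the single-quoted form, RENV_PATHS_LIBRARY would hold the literal text ${PROJDIRECTORY}/renv/library rather than a real path.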

emstruong commented 1 year ago

> Ah, it's because you're using single quotes here:
>
> ENV RENV_PATHS_LIBRARY '${PROJDIRECTORY}/renv/library'
>
> Docker doesn't expand variables within single quotes. You need to use double quotes.

This is what I have now and it's still not working. Could you share your Dockerfile?

# Source by Peter Solymos: https://hosting.analythium.io/best-practices-for-r-with-docker/
# Source by Nathaniel Haines: http://haines-lab.com/post/2022-01-23-automating-computational-reproducibility-with-r-using-renv-docker-and-github-actions/
# Source by Bill Mills: https://github.com/BillMills/Rocker-tutorial
# Source by Posit: https://solutions.posit.co/envs-pkgs/environments/docker/
# Source by renv vignette: https://rstudio.github.io/renv/articles/docker.html
FROM rocker/rstudio:4.3.1

RUN apt-get update -qq && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    apt-transport-https \
    build-essential \
    curl \
    gfortran \
    libatlas-base-dev \
    libbz2-dev \
    libcairo2 \
    libcurl4-openssl-dev \
    libicu-dev \
    liblzma-dev \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libpcre3-dev \
    libtcl8.6 \
    libtiff5 \
    libtk8.6 \
    libx11-6 \
    libxt6 \
    locales \
    tzdata \
    zlib1g-dev

RUN apt-get install -y \
    cmake \
    make \
    libcurl4-openssl-dev \
    libssl-dev \
    pandoc \
    libxml2-dev 

ENV RENV_VERSION 'v1.0.0'

RUN Rscript -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN Rscript -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"

## Doing renv restore
ENV PROJDIRECTORY "/home/rstudio/Documents/RStudio/PROJECT"
WORKDIR ${PROJDIRECTORY}
COPY renv.lock renv.lock

RUN mkdir -p  renv/library
ENV RENV_PATHS_LIBRARY "${PROJDIRECTORY}/renv/library"

RUN Rscript -e "renv::restore()"

As far as I can tell, /home/rstudio/Documents/RStudio/PROJECT/renv/library is still empty and the libraries are installing elsewhere. In principle, if I set RENV_PATHS_LIBRARY to /home/rstudio/Documents/RStudio/PROJECT/renv/library, should all the packages be installed there? When I look at the documentation for renv::restore() the library argument says to look at Library for details, but I'm guessing that part of the documentation changed somewhere?
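
One way to check where packages are actually going is to print the library paths that R and renv resolve, run as the same user and from the same working directory as the restore step. The sketch below is only a diagnostic aid, not a fix. The function name report_lib_paths is hypothetical; renv::paths$library() and renv::paths$cache() are renv's path accessors, and the script falls back to a message when Rscript or renv is unavailable.

```shell
#!/bin/sh
# Diagnostic sketch: report the library paths seen by R and by renv.
# report_lib_paths is a hypothetical helper name used only in this example.
report_lib_paths() {
  if command -v Rscript >/dev/null 2>&1; then
    # Library paths that library() and install.packages() will use.
    Rscript -e 'print(.libPaths())'
    # Project library and cache as resolved by renv, if renv is installed.
    Rscript -e 'if (requireNamespace("renv", quietly = TRUE)) { print(renv::paths$library()); print(renv::paths$cache()) } else cat("renv is not installed\n")'
  else
    echo "Rscript not found; run this inside the container"
  fi
}

report_lib_paths
```

Comparing the output when run as root and as the rstudio user should show whether the two users resolve different library or cache locations.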

emstruong commented 1 year ago

For anyone interested in the future, I think I've figured out what the issue is. It's probably specific to the use of rocker/rstudio, which has both the root user and the rstudio user.

Depending on how you do it, installing packages during the docker build process puts them in a place that you may be able to load from with library(), but that you won't be able to see in renv/library because it's under /root/.cache/.... My solution was to switch to the rstudio user before calling renv::restore() and to switch back to root afterwards so the server can run. This is the Dockerfile I've settled on. Nothing changes in how I start the container, and the container's renv library is used instead of my local copy.

# Source by Peter Solymos: https://hosting.analythium.io/best-practices-for-r-with-docker/
# Source by Nathaniel Haines: http://haines-lab.com/post/2022-01-23-automating-computational-reproducibility-with-r-using-renv-docker-and-github-actions/
# Source by Bill Mills: https://github.com/BillMills/Rocker-tutorial
# Source by Posit: https://solutions.posit.co/envs-pkgs/environments/docker/
# Source by renv vignette: https://rstudio.github.io/renv/articles/docker.html
FROM rocker/rstudio:4.3.1

RUN apt-get update -qq && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
    apt-transport-https \
    build-essential \
    curl \
    gfortran \
    libatlas-base-dev \
    libbz2-dev \
    libcairo2 \
    libcurl4-openssl-dev \
    libicu-dev \
    liblzma-dev \
    libpango-1.0-0 \
    libpangocairo-1.0-0 \
    libpcre3-dev \
    libtcl8.6 \
    libtiff5 \
    libtk8.6 \
    libx11-6 \
    libxt6 \
    locales \
    tzdata \
    zlib1g-dev

RUN apt-get install -y \
    cmake \
    make \
    libcurl4-openssl-dev \
    libssl-dev \
    pandoc \
    libxml2-dev 

USER rstudio

ENV RENV_VERSION 'v1.0.0'

RUN Rscript -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))"
RUN Rscript -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')"

## Doing renv restore
ENV PROJDIRECTORY "/home/rstudio/Documents/RStudio/PROJECT"
WORKDIR ${PROJDIRECTORY}
COPY renv.lock renv.lock

RUN mkdir -p  renv/library

COPY .Rprofile .Rprofile
COPY renv/activate.R renv/activate.R

RUN R -e "renv::restore()"

USER root

Note that this version of the Dockerfile may not be computationally reproducible, strictly speaking. One of the biggest reasons is the use of apt-get update, which installs whatever system package versions are current at build time. Furthermore, using docker-compose and similar tools to make the container match your local RStudio profile could make the experience better.