rstudio / renv

renv: Project environments for R.
https://rstudio.github.io/renv/
MIT License

Cannot install bioconductor Rhtslib using Dockerfile and renv - without renv works fine #1957

Open fdekievit opened 4 months ago

fdekievit commented 4 months ago

I'm trying to build an renv.lock file using a Dockerfile (so that it can run in a CI environment).

I've created the following Dockerfile:

FROM --platform=linux/amd64 bioconductor/bioconductor_docker:RELEASE_3_19-R-4.4.1

# set library inside container
RUN mkdir -p /renv/library
ENV RENV_PATHS_LIBRARY /renv/library

# change default location of cache to project folder
RUN mkdir -p /renv/.cache
ENV RENV_PATHS_CACHE /renv/.cache

# install Renv
RUN R -e "install.packages('renv', version='1.0.7', dependencies=TRUE, repos='http://cran.rstudio.com/')"

# init renv
RUN R -e "library('renv'); renv::init()"

This part runs fine. However, the problem starts when adding Rhtslib (which is required for the Bioconductor packages I actually need).

The renv docs say that I can install Bioconductor packages using the bioc:: prefix described at https://rstudio.github.io/renv/reference/install.html#bioconductor.

So I added the following line to the Dockerfile, following the syntax described in the docs:

RUN R -e "library('renv'); renv::install('bioc::Rhtslib')"

This crashes the build, ending with:

#9 49.92 > library('renv'); renv::install('bioc::Rhtslib')
#9 49.95 
#9 49.95 Attaching package: ‘renv’
#9 49.95 
#9 49.95 The following objects are masked from ‘package:stats’:
#9 49.95 
#9 49.95     embed, update
#9 49.95 
#9 49.95 The following objects are masked from ‘package:utils’:
#9 49.95 
#9 49.95     history, upgrade
#9 49.95 
#9 49.95 The following objects are masked from ‘package:base’:
#9 49.95 
#9 49.95     autoload, load, remove, use
#9 49.95 
#9 55.39 # Downloading packages -------------------------------------------------------
#9 56.33 - Downloading Rhtslib from BioCcontainers ...   OK [6.8 Mb in 0.62s]
#9 57.11 /usr/bin/tar: Unexpected EOF in archive
#9 57.12 /usr/bin/tar: Error is not recoverable: exiting now
#9 57.29 /usr/bin/tar xf '/root/.cache/R/renv/source/repository/Rhtslib/Rhtslib_3.0.0.tar.gz' -C '/tmp/RtmpWMgcFb/renv-description-77e9f8ada' 'Rhtslib/DESCRIPTION'
#9 57.29 ================================================================================
#9 57.29 
#9 57.29 /usr/bin/tar: Unexpected EOF in archive
#9 57.29 /usr/bin/tar: Error is not recoverable: exiting now
#9 57.29 
#9 57.29 Error: error decompressing archive [error code 2]
#9 57.30 Traceback (most recent calls last):
#9 57.30 18: renv::install("bioc::Rhtslib")
#9 57.30 17: retrieve(packages)
#9 57.30 16: handler(package, renv_retrieve_impl(package))
#9 57.30 15: renv_retrieve_impl(package)
#9 57.30 14: renv_retrieve_bioconductor(record)
#9 57.30 13: renv_retrieve_repos(record)
#9 57.30 12: renv_retrieve_repos_impl(record)
#9 57.30 11: renv_retrieve_package(record, url, path)
#9 57.30 10: renv_retrieve_successful(record, path)
#9 57.30  9: renv_description_read(path, subdir = subdir)
#9 57.30  8: filebacked(context = "renv_description_read", path = path, callback = renv_description_read_impl, 
#9 57.30         subdir = subdir, ...)
#9 57.30  7: callback(path, ...)
#9 57.30  6: renv_archive_decompress(path, files = file, exdir = exdir)
#9 57.30  5: renv_archive_decompress_tar(archive, files = files, exdir = exdir, 
#9 57.30         ...)
#9 57.30  4: renv_tar_decompress(tar, archive = archive, files = files, exdir = exdir, 
#9 57.30         ...)
#9 57.30  3: renv_system_exec(tar, args, action = "decompressing archive")
#9 57.30  2: abort(sprintf("error %s [error code %i]", action, status), body = renv_system_exec_details(command, 
#9 57.30         args, output))
#9 57.30  1: stop(fallback)
#9 57.30 Execution halted
#9 ERROR: process "/bin/sh -c R -e \"library('renv'); renv::install('bioc::Rhtslib')\"" did not complete successfully: exit code: 1
------
 > [6/6] RUN R -e "library('renv'); renv::install('bioc::Rhtslib')":
57.30  6: renv_archive_decompress(path, files = file, exdir = exdir)
57.30  5: renv_archive_decompress_tar(archive, files = files, exdir = exdir, 
57.30         ...)
57.30  4: renv_tar_decompress(tar, archive = archive, files = files, exdir = exdir, 
57.30         ...)
57.30  3: renv_system_exec(tar, args, action = "decompressing archive")
57.30  2: abort(sprintf("error %s [error code %i]", action, status), body = renv_system_exec_details(command, 
57.30         args, output))
57.30  1: stop(fallback)
57.30 Execution halted
------

However, if I run the script without renv, like so:

RUN R -e "library('BiocManager'); BiocManager::install('Rhtslib')"

It runs just fine.

However, my understanding is that installing it this way won't add the package to the renv.lock file, so this is not what I want.

Is there a reason why this isn't installing? Is it that renv can't handle compressed files or something? Or am I using incorrect syntax?

If it's none of the above, it might be a bug.

To summarize, here is the full script, with both the working and the broken line commented out:

FROM --platform=linux/amd64 bioconductor/bioconductor_docker:RELEASE_3_19-R-4.4.1

# set library inside container
RUN mkdir -p /renv/library
ENV RENV_PATHS_LIBRARY /renv/library

# change default location of cache to project folder
RUN mkdir -p /renv/.cache
ENV RENV_PATHS_CACHE /renv/.cache

# install Renv
RUN R -e "install.packages('renv', version='1.0.7', dependencies=TRUE, repos='http://cran.rstudio.com/')"

# init renv
RUN R -e "library('renv'); renv::init()"

# This line doesn't work:
#RUN R -e "library('BiocManager'); library('renv'); renv::install('bioc::Rhtslib')"

# This line works:
#RUN R -e "library('BiocManager'); BiocManager::install('Rhtslib')"

lauratwomey commented 3 months ago

I had the exact same issue and the same error, but this line you proposed worked for me:

RUN R -e "library('BiocManager'); library('renv'); renv::install('bioc::Rhtslib')"

My Dockerfile is similar to yours up to the "install Renv" line. After that I have:

# Install renv
RUN R -e "install.packages('renv', repos = c(CRAN = 'https://cloud.r-project.org'))"

# Copy an renv lock file that already has some packages. This doesn't affect the next steps.
COPY renv.lock renv.lock

RUN R -e "renv::init()"

RUN R -e "library('BiocManager'); library('renv'); renv::install('bioc::Rhtslib')"
RUN R -e "renv::install('Seurat@5.1.0')"
RUN R -e "renv::install('chris-mcginnis-ucsf/DoubletFinder')"
RUN R -e "renv::install('bioc::scDblFinder')" # this was the package that was giving me issues with Rhtslib

Hope this helps, good luck! (And thanks for the solution!)

kevinushey commented 3 months ago

Thanks for the bug report. Very strange! I'm able to reproduce using your Dockerfile, but the issue seems to be independent of renv:

root@33403b8d6088:/project# R

R version 4.4.1 (2024-06-14) -- "Race for Your Life"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

- Project '/project' loaded. [renv 1.0.7]
> options(repos = BiocManager::repositories())
'getOption("repos")' replaces Bioconductor standard repositories, see
'help("repositories", package = "BiocManager")' for details.
Replacement repositories:
    CRAN: https://p3m.dev/cran/__linux__/jammy/latest
> dl <- download.packages("Rhtslib", destdir = getwd())
trying URL 'https://bioconductor.org/packages/3.19/container-binaries/bioconductor_docker/src/contrib/Rhtslib_3.0.0_R_x86_64-pc-linux-gnu.tar.gz'
Content type 'application/gzip' length 7166712 bytes (6.8 MB)
==================================================
downloaded 6.8 MB

> system2("/usr/bin/tar", c("xf", dl[1, 2], "-C", getwd(), "Rhtslib/DESCRIPTION"))
/usr/bin/tar: Unexpected EOF in archive
/usr/bin/tar: Error is not recoverable: exiting now

I would recommend contacting the Bioconductor team; it looks to me like there's some issue in the Rhtslib package available on Bioconductor.

In case it's relevant, I reproduced this issue running on a macOS machine (M1 processor), and so the Docker container was running with Apple's Rosetta emulation.
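
The "Unexpected EOF in archive" symptom above can be checked without involving R at all. The following is a minimal sketch, assuming a POSIX shell with gzip and tar available in the container; check_tarball is a hypothetical helper name, not part of renv or Bioconductor. It optionally compares the file size with the length the server advertised, then tests the gzip stream and the tar listing.

```shell
#!/bin/sh
# Hypothetical helper (not part of renv): report whether a .tar.gz is complete.
# Usage: check_tarball <path> [expected_bytes]
check_tarball() {
    archive=$1
    expected=${2:-}

    # Optionally compare the on-disk size with the advertised content length.
    if [ -n "$expected" ]; then
        actual=$(wc -c < "$archive" | tr -d ' ')
        if [ "$actual" -ne "$expected" ]; then
            echo "size mismatch: got $actual bytes, expected $expected"
            return 1
        fi
    fi

    # gzip -t reads the whole compressed stream and verifies its checksum.
    if ! gzip -t "$archive" 2>/dev/null; then
        echo "gzip stream is corrupt or truncated"
        return 1
    fi

    # Listing every member forces tar to read the archive through to its end.
    if ! tar -tzf "$archive" >/dev/null 2>&1; then
        echo "tar archive is corrupt or truncated"
        return 1
    fi

    echo "ok"
}
```

For example, check_tarball /root/.cache/R/renv/source/repository/Rhtslib/Rhtslib_3.0.0.tar.gz 7166712 would use the cached path from the build log and the byte count reported by download.packages above, assuming both refer to the same file. A matching size combined with a failing gzip or tar check would suggest the file is damaged on the server rather than cut short during download.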

fdekievit commented 3 months ago

Hmm. I am also running on a Mac, so perhaps that's relevant. I'll create an issue on the Bioconductor page.