rstudio / packrat

Packrat is a dependency management system for R
http://rstudio.github.io/packrat/

`install.packages()` fails when using packrat with RSPM package repo: incomplete block on file #717

Closed Giqles closed 1 year ago

Giqles commented 1 year ago

I'm trying to use packrat on Amazon Linux with the RSPM centos7 package repository. When I do, installing any package fails, seemingly at the point where the package download starts:

Error in untar2(tarfile, files, list, exdir, restore_times) : 
  incomplete block on file

Packrat seems to be correctly picking up the repository settings from my user-level ~/.Rprofile. See example below:

Packrat mode on. Using library in directory:
- "/local/home/apsg/example_project/packrat/lib"
R version 4.1.3 (2022-03-10) -- "One Push-Up"
Platform: x86_64-pc-linux-gnu (64-bit)

r$> packrat::repos_list()
                                                             RSPM 
"https://packagemanager.rstudio.com/all/__linux__/centos7/latest" 

r$> install.packages("data.table")
Installing package into ‘/local/home/apsg/example_project/packrat/lib/x86_64-pc-linux-gnu/4.1.3’
(as ‘lib’ is unspecified)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   151  100   151    0     0   2171      0 --:--:-- --:--:-- --:--:--  2188
Error in untar2(tarfile, files, list, exdir, restore_times) : 
  incomplete block on file

The downloaded source packages are in
        ‘/tmp/RtmpZRKz76/downloaded_packages’
Warning message:
In install.packages("data.table") :
  installation of package ‘data.table’ had non-zero exit status

What am I doing wrong?

aronatkins commented 1 year ago

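I suspect that the download is receiving an HTTP redirect, which is not followed automatically. The transfer in your log is only 151 bytes, far too small to be a package tarball, which would explain why `untar2()` reports an incomplete block.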

The Package Manager URL that is eventually used is:

"https://packagemanager.posit.co/cran/__linux__/centos7/latest/src/contrib/data.table_1.14.8.tar.gz?r_version=4.1&arch=x86_64"

That redirects to:

https://rspm-sync.rstudio.com/bin/4.1-centos7/299a4cf0f689778120a95b7111a850330ec9c1d0604aba7bca25ac408133662b.tar.gz

Could you try:

options(download.file.extra = "-L")

This causes curl to follow redirects.
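If other extra curl flags are already configured, setting the option directly would overwrite them. A minimal sketch for `~/.Rprofile` that appends `-L` instead, assuming downloads go through the external `curl` binary as the progress output above suggests:

```r
# Append "-L" to any existing extra curl flags instead of overwriting them.
extra <- getOption("download.file.extra")  # NULL when the option is unset

# Only add the flag if it is not already present as a standalone argument.
if (!any(grepl("(^|\\s)-L(\\s|$)", extra))) {
  options(download.file.extra = paste(c(extra, "-L"), collapse = " "))
}
```

Running `getOption("download.file.extra")` afterwards shows the flags that will be passed to curl.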

Giqles commented 1 year ago

Thanks, that helped: the packages now install, but they seem to be downloaded and installed as source packages rather than binaries. I'm not sure why that is happening. I'll investigate a bit further.

aronatkins commented 1 year ago

Package Manager looks at the HTTP user agent header when determining if a binary package archive should be returned. Your R session is probably not providing this HTTP header.

https://docs.posit.co/rspm/admin/serving-binaries/#binary-user-agents

Packrat does not add this user agent by default.

Some R distributions, like https://github.com/rstudio/r-builds (used by https://github.com/rstudio/r-docker), set the user agent, and it is likely also set by some versions of the RStudio IDE.
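As a rough sketch, the user agent could be set by hand in `~/.Rprofile`. The header format below follows my reading of the Package Manager documentation linked above and should be checked against it. The `download.file.extra` line is an assumption that only applies when downloads shell out to the `curl` binary, and it replaces any flags already set there (so `-L` is repeated):

```r
# Sketch for ~/.Rprofile; the exact header format is an assumption to verify
# against the Package Manager "binary user agents" documentation.
local({
  r_ver <- as.character(getRversion())
  ua_detail <- paste(
    r_ver,
    R.version[["platform"]],
    R.version[["arch"]],
    R.version[["os"]]
  )

  # Used by R's built-in download methods (for example libcurl).
  options(HTTPUserAgent = sprintf("R/%s R (%s)", r_ver, ua_detail))

  # Used when downloads call the external curl binary; keeps -L so that
  # redirects are still followed.
  options(download.file.extra = sprintf(
    "-L --header \"User-Agent: R (%s)\"", ua_detail
  ))
})
```

After restarting R, `getOption("HTTPUserAgent")` should report a string containing the R version and platform, which is what Package Manager inspects when deciding whether to serve a binary.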

You should also know that renv handles most of these interactions automatically and dynamically determines the correct Package Manager URL for your current Linux distribution. If you are starting new projects, we recommend using renv rather than packrat: https://github.com/rstudio/renv
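For orientation, a typical renv workflow might look like the following. This is a sketch rather than a prescribed migration path, and `data.table` is just the package from the example above:

```r
# install.packages("renv")    # once, if renv is not yet installed

renv::init()                  # new project: create a project library and renv.lock
# renv::migrate()             # alternative for an existing packrat project

renv::install("data.table")   # install into the project library
renv::snapshot()              # record installed versions in renv.lock
```

`renv::migrate()` is commented out because it is only relevant when converting an existing packrat project; for a fresh project, `renv::init()` is enough.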

Giqles commented 1 year ago

Thanks, that's very helpful. I will use renv instead.