r-lib / pkgdepends

R Package Dependency Resolution
https://r-lib.github.io/pkgdepends/

System requirements do not resolve if `repos` does not have `RSPM` (the exact key) #320

Open maximsmol opened 1 year ago

maximsmol commented 1 year ago

The `repos` option can have the https://packagemanager.posit.co entry under any key, but system requirements only resolve correctly when the key is exactly `RSPM`.

Reproduction Steps

  1. docker run -it rhub/rig /bin/bash -l -c 'R' # get an R session in the login environment
  2. options(repos = c(binary = "https://packagemanager.posit.co/cran/__linux__/jammy/latest", CRAN = "https://cloud.r-project.org")) # rename the 'RSPM' repo to 'binary'
  3. pak::pkg_install("units")
  4. No system requirements are installed so library(units) gives an error
  5. Note that installing the package AGAIN does install the system requirements

Output with RSPM

> pak::pkg_install("units")
✔ Updated metadata database: 4.18 MB in 9 files.
✔ Updating metadata database ... done

→ Will install 2 packages.
→ Will download 2 packages with unknown size.
+ Rcpp    1.0.10 [dl]
+ units   0.8-2  [dl]
ℹ Getting 2 pkgs with unknown sizes
✔ Got units 0.8-2 (x86_64-pc-linux-gnu-ubuntu-22.04) (355.29 kB)
✔ Got Rcpp 1.0.10 (x86_64-pc-linux-gnu-ubuntu-22.04) (2.14 MB)
✔ Downloaded 2 packages (2.49 MB)in 1.6s
ℹ Installing system requirements
ℹ Executing `sh -c apt-get -y update`
ℹ Executing `sh -c apt-get -y install libudunits2-dev`
✔ Installed units 0.8-2  (41ms)
✔ Installed Rcpp 1.0.10  (105ms)
✔ 1 pkg + 1 dep: added 2, dld 2 (2.49 MB) [19s]
> library(units)
udunits database from /usr/share/xml/udunits/udunits2.xml

Output with binary

> options(repos = c(binary = "https://packagemanager.posit.co/cran/__linux__/jammy/latest", CRAN = "https://cloud.r-project.org")) # rename the 'RSPM' repo to 'binary'
> pak::pkg_install("units")
✔ Updated metadata database: 4.18 MB in 9 files.
✔ Updating metadata database ... done

→ Will install 2 packages.
→ Will download 2 packages with unknown size.
+ Rcpp    1.0.10 [dl]
+ units   0.8-2  [dl]
ℹ Getting 2 pkgs with unknown sizes
✔ Got units 0.8-2 (x86_64-pc-linux-gnu-ubuntu-22.04) (355.29 kB)
✔ Got Rcpp 1.0.10 (x86_64-pc-linux-gnu-ubuntu-22.04) (2.14 MB)
✔ Downloaded 2 packages (2.49 MB)in 1.3s
✔ Installed units 0.8-2  (40ms)
✔ Installed Rcpp 1.0.10  (106ms)
✔ 1 pkg + 1 dep: added 2, dld 2 (2.49 MB) [12.5s]
> library(units)
Error: package or namespace load failed for ‘units’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/root/R/x86_64-pc-linux-gnu-library/4.3/units/libs/units.so':
  libudunits2.so.0: cannot open shared object file: No such file or directory
> pak::pkg_install("units")

ℹ No downloads are needed
ℹ Installing system requirements
ℹ Executing `sh -c apt-get -y update`
ℹ Executing `sh -c apt-get -y install libudunits2-dev`
✔ 1 pkg + 1 dep: kept 2 [6.7s]
> library(units)
udunits database from /usr/share/xml/udunits/udunits2.xml

maximsmol commented 1 year ago

Note: I filed the issue on this project because inspecting an installation plan directly shows that libudunits2-dev is not in the solution. The resolution is also different, though it always shows the library (but apparently it needs to be included twice, in different formats, to actually work?)

gaborcsardi commented 1 year ago

Yes, good catch. This is because CRAN metadata does not contain the SystemRequirements field, and we need to extract it (regularly) and download it from another source. But we only use this other source for repos that are called CRAN or RSPM, to avoid using it for the wrong package from another CRAN-like repo.

E.g. this is the metadata:

> pkgs <- pak::meta_list()
> pkgs[pkgs$package == "units", c("package", "sources", "sysreqs")]
      package
18965   units
38358   units
                                                                                                                                   sources
18965                                           https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/units_0.8-2.tar.gz
38358 https://cloud.r-project.org/src/contrib/units_0.8-2.tar.gz, https://cloud.r-project.org/src/contrib/Archive/units/units_0.8-2.tar.gz
        sysreqs
18965      <NA>
38358 udunits-2

For the binary repo we have sysreqs = NA, for CRAN we have the sysreqs information. But then we select the binary one for installation, because binary packages are preferred, and thus no system packages are installed.

When you call it again, we use SystemRequirements from the installed package, so it works correctly.

I am not yet sure how, but we should definitely improve this. Possibly we could have a more robust detection for RSPM, instead of just using the repo name.
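As a rough illustration of the difference between the name-based rule described above and a URL-based check, here is a minimal base R sketch. Both helper functions (`by_name`, `by_name_or_url`) are hypothetical and are not part of pkgcache or pak; the host pattern is an assumption based on the URL used in this issue.

```r
# Repos as configured in the reproduction steps of this issue.
repos <- c(
  binary = "https://packagemanager.posit.co/cran/__linux__/jammy/latest",
  CRAN = "https://cloud.r-project.org"
)

# Hypothetical helper mirroring the rule described in this thread:
# only repos named exactly "CRAN" or "RSPM" qualify for the extra
# SystemRequirements lookup.
by_name <- function(repos) {
  names(repos) %in% c("CRAN", "RSPM")
}

# Hypothetical alternative: additionally accept any repo whose URL
# points at the package manager host, whatever its key is.
# Assumption: matching on this one host is enough for the example.
by_name_or_url <- function(repos) {
  by_name(repos) | grepl("^https?://packagemanager\\.posit\\.co/", repos)
}

by_name(repos)         # the 'binary' entry is skipped, 'CRAN' qualifies
by_name_or_url(repos)  # both entries qualify
```

With the name-only rule the `binary` entry gets no system requirements data, which matches the `sysreqs = NA` row shown above. A URL check would cover it, though a real implementation would need to decide which hosts and URL layouts to trust.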

maximsmol commented 1 year ago

Thanks for the explanation!

I think the current behavior is fine if it's prominently documented somewhere in pak. I spent quite a bit of time trying to figure out why this wasn't working, but there is nothing stopping me from just renaming the repo.
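The renaming workaround can be sketched in base R as follows. This is only an illustration using the URLs from the reproduction steps; matching the entry by its host name is an assumption made for the example.

```r
# Start from the configuration that triggers the problem.
repos <- c(
  binary = "https://packagemanager.posit.co/cran/__linux__/jammy/latest",
  CRAN = "https://cloud.r-project.org"
)

# Rename the package manager entry to the exact key "RSPM".
# Assumption: the entry is identified by its host name.
is_ppm <- grepl("packagemanager.posit.co", repos, fixed = TRUE)
names(repos)[is_ppm] <- "RSPM"

options(repos = repos)
names(getOption("repos"))

# Afterwards, install as in the "Output with RSPM" run above:
# pak::pkg_install("units")
```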

gaborcsardi commented 1 year ago

I think it is OK to use SystemRequirements whenever the package name and version match. It is unlikely that there would be a completely different package with the same name and version number in another repository. In any case, this is something to fix in pkgcache, here-ish: https://github.com/r-lib/pkgcache/blob/7f3b3af5f79fffe64dd9910da1acfd7e6bec8faa/R/metadata-cache.R#L440-L443
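A minimal sketch of the name-and-version matching idea, in base R on a toy data frame. The `fill_sysreqs` helper and the `repo` column are hypothetical; the two rows only mimic the `units` metadata shown earlier in this thread, and this is not the pkgcache implementation.

```r
# Toy metadata: same package and version from two repos, with
# system requirements known for only one of them.
meta <- data.frame(
  package = c("units", "units"),
  version = c("0.8-2", "0.8-2"),
  repo = c("binary", "CRAN"),          # hypothetical column
  sysreqs = c(NA, "udunits-2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: copy sysreqs to rows where it is missing,
# whenever another row has the same package name and version.
fill_sysreqs <- function(meta) {
  known <- meta[!is.na(meta$sysreqs), c("package", "version", "sysreqs")]
  known <- known[!duplicated(known[c("package", "version")]), ]
  key <- paste(meta$package, meta$version)
  known_key <- paste(known$package, known$version)
  missing <- is.na(meta$sysreqs)
  # Rows without a matching name and version stay NA.
  meta$sysreqs[missing] <- known$sysreqs[match(key[missing], known_key)]
  meta
}

filled <- fill_sysreqs(meta)
filled[, c("package", "version", "repo", "sysreqs")]
```

After the fill, the `binary` row carries the same `udunits-2` entry as the CRAN row, so the preferred binary package would no longer lose its system requirements.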