r-lib / pak

A fresh approach to package installation
https://pak.r-lib.org

Download only subdirectory for GitHub packages when appropriate #704

Open jashapiro opened 1 month ago

jashapiro commented 1 month ago

When installing a package from GitHub where the package lives in a subdirectory of the main repository, pak::pkg_install() downloads the entire repository. For some repositories this is a non-issue, but if the repository is large (and the R package is only a small part of it), the result can be a slow download.

Might it be possible to implement a sparse-checkout or equivalent to reduce downloads and improve install times in this situation?

gaborcsardi commented 1 month ago

Try a git:: remote; it possibly already does a sparse checkout.

It is hard to do a sparse checkout for github:: because the GitHub API offers no good way to do that.

jashapiro commented 1 month ago

The git:: remotes in pak do not seem to allow subdirectories, though? I also tested whether remotes::install_git() with a subdirectory specified uses a sparse checkout, but it does not seem to.

jashapiro commented 1 month ago

Or rather, maybe a sparse checkout isn't quite what I want, as it still seems to download the whole .git folder, which can be large.
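
For illustration, here is a rough sketch of how the .git download might be kept small with plain git, outside of pak. It combines a shallow, blob-filtered partial clone with a sparse checkout. The helper name `fetch_subdir` and its arguments are hypothetical, not part of pak or remotes, and the sketch assumes git 2.25 or later and a server that honors partial-clone filters (if the server does not, git falls back to downloading all file contents).

```shell
#!/bin/sh
# Hypothetical helper (not part of pak): fetch only one subdirectory of a repo.
# Assumes git >= 2.25 and a server that supports partial-clone filters.
fetch_subdir() {
  url="$1"     # repository URL
  subdir="$2"  # subdirectory that contains the R package
  dest="$3"    # local directory to clone into

  # --depth 1          : skip history, fetch only the tip commit
  # --filter=blob:none : defer file contents until they are checked out
  # --sparse           : start with only top-level files in the working tree
  git clone --quiet --depth 1 --filter=blob:none --sparse "$url" "$dest" &&
    # Extend the checkout to the wanted subdirectory; its files are
    # fetched on demand at this point.
    git -C "$dest" sparse-checkout set "$subdir"
}
```

Something like `fetch_subdir <repo-url> <subdir> pkgsrc` followed by `pak::local_install("pkgsrc/<subdir>")` in R should then install from the local copy, though this sidesteps the github:: remote rather than fixing it.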