r-lib / pak

A fresh approach to package installation
https://pak.r-lib.org
639 stars 56 forks source link

Feature request: support for subdir spec with Git remote #646

Open klmr opened 2 weeks ago

klmr commented 2 weeks ago

It is currently not possible to install a package inside a Git subdirectory with the git:: remote.

As an example of this, the ‘ArvadosR’ package is currently hosted at https://github.com/arvados/arvados/tree/main/sdk/R. However, it cannot be installed by pak::pkg_install(). For instance, the following fails (since ‘pak’ attempts to treat the entire specification as a Git repository):

pak::pkg_install("git::https://github.com/arvados/arvados.git/sdk/R")

(For this particular example, the github:: remote type could be used instead. However, we have a use-case with internally hosted packages that require the use of the git:: remote.)

I’m not sure how to best implement this — e.g. the above syntax would be ambiguous. Maybe via a parameter? I.e.

pak::pkg_install("git::https://github.com/arvados/arvados.git?subdir=sdk/R")

(cf. r-lib/remotes#777, although that turned out to be different.)

gaborcsardi commented 2 weeks ago

Yes, this was indeed not implemented because it needs new syntax.

We could take the part ending with .git as the git repo and the rest as (sub)directories, but that might also fail in some cases.

Or we could have a double // or a : as the separator. Or indeed we could use a parameter. Probably a parameter is the best.

git::https://github.com/arvados/arvados.git//sdk/R
git::https://github.com/arvados/arvados.git:sdk/R
git::https://github.com/arvados/arvados.git/sdk/R?subdir=sdk/R
dgkf commented 2 weeks ago

Just chiming in as there might be some opportunity for standardizing here.

In https://github.com/r-lib/pkgdepends/pull/353, we used <gitlab_url>/-/<subdir>@<ref>, using /-/ as a separator as it's a reserved separator in gitlab's paths, but maybe the query parameter style would be better if it can be used consistently.

Not sure if this is being overly defensive, but you may consider namespacing a parameter name ?pak_subdir= if it's going to be parsed and used by pak to distinguish it from parameters that may be meaningful.

I also suggested a more literal syntax in that thread (https://github.com/r-lib/pkgdepends/pull/353#issuecomment-1977455855), and I'm still rather partial to that as a verbose solution. @gaborcsardi mentioned some issues, and I'm not sure I totally groked the downsides - especially if tools start expecting really bespoke syntax.

klmr commented 2 weeks ago

/-/ might be a good idea, though we should be aware that it’s not a reserved path element in general, just for GitLab. So other Git remote repository URIs might use it.

I have no good intuition whether that’s a problem: I have never seen this usage for repo URIs, and I am confident that it’s going to be very rare, if it exists at all, but it’s still technically breaking backwards compatibility, and somebody, somewhere might be relying on this behaviour.