Open dgkf opened 9 months ago
@yannfeat - I just recently discovered this project: https://github.com/r-hub/repos
I think they "host" a repository simply by pointing to the GitHub raw file URL of a PACKAGES file that includes a downloadURL field, which is used by pak:
options(repos = c(
# raw content url
RHUB = "https://raw.githubusercontent.com/r-hub/repos/main/ubuntu-22.04/4.4"
))
And we can see that the PACKAGES file contains a downloadURL that points to a binary build from the CRAN mirror.
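To make the index format concrete, here is a minimal sketch of how such a PACKAGES file could be inspected with base R. The two entries are made-up placeholders (not copied from r-hub/repos), and the field name is spelled as it appears in this thread; the exact capitalization in the real index should be checked.

```r
# Hypothetical two-entry excerpt of a PACKAGES index. Package names,
# versions and URLs are placeholders, not real r-hub/repos content.
url_field <- "downloadURL"  # spelled as in this thread; verify against the real index
packages_text <- c(
  "Package: examplepkg",
  "Version: 1.0.0",
  paste0(url_field, ": https://example.org/bin/examplepkg_1.0.0.tar.gz"),
  "",
  "Package: otherpkg",
  "Version: 0.2.1",
  paste0(url_field, ": https://example.org/bin/otherpkg_0.2.1.tar.gz")
)

# read.dcf() parses the Debian-control-file format used by PACKAGES files
con <- textConnection(packages_text)
index <- read.dcf(con)
close(con)

# One row per package, one column per field
index[, c("Package", url_field)]
```

Against a live repository, available.packages() has a fields argument that can be used to request extra columns like this one.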
I think this might be a super minimalist solution that would let us experiment with supplementing the PACKAGES file with risk metrics.
Thinking about this a bit more, I think this is something we could spin up without a pipeline in place.
As a demo of how this could be used, I propose that we effectively set up a clone of r-hub/repos and merge in the metrics produced by @AARON-CLARK in {riskscore}. The end result would look very similar to dgkf/rvalhub-repo-filters-mvp, but would cover all of CRAN (or at least all of CRAN at the time Aaron ran that analysis).
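A rough sketch of what the merge step could look like, assuming the metrics arrive as a data frame keyed by package name. The RiskScore field name, the sample packages and the score are all hypothetical; the actual {riskscore} output may be shaped differently.

```r
# Hypothetical PACKAGES excerpt (placeholder names and versions)
packages_text <- c(
  "Package: examplepkg",
  "Version: 1.0.0",
  "",
  "Package: otherpkg",
  "Version: 0.2.1"
)
con <- textConnection(packages_text)
index <- as.data.frame(read.dcf(con), stringsAsFactors = FALSE)
close(con)

# Hypothetical metrics table standing in for the {riskscore} output
metrics <- data.frame(
  Package = "examplepkg",
  RiskScore = "0.12",
  stringsAsFactors = FALSE
)

# Left join by package name: match() keeps the index's row order, and
# packages without a score get NA rather than being dropped
index$RiskScore <- metrics$RiskScore[match(index$Package, metrics$Package)]

# write.dcf() skips NA values, so unscored packages get no RiskScore line
out <- tempfile("PACKAGES")
write.dcf(index, file = out)
writeLines(readLines(out))
```

Using a left join means the supplemented index still lists every package from the upstream file, which keeps it usable as a drop-in repository even where metrics are missing.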
This is super actionable today, and we could use it to trial the user experience of using such a repository for a real-world use case. Then we could use it as a springboard for discussion around how a pipeline might make updates to the repo and how we can improve the capture of logs.
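For the user-experience side, one possible shape is a filter function of the kind available.packages() accepts through its filters argument. This is only a sketch: the RiskScore field, the threshold and the sample matrix are hypothetical, and it is not taken from dgkf/rvalhub-repo-filters-mvp.

```r
# Hypothetical filter factory: keep packages whose RiskScore is at or below
# max_score. `db` is a character matrix with one row per package, the shape
# that available.packages() hands to filter functions.
risk_filter <- function(max_score) {
  function(db) {
    score <- suppressWarnings(as.numeric(db[, "RiskScore"]))
    # Packages with no score (NA) are excluded in this sketch
    db[!is.na(score) & score <= max_score, , drop = FALSE]
  }
}

# Demonstration on a hand-written matrix with placeholder packages
db <- cbind(
  Package = c("examplepkg", "otherpkg", "thirdpkg"),
  Version = c("1.0.0", "0.2.1", "2.3.0"),
  RiskScore = c("0.12", "0.80", NA)
)
rownames(db) <- db[, "Package"]

kept <- risk_filter(0.5)(db)
rownames(kept)
```

Note that a custom field only shows up in db when it is requested via the fields argument of available.packages(), so how this would plug into install.packages() or pak is one of the things the trial would need to work out.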
What do you think, @yannfeat, @AARON-CLARK? I think between the three of us we could probably knock this out in a day. I'd be down to schedule a day where we all carve out some time to get this going.
@dgkf - I think this sounds like an excellent idea. Great trade off between effort and output by leveraging some great existing work.
@dgkf I am down, let's do it. It is great that we don't have to worry about storing the packages for now.
From discussion at the R Validation Hub repos WG weekly standup:
To move forward with additional steps to publish packages to some destination, we felt that the easiest path forward would be to build a monorepo of packages (possibly to be surfaced via R-Universe?).
There may be better solutions in the future. The goal of this initial design is to allow the pipeline to develop and explore lifecycle-management concepts, which will apply regardless of architecture.