CredibilityLab / groundhog

Reproducible R Scripts Via Date Controlled Installing & Loading of CRAN & Git Packages
https://groundhogr.com/
GNU General Public License v3.0

MRAN URL #81

Closed nnlkcncff closed 1 year ago

nnlkcncff commented 2 years ago

Please add the ability to change the MRAN URL; we are using JFrog Artifactory to proxy repositories.

urisohn commented 2 years ago

Hi, I am not familiar with JFrog. Does it offer a mirror of MRAN, or is it an alternative archive? E.g., if you want to install, say, 'metafor' as available on 2020-06-01, how would you request that from JFrog?

nnlkcncff commented 2 years ago

It will be something like this:

R -e '
  options(repos = c("CRAN" = "https://${jfrog-domain-name}/artifactory/cran-remote"))
  groundhog::groundhog.library("metafor", "2020-06-01")
'

Artifactory is a web application with a lot of repository facilities, located on some node in the internal network.

In this context, its main function is to proxy an external repository. Artifactory accepts requests from package managers, transfers data between the external repository and the package manager, and caches what it receives; i.e., packages fetched through Artifactory remain in its cache.

All kinds of packages are supported, not only CRAN, e.g. DEB, RPM, PHP Composer, Docker...

Here is the full list of supported packages: https://www.jfrog.com/confluence/display/JFROG/Package+Management

In my example, Artifactory provides two links:

I was expecting to see the ability to specify an alternate MRAN address to access the CRAN snapshots.
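To make the request concrete, here is a minimal sketch of what a configurable snapshot address could look like. It is not groundhog's API: the helper `snapshot_repo()`, the option name `snapshot.base.url`, and the Artifactory host and path are all hypothetical names chosen for illustration. The default base follows the public MRAN snapshot layout (base address followed by the date).

```r
# Hypothetical helper, not part of groundhog: build a dated snapshot
# repository URL from a configurable base address.
snapshot_repo <- function(date,
                          base = getOption("snapshot.base.url",
                                           "https://mran.microsoft.com/snapshot")) {
  # Validate the date and normalise it to YYYY-MM-DD
  d <- as.Date(date, format = "%Y-%m-%d")
  if (is.na(d)) stop("date must be in YYYY-MM-DD format: ", date)
  # Drop any trailing slash so the pieces join cleanly
  paste0(sub("/+$", "", base), "/", format(d, "%Y-%m-%d"))
}

# Default: the public MRAN-style address
default_repo <- snapshot_repo("2020-06-01")

# Behind a proxy: point the (assumed) option at an internal mirror instead.
# The host and path below are placeholders, not a real Artifactory layout.
options(snapshot.base.url =
          "https://artifactory.example.com/artifactory/mran-remote/snapshot")
proxy_repo <- snapshot_repo("2020-06-01")

# install.packages() accepts any repository URL through `repos`:
# install.packages("metafor", repos = c(CRAN = proxy_repo))
```

With something like this, only the base address differs between the public archive and the internal proxy; the requested date, and therefore the package versions, stay the same.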

urisohn commented 2 years ago

What's a use case where an R user wants to install a package and actively prefers to avoid using the default MRAN location for it?

Keep in mind I am not a developer, but a behavioral researcher solving a specific need: R scripts that are used to analyze data in published papers should run in 5-10 years, not just today, and they should run on anyone's machine, not just the original creator's.

I am currently unable to work on groundhog, but my to-do list includes parallelization of source installation and incorporating Bioconductor. These seem like high-return investments. What's the benefit to an academic user of having the functionality you propose? Do you think it can rival those benefits? (Honest question: I had never heard of JFrog, so I cannot easily evaluate it, and maybe it is more useful for groundhog's goal than it seems at first.)

nnlkcncff commented 1 year ago

I apologize for the delayed answer. I am no longer connected with the project for which the functionality I requested would have been useful.

You ask what the reason is for avoiding the standard MRAN.

The security policies of many corporations limit direct Internet access. Products like Sonatype Nexus and JFrog Artifactory proxy external resources, cache data, and let you create your own repositories. In many companies and institutions, these products become the single access point for all artifacts.

Returning to our specific case: a mirror of the MRAN repository exists inside the company, and its state is continuously updated to match the original MRAN. That is, only the link changes, while reproducibility is preserved.

Repository management software is adopted not only because of internal policies. These are convenient and useful tools for modern development processes, and sometimes they are even necessary (for example, some repositories limit the number of requests allowed in a given period).

In addition, CRAN snapshots are provided not only by MRAN but also by RSPM (RStudio Package Manager), and possibly by other repositories; MRAN is not the only public provider of CRAN snapshots.

Examples of repository managers:

More about RSPM:

urisohn commented 1 year ago

OK, I understand. I think that use case is too far from the one groundhog is designed for (long-term reproducibility of academic research), so I don't see myself working on incorporating it. If someone else wanted to build a separate package, though, that would be great, and I would be happy to share any information on how groundhog does this or that which may prove useful to them.