CredibilityLab / groundhog

Reproducible R Scripts Via Date Controlled Installing & Loading of CRAN & Git Packages
https://groundhogr.com/
GNU General Public License v3.0

add functions and/or RStudio addin to "groundhogise" and "degroundhogise" #77

Closed stragu closed 2 years ago

stragu commented 2 years ago

I am only starting to use groundhog (thank you for a great package!), and thought that one useful addition would be a function that automatically converts all instances of library(package) to groundhog.library(package, date). For example:

groundhogify(original = "script.R", new = "reproducible.R", date = "2022-05-08")
degroundhogify(original = "reproducible.R", new = "script.R")

(possibly with the option that the first function is also able to change all the dates in an already-groundhogged script.)
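A minimal sketch of what the first function could look like, assuming the script only contains simple calls such as library(pkg), library('pkg') or library("pkg"). Nothing here is part of groundhog: `groundhogify()` and its arguments are the hypothetical names from the proposal above, and the regular expression is an illustrative choice that leaves more complex calls untouched.

```r
# Hypothetical helper, not part of groundhog: rewrite simple library() calls
# so that they go through groundhog.library() with a fixed date.
groundhogify <- function(original, new, date) {
  lines <- readLines(original)

  # Matches library(pkg), library('pkg') or library("pkg") with one package
  # name. The lookbehind skips names such as groundhog.library().
  pattern <- "(?<![.\\w])library\\(\\s*(['\"]?)([A-Za-z][A-Za-z0-9.]*)\\1\\s*\\)"
  replacement <- sprintf("groundhog.library('\\2', '%s')", date)

  # Leave an existing library(groundhog) line alone.
  is_gh <- grepl("library\\(\\s*['\"]?groundhog['\"]?\\s*\\)", lines)
  out <- lines
  out[!is_gh] <- gsub(pattern, replacement, lines[!is_gh], perl = TRUE)

  # The converted script needs groundhog itself to be attached first.
  if (!any(is_gh)) out <- c("library(groundhog)", out)

  writeLines(out, new)
  invisible(out)
}
```

Called as groundhogify(original = "script.R", new = "reproducible.R", date = "2022-05-08"), this would write a copy of the script with every simple library() call replaced. A matching degroundhogify() could apply the reverse substitution.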

The idea is that I imagine many people would work with a script for a while until they are ready to publish the process and include it in a paper, or to archive it, and at this stage they would like to "fix" the versions of the packages with groundhog. It's also possible that people have large existing scripts they would like to groundhogify because they only discovered the package after completing their work.

These could also be two extra items in the RStudio addin menu, but that might be overkill...

Does that make sense? Or is it a bad idea?

urisohn commented 2 years ago

I agree this is a problem worth fixing. My first instinct would be to keep it within the script itself rather than have an external script process it. I think the nicest solution would be to put all the library calls inside some quotes, and have a regular groundhog.library() call sandwich them.

So say the original script looks like this:

library('pkg1')
library('pkg2')
library('pkg3')
library('pkg4')

The goal would be to simply put that inside a groundhog call, like this:

groundhog.library(
library('pkg1')
library('pkg2')
library('pkg3')
library('pkg4'),
 '2022-04-01')

But I don't know if there is an easy way to create "super-quotes" that would allow escaping all single and double quotes encountered, so that library('pkgA') and library("pkgB") would both be processed correctly.
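One possible answer to the "super-quotes" question, assuming R 4.0.0 or later: raw string literals written as r"(...)" keep single and double quotes verbatim. The sketch below only shows that such a string can hold both quoting styles and that the package names can be pulled out of it with a regular expression. `extract_pkgs()` is a hypothetical helper for illustration, not how groundhog processes its input.

```r
# Raw string (R >= 4.0.0): everything between r"( and )" is taken literally,
# so no escaping of ' or " is needed.
calls <- r"(
library('pkgA')
library("pkgB")
library(pkgC)
)"

# Hypothetical helper: pull the package name out of each library() call.
extract_pkgs <- function(txt) {
  pattern <- "library\\(\\s*['\"]?([A-Za-z][A-Za-z0-9.]*)['\"]?\\s*\\)"
  m <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
  sub(pattern, "\\1", m, perl = TRUE)
}

extract_pkgs(calls)
```

If the quoted text could itself contain the sequence )" then the delimiter can be lengthened with dashes, for example r"-(...)-".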

I will think a bit more about this. Let me know if you have suggestions.

stragu commented 2 years ago

Thanks for considering it! I'm certainly not the best person to suggest what would be the best solution... but I just thought users might be tempted to "convert" their script with an extra line at the top of the script:

library(groundhog)
library <- function(package, date = '2022-04-01') groundhog.library(package, date)
library('pkg1')
library('pkg2')
library('pkg3')
library('pkg4')

Which seems quite cursed. So if we could deter people from doing that... :sweat_smile:

urisohn commented 2 years ago

Ha, that will work too, but it feels a bit heretical to modify library(). Maybe end it with library <- base::library, to return it to normal behavior immediately after.
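A sketch of that mask-then-restore pattern. So that it runs without groundhog installed, it uses a stand-in called `fake_groundhog_library()` (a hypothetical name) that only records what it was asked to load. In a real script the wrapper would call groundhog.library() instead. It restores with rm(), which is an alternative to reassigning base::library: it deletes the masking copy rather than adding a second one.

```r
# Stand-in for groundhog.library(), assumed here only so the sketch is
# self-contained; it records the requested package and date.
loaded <- character(0)
fake_groundhog_library <- function(pkg, date) {
  loaded <<- c(loaded, paste(pkg, date))
}

# Mask base::library() with a wrapper that pins the date.
library <- function(package, date = "2022-04-01") {
  fake_groundhog_library(package, date)
}

library("pkg1")
library("pkg2")

# Remove the mask so that library() resolves to base::library() again.
rm(library)
identical(library, base::library)
```

Keeping the masked region short, and restoring right after the last package is loaded, limits the chance that later code calls the wrapper by accident.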

I just tried the sandwiching solution and it is easy enough to make it work, as long as all library() calls use single quotes (or no quotes). It is not in GitHub's version yet (I am working on other stuff), but I may include this with v2.0.0. But your heretical solution is simple too.

Thanks!

stragu commented 2 years ago

Great that you've got your preferred solution working. It would be a great addition to a very useful package. And yes, "heretical" and "cursed" are accurate descriptors for my "solution". I feel dirty posting it in public.

urisohn commented 2 years ago

Version 2.0, now on CRAN, includes the solution:

groundhog.library("
library('pkg1')
library('pkg2')
library('pkg3')
library('pkg4')",
"2022-04-01")

Thanks for bringing this issue up.

stragu commented 2 years ago

Fabulous, thank you!