Open goldingn opened 9 years ago
I wrote this function to grab the versions of named packages in a named library (by default, whatever's returned by .Library):
# return the version of the package `pkg` installed in library `lib`
getVersion <- function (pkg, lib = .Library) {

  # vectorise by recursion
  if (length(pkg) > 1) {
    ans <- sapply(pkg,
                  getVersion,
                  lib)
    return (ans)
  }

  # get path to package description
  desc_path <- sprintf('%s/%s/DESCRIPTION',
                       lib,
                       pkg)

  # check it exists
  if (!file.exists(desc_path)) {
    warning (sprintf('package %s is not in the specified library, returning NA',
                     pkg))
    return(NA)
  } else {
    lines <- readLines(desc_path)
    vers_line <- lines[grep('^Version: *', lines)]
    vers <- gsub('Version: ', '', vers_line)
    return (vers)
  }
}
e.g.
getVersion('zoon')
[1] "0.3.2"
getVersion(list.files(.Library))
abind AntWeb ape
"1.4-3" "0.7" "3.3"
assertthat base base64enc
"0.1" "3.2.2" "0.1-3"
BH biomod2 bitops
"1.58.0-1" "3.1-64" "1.0-6"
boot brew caTools
"1.3-17" "1.0-6" "1.17.1"
...
which should be a start.
I imagine that there are headaches down the road with multiple libraries.
We could enforce .Library as a default, allowing users to change it if they want, provided they tell us what library they are using. I.e. this as the usually unseen default:
workflow(..., library = .Library)
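To make the multiple-library headache concrete, here is a rough sketch of a lookup that searches several libraries in order and reports the first version it finds. The helper name getVersionAcross and its `libs` argument are made up for illustration and are not part of zoon; it leans on utils::packageDescription rather than parsing DESCRIPTION by hand.

```r
# Hypothetical helper (not part of zoon): look up the installed version of each
# package in `pkgs`, searching the libraries in `libs` in order (first match wins).
getVersionAcross <- function (pkgs, libs = .libPaths()) {
  vapply(pkgs, function (pkg) {
    for (lib in libs) {
      # packageDescription() returns NA (with a warning) if pkg is not in lib
      vers <- suppressWarnings(
        utils::packageDescription(pkg, lib.loc = lib, fields = 'Version')
      )
      if (!is.na(vers)) return(as.character(vers))
    }
    # not found in any of the libraries
    NA_character_
  }, character(1))
}
```

Something like getVersionAcross(c('stats', 'zoon')) would then return a named character vector, with NA for anything not installed in any of the libraries searched.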
Packages which provide alternative approaches include:
packrat, which sets up a package library in a user's directory and does some version control on it. I don't think this is very close to what we want though.
checkpoint, which talks with a Revolution R server that copies the CRAN binaries at midnight every day. We could force users to install packages afresh (rather than using whatever they already have installed) in every zoon, and then be able to fetch those again. Assuming, that is, that the package wasn't updated during the day in question and that Revolution R maintains that server...
I think we'd be better off avoiding these two, but open to suggestions.
Reinstalling packages on a regular basis sounds like a big turn off and a pain.
Maybe the checkpoint idea, but by default zoon just uses what's available (and records what it used). Then have an argument to enforce perfectly reproducing a workflow. This would only be used if someone is failing to reproduce a workflow.
There's forceReproducible already in workflow. But most of this discussion really refers to running rerunWorkflow on a workflow object.
Right, the checkpoint thing would work if we only installed that day's package in a force reproducible call.
ReRunWorkflow calls will be rare enough that the overhead of installing specific versions afresh shouldn't be an issue.
Maybe we just try to match by package version visible in the library by the end of the workflow as the standard method. That doesn't require fresh downloading.
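A minimal sketch of that matching step, assuming the recorded versions are held as a named character vector (names are package names). The helper name checkVersions is hypothetical; it only compares against what is installed and downloads nothing.

```r
# Hypothetical helper (not part of zoon): compare recorded package versions
# with the versions currently installed in the library `lib`.
checkVersions <- function (recorded, lib = .Library) {
  ip <- utils::installed.packages(lib.loc = lib)
  installed <- stats::setNames(ip[, 'Version'], rownames(ip))

  # packages missing from `lib` come back as NA
  current <- unname(installed[names(recorded)])

  data.frame(package = names(recorded),
             recorded = unname(recorded),
             installed = current,
             matches = !is.na(current) & current == unname(recorded),
             stringsAsFactors = FALSE)
}
```

Any row with matches == FALSE would be a candidate for a warning, or for a fresh install in a forceReproducible-style call.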
We could do checkpoint in forceReproducible calls if needed, though I'm not sure if that would add much...
Would be great if we could work out how to install binaries of specific versions from checkpoint's MRAN server (which we can query by date). checkpoint may do this internally, or we could scrape something...
So it looks simple enough to scrape CRAN's archives for version publication dates, then download the specific package version as a binary from MRAN, avoiding the checkpoint package altogether (we can get the required day's mirror as e.g. MRAN.revolutionanalytics.com/snapshot/20140909)
Sorry, that's https://MRAN.revolutionanalytics.com/snapshot/2014-09-09
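As a sketch of the second half of that idea (the CRAN archive scraping is not covered here), the snapshot URL can be built from a date and passed to install.packages as the repository. Both helper names are made up for illustration, and nothing below checks whether the snapshot server is reachable or still holds that day's mirror.

```r
# Hypothetical helper: build the URL of the MRAN snapshot for a given date,
# following the URL pattern quoted above
snapshotRepo <- function (date) {
  date <- as.Date(date)
  sprintf('https://MRAN.revolutionanalytics.com/snapshot/%s',
          format(date, '%Y-%m-%d'))
}

# Hypothetical helper: install `pkg` as it stood on `date`, using that day's
# snapshot as the only repository (this needs network access)
installFromSnapshot <- function (pkg, date, lib = .libPaths()[1]) {
  utils::install.packages(pkg, repos = snapshotRepo(date), lib = lib)
}
```

For example, snapshotRepo('2014-09-09') gives the URL in the comment above.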
ooh, look at this new R package that's appeared that does just what we want: https://github.com/goldingn/versions
will get it on CRAN soon
On CRAN now: https://cran.r-project.org/web/packages/versions/
Nick you are a machine!
Did you just reinvent switchr? https://github.com/gmbecker/switchr
Ha! I looked around but never found that one.
versions has no dependencies and is definitely multi-platform. Judging by the vignette, switchr needs RTools on Windows since it installs from source, but it does have a nice facility for handling multiple libraries.
We can go with whichever works best!
So currently the session info is captured in a workflow:
w <- workflow(UKAnophelesPlumbeus,
              UKAir,
              OneHundredBackground,
              LogisticRegression,
              SameTimePlaceMap)
w$session.info
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] rgdal_1.1-3 viridis_0.3.2 htmlwidgets_0.5
[4] leaflet_1.0.0 randomForest_4.6-12 dismo_1.0-15
[7] zoon_0.4.21 raster_2.5-2 sp_1.2-2
loaded via a namespace (and not attached):
[1] Rcpp_0.12.3 magrittr_1.5 munsell_0.4.2
[4] colorspace_1.2-6 lattice_0.20-33 R6_2.1.2
[7] httr_1.1.0 plyr_1.8.3 tools_3.2.3
[10] grid_3.2.3 gtable_0.1.2 htmltools_0.3
[13] yaml_2.1.13 digest_0.6.9 rfigshare_0.3.7
[16] RJSONIO_1.3-0 gridExtra_2.0.0 ggplot2_2.0.0
[19] bitops_1.0-6 RCurl_1.95-4.7 scales_0.3.0
[22] XML_3.98-1.3 httpuv_1.3.3
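I can then use your package to install all these packages at the beginning of the rerun:
# Something like...
library(versions)
# package descriptions recorded by sessionInfo() when the workflow was run
pkgs <- c(w$session.info$otherPkgs, w$session.info$loadedOnly)
# pull out the name and version of each package
pkg_names <- vapply(pkgs, function (x) x$Package, character(1))
pkg_versions <- vapply(pkgs, function (x) x$Version, character(1))
install.versions(pkgs = pkg_names, versions = pkg_versions)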
This seems fine, but I worry about overwriting all the versions that the user currently has installed. We need a way to reverse that afterwards. I could simply do the same thing in reverse (capture session info and reinstall the previous versions at the end of the workflow), but I wonder if there is something more elegant?
I also have an error message which I have posted here https://github.com/goldingn/versions/issues/5
It's a good point, maybe something switchr-like to install the packages in a temp library would be a good shout?
Perhaps that behaviour should be optional as it incurs a significant overhead installing all the used packages and their dependencies. Something like a forceReproducible argument for re-running someone else's workflow? Or a cleanLibrary option?
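A rough sketch of the temp-library idea, assuming it is enough to put a throwaway library at the front of the search path for the duration of a rerun and then restore the original paths. The helper name withTempLibrary is hypothetical, and packages whose namespaces are already loaded in the session would not be swapped by this alone.

```r
# Hypothetical helper (not part of zoon): evaluate `expr` with a fresh temporary
# library at the front of the library search path, then restore the original paths.
withTempLibrary <- function (expr) {
  old_paths <- .libPaths()
  tmp_lib <- tempfile('zoon_lib_')
  dir.create(tmp_lib)

  # undo everything when the function exits, even after an error
  on.exit({
    .libPaths(old_paths)
    unlink(tmp_lib, recursive = TRUE)
  }, add = TRUE)

  .libPaths(c(tmp_lib, old_paths))

  # `expr` is a lazily evaluated argument, so it only runs here, once the
  # temporary library is in place
  force(expr)
}
```

The install and the rerun would then both go inside the call, with installs directed at .libPaths()[1], so the user's own library is never overwritten and there is nothing to reverse afterwards.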
Thanks, will check out the bug!
Currently zoon ignores the fact that packages change version between the time a workflow is written and when it is reproduced.
If we could store the versions for all packages being used when running a workflow, we should be able to reinstall the same version when re-running it.
e.g. devtools::install_version will rebuild from the tarballs in the CRAN archives
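A minimal sketch of how stored versions might be fed to devtools::install_version, assuming they are kept as a named character vector. The helper name reinstallVersions and its dry_run switch are made up for illustration; with the default dry_run = TRUE nothing is installed and the calls are only returned as text.

```r
# Hypothetical helper (not part of zoon): reinstall each package at its
# recorded version. `versions` is a named character vector (names are package
# names). With dry_run = TRUE the calls are only returned as text.
reinstallVersions <- function (versions, dry_run = TRUE) {
  calls <- sprintf("devtools::install_version('%s', version = '%s')",
                   names(versions), unname(versions))
  if (!dry_run) {
    # needs devtools, network access and the tools to build from source
    for (i in seq_along(versions)) {
      devtools::install_version(names(versions)[i], version = versions[[i]])
    }
  }
  calls
}
```

For instance, reinstallVersions(c(zoon = '0.3.2')) returns the single call that would be made, without running it.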