Open thibautjombart opened 7 years ago
I think this is something that @hrbrmstr and @ironholds have discussed.
See also: http://ropensci.org/blog/technotes/2016/10/19/gpg-release by @jeroen
Cool stuff!
Naïve question: does anyone know of any document discussing how to apply the Reproducible Builds framework in an R context? https://reproducible-builds.org/docs/definition/
Multiple discussions in this issue list have made me wonder about this.
Should the checksums be calculated for the built package tarballs? If not, how does this idea (manually run in one's current package working directory) differ from version control hashes?
I am only just discovering Reproducible Builds, but it looks like a good place to start.
The checksum would be computed on the tar.gz, which could be generated on the fly from the current directory, provided that directory is a package.
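To make the tar.gz idea concrete, here is a minimal sketch in Python (only for illustration, nothing here is R-specific). The names `sha256_of_file` and `matches_published` are hypothetical, and it assumes the expected digest has been published somewhere trustworthy. One caveat: a tarball generated on the fly usually embeds file modification times, so two builds of identical sources may not hash the same, which is the problem the Reproducible Builds work addresses.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published one (hypothetical helper).

    Surrounding whitespace and letter case in the published digest are ignored.
    """
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Usage would be something like `matches_published("mypkg_0.1.0.tar.gz", published_digest)`, where both the file name and `published_digest` are placeholders.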
I could see some use for running it on the file contents rather than on the tar.gz, e.g. to track down which parts of the package differ from expectation: R code (parsed or not), the contents of /src, /man, etc. This sounds like a more complex task, though the fixed package structure should help. It would indeed overlap quite a bit with what Git does with tree objects (not sure how other VCSs handle this).
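A rough sketch of the per-file variant, again in Python for illustration: hash every file under the package directory into a manifest keyed by relative path, then compare two manifests to see which files differ. `hash_tree` and `diff_manifests` are hypothetical names. This hashes raw bytes only; it does not parse R code, and it is not how Git computes tree objects.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root, POSIX style) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def diff_manifests(expected, actual):
    """Report files that are missing, unexpected, or changed relative to `expected`."""
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "changed": sorted(name for name in shared if expected[name] != actual[name]),
    }
```

With a reference manifest stored alongside a release, `diff_manifests(reference, hash_tree("path/to/pkg"))` would point at the specific files under R/, src/ or man/ that no longer match.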
Just spotted the following in Kurtzer et al., Singularity: Scientific Containers for Mobility of Compute, PLOS ONE, May 2017, which is relevant to this idea despite being US-oriented:
"Via direct integration of SHA256 hashing ..., Singularity will provide a method of container validation that guarantees that a container image being distributed has not been modified or changed. This is essential in ensuring compliance with White House Office of Management and Budget Circular A-110, which states the need for data preservation and the ability to validate research results for federally funded projects."
(Not sure whether that document, "Circular A-110, Revised 11/19/93, As Further Amended 9/30/99", will survive, since it has been removed from the official site, but that's a different story.)
It would be cool to sit together with a few folks who are interested in security to generate, publish, and sign PGP keys, and discuss how to improve security of package installations. Some use cases:
Having a few prominent R developers with established GPG keys will make it easier to build a GPG trust community and to work on tools around this.
@hrbrmstr @hadley
Something we discussed with Dirk Schumacher and @richfitz. The idea would be to verify a package's integrity using hashing. Example use could be something along the lines of: