Closed by jeromyanglim 12 years ago
There are a few exceptions:
I also feel that it is not enough to simply share a repository. It's important to make the repository user-friendly. User-friendly could mean:
I see science as a collaborative process. One of the major benefits of reproducible research is that it helps others see exactly how to analyse research data of a given sort.
However, some researchers might see this as a negative if they seek to be the dominant figure in a particular area.
Naturally, this raises the question of why anyone would use "un-automatable" software. However,
There is a wide spectrum of data analytic misconduct. If we take a legal perspective, we can think of different kinds of intentions (intentional, reckless, negligent) and consequences (how consequential was it to the paper's findings, etc.).
I have heard advocates of open source software state that one reason why open source software is better than proprietary software is that its code is on display to the community. A similar process might operate in a reproducible data analysis context. Researchers would be more inclined to adopt workflows and procedures that keep their analyses clean and tidy. They would be more likely to incorporate quality control procedures that check for possible errors.
It would be interesting to see how journals deal with the potential increase in errata that might emerge. At present, while journals permit errata, publishing one generally seems to me to be a fairly big deal. In contrast, software is often framed as a work under development where bugs are identified and gradually fixed. Admittedly in some respects, journal articles are more static in their scope and application than are
In some instances, sharing various algorithms or metadata may be prohibited by copyright restrictions.
Clearly, most researchers don't analyse their data with reproducible data analysis tools like knitr and Sweave.
For practical purposes I operationalise reproducible analysis as:
knitr or Sweave with R and LaTeX, plus a build script such as a makefile, all shared as a self-contained archive file, is one way of satisfying the above criteria.
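As a rough illustration of the "build script plus self-contained archive" idea, here is a minimal sketch in Python. It is only one possible arrangement, not a recommended standard: the file name `report.Rnw`, the function names, and the use of Python rather than a makefile are all assumptions made for the example. It also assumes that `Rscript`, the knitr package, and `pdflatex` are installed, and that the archive is written somewhere outside the project directory.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import List


def build_commands(rnw_file: str) -> List[List[str]]:
    """Return the commands that turn an .Rnw source file into a PDF."""
    tex_file = str(Path(rnw_file).with_suffix(".tex"))
    return [
        # knitr::knit() runs the R code chunks and writes the .tex file
        ["Rscript", "-e", f"knitr::knit('{rnw_file}')"],
        # pdflatex is run twice so that cross-references can resolve
        ["pdflatex", "-interaction=nonstopmode", tex_file],
        ["pdflatex", "-interaction=nonstopmode", tex_file],
    ]


def run_build(project_dir: str, rnw_file: str = "report.Rnw") -> None:
    """Run each build step inside project_dir, stopping at the first failure."""
    for cmd in build_commands(rnw_file):
        subprocess.run(cmd, cwd=project_dir, check=True)


def make_archive(project_dir: str, archive_path: str) -> List[str]:
    """Pack the whole project directory into one .tar.gz and list its members."""
    root = Path(project_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(root, arcname=root.name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Calling `run_build("analysis")` followed by `make_archive("analysis", "analysis.tar.gz")` would rebuild the report from raw inputs and then bundle the data, code, and output into a single file that can be shared. A makefile could express the same steps and would have the advantage of only rerunning what has changed.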