Open rgrant301 opened 3 years ago
"The intention of a compendium is to increase the reproducibility of the published work"
Reproducibility is a critical component, for sure, but what about re-use? How do you think the research compendium framework helps different groups build on each other's work?
"could be interpreted differently by different people, or just be difficult to implement because of the lack of detailed description"
Botvinik-Nezer et al. (2020) is a great paper about the pitfalls of this exact issue. Strongly recommended read!
A research compendium accompanies published work and describes the data, environment, and code used in the analysis to produce the published output. The intention of a compendium is to increase the reproducibility of the published work by allowing users to replicate the original analysis methods using the original data.
The majority of the data was published with the paper. Smaller tables were included with the paper itself as supplements, and larger datasets were deposited in public databases (e.g. Gene Expression Omnibus). The analysis was described in the methods and supplemental methods, but no code was provided. As a specific example of where another scientist might have trouble replicating the results: in our analysis, we create and use a human-chimpanzee composite genome and describe various rules for filtering out incorrectly matched reads. The composite genome itself is not provided, and the rules for filtering out incorrectly matched reads could be interpreted differently by different people, or just be difficult to implement because of the lack of detailed description and lack of code availability. I did not conduct this portion of the analysis for the paper, and I would have trouble implementing it from scratch, even though I’ve attended many meetings describing the methods and the raw data came from my own experiments.
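To make the ambiguity concrete, here is a minimal Python sketch. It is not the filtering used in the paper. The rule itself, the `Read` fields, and the `min_margin` threshold are all hypothetical and invented purely for illustration. The sketch assumes each read has an alignment score against the human half and the chimpanzee half of a composite genome, and that a read is kept only when one species wins by at least some margin. Two people reading the same prose description could reasonably pick different margins and end up with different read sets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Read:
    name: str
    human_score: int  # hypothetical alignment score against the human half
    chimp_score: int  # hypothetical alignment score against the chimpanzee half


def assign_species(read: Read, min_margin: int) -> Optional[str]:
    """Return 'human' or 'chimp', or None when the read is too ambiguous to keep.

    Hypothetical rule: the winning species must beat the other by at least
    `min_margin` score points; exact ties are always discarded.
    """
    margin = read.human_score - read.chimp_score
    if margin == 0 or abs(margin) < min_margin:
        return None
    return "human" if margin > 0 else "chimp"


# Made-up reads, chosen only to show where two interpretations diverge.
reads = [
    Read("r1", 60, 20),
    Read("r2", 42, 40),
    Read("r3", 30, 30),
    Read("r4", 10, 55),
]

# Interpretation A: any strict winner is accepted.
lenient = {r.name: assign_species(r, min_margin=1) for r in reads}
# Interpretation B: the winner must lead by at least 10 points.
strict = {r.name: assign_species(r, min_margin=10) for r in reads}

# Reads whose fate depends on which interpretation was chosen.
disputed = [name for name in lenient if lenient[name] != strict[name]]
```

With these made-up scores, `disputed` contains only `r2`: the lenient reading assigns it to human, while the strict reading discards it. Shipping even a small function like this in a compendium would replace a prose rule with a single unambiguous definition.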