dlebauer opened this issue 7 years ago

Description

Please provide use cases below as comments, comment on use cases, and use the 👍 flag!

@robkooper @max-zilla @pless @terraref/developers @terraref/standards-committee @ludaesch @tmcphillips
Here are some use cases:
Another use case for reproducibility is enabling researchers to independently reproduce any of the data products and other results ultimately derived from input sensor data sets and published by TERRA REF. (This may be implied by use case 3 above, 'Support Journal and Funder expectations of reproducibility').
For this, as well as the other use cases for provenance/reproducibility, it may be worthwhile to elaborate each use case into user stories detailed enough to highlight what would actually be required (via what sequence of steps, using what data/metadata and compute resources, and by whom) to achieve the desired reproducibility result or to answer a particular class of provenance queries.
For example, if someone wanted to independently reproduce one of the data products/data sets/sets of metadata published by TERRA REF, would they need to run their own instances of Clowder and RabbitMQ to serve as a workflow engine in order to re-perform all of the relevant computations? Would this be practical, or even feasible? Would they easily be able to discover and install all of the relevant extractors and input data sets needed for the calculation? (This last question implies several additional provenance use cases to consider targeting.)
If someone wanted to use the centrally maintained instances of Clowder/RabbitMQ for this purpose instead (i.e. not truly independently of the official project software installations and computing infrastructure, so only ‘recomputing’ the result, not ‘reproducing’ it in the more rigorous sense), would they be able to request that the correct versions of each extractor be used for these re-computations?
Etc.
It sounds like there is consensus that TERRA Ref products might be more practical to reproduce (especially by others, using their own compute resources) if one could export a representation of the effective workflow (sequence of extractors, parameter settings, and references to data sets) that ultimately yielded a particular product. Such an exported workflow representation could allow for rerunning just that workflow independently of Clowder/RabbitMQ, i.e. as a standalone software pipeline (or Python script or Makefile, etc) with the outputs of one extractor passing directly as inputs to the next extractor, thus making the workflow runnable in the absence of significant computing infrastructure (e.g. using just Python and a few pip-installable Python packages).
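To make this concrete, here is a minimal sketch of what such an exported workflow representation and standalone runner might look like, assuming a JSON step list and pip-installable extractor functions. The extractor names, the schema, and the parameters are hypothetical illustrations, not the actual TERRA REF extractors:

```python
# Hypothetical sketch of an exported workflow run independently of
# Clowder/RabbitMQ. Extractor names and the JSON schema are invented
# for illustration only.
import json

# An exported workflow: an ordered list of steps, each naming an
# extractor, its pinned version, and its parameter settings.
workflow = json.loads("""
{
  "inputs": ["raw_sensor_scan.bin"],
  "steps": [
    {"extractor": "demosaic",   "version": "1.2.0", "params": {"pattern": "GBRG"}},
    {"extractor": "calibrate",  "version": "0.9.1", "params": {"gain": 1.8}},
    {"extractor": "index_ndvi", "version": "2.0.3", "params": {}}
  ]
}
""")

# Stand-ins for pip-installable extractor implementations; in practice
# each would be resolved from the pinned version recorded above.
def demosaic(data, pattern):
    return f"demosaiced({data}, pattern={pattern})"

def calibrate(data, gain):
    return f"calibrated({data}, gain={gain})"

def index_ndvi(data):
    return f"ndvi({data})"

EXTRACTORS = {"demosaic": demosaic, "calibrate": calibrate, "index_ndvi": index_ndvi}

# Run the workflow as a plain pipeline: the output of one extractor
# passes directly as input to the next.
data = workflow["inputs"][0]
for step in workflow["steps"]:
    data = EXTRACTORS[step["extractor"]](data, **step["params"])
print(data)
```

A representation like this could also serve as a provenance record in its own right, since it captures the extractor versions and parameter settings alongside the data references.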
I generally like the idea of making it easy to reproduce results outside of the original computational infrastructure, because such infrastructure can itself easily become hard to reproduce or to document rigorously and understandably. (VMs go part of the way, and software containers go further, but there's nothing like a from-scratch installation to convince one that a reported result can be reproduced.) We've probably all heard of cases where a seemingly significant result could not be reproduced by the original researchers following a minor version upgrade of a C++ compiler.
Most of the code can be run independently, so you can download the code and data and run the code on the data. Any parameters used should either be documented in the code or, hopefully, stored as metadata.
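As one hedged sketch of what "stored as metadata" could look like in practice, a run could write a JSON sidecar next to each output recording the parameters and a checksum of the result; the field names here are illustrative, not an established TERRA REF schema:

```python
# Illustrative sketch: record run parameters as a JSON sidecar next to
# the output file. Field names are hypothetical.
import json, hashlib, pathlib

params = {"gain": 1.8, "pattern": "GBRG"}
input_path = pathlib.Path("raw_sensor_scan.bin")
output_path = pathlib.Path("calibrated_scan.bin")

# ... the actual computation would run here, producing output_path ...
output_path.write_bytes(b"placeholder output")

sidecar = {
    "parameters": params,
    "input": input_path.name,
    # Hashing the output lets a later re-run confirm it reproduced the result.
    "output_sha256": hashlib.sha256(output_path.read_bytes()).hexdigest(),
}
output_path.with_suffix(".json").write_text(json.dumps(sidecar, indent=2))
```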
The use case that @tmcphillips outlines is important; it is related to the need to be able to deploy algorithms on different platforms (e.g. deploying a pipeline on an Arduino so it can compute values of interest before storing them).
I think the key is implementing an SOP and framework so that we can remove the 'most', 'should' and 'hopefully' from @robkooper's statement, and ensure that the process of downloading the code and data and running the code on the data is as easy as possible. Some of this should be captured in the READMEs and tests proposed in #160.
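For instance, the tests proposed in #160 might include a reproducibility smoke test along these lines. `run_pipeline` here is a hypothetical stand-in for the real entry point, and determinism across two runs is only the weakest useful check; comparing against a pinned reference checksum would be stronger:

```python
# A minimal sketch, assuming a pytest-style test suite like the one
# proposed in #160. run_pipeline is a hypothetical stand-in.
import hashlib

def run_pipeline(data: bytes) -> bytes:
    # Stand-in for the real pipeline entry point.
    return data[::-1]

def _digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def test_pipeline_is_deterministic():
    # Two runs on the same checked-in fixture should be byte-identical.
    sample = b"tiny checked-in fixture"
    assert _digest(run_pipeline(sample)) == _digest(run_pipeline(sample))
```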