Open rayi113 opened 9 years ago
Could the "Code as a research object" project (https://github.com/mozillascience/code-research-object) from Mozillascience be a useful tool?
@alesarrett Yes, absolutely. See the work we've already done as part of that project in my codemeta repository, which has crosswalks among existing software metadata efforts and a proposed JSON-LD context for software metadata. We also have a working PROV extension in DataONE. For an example of the kind of provenance tracing I think needs to be widespread, see the Ma et al. paper titled "Capturing provenance of global change information", doi:10.1038/nclimate2141.
Use Case: Link Papers to Script Runs and Scripts
Goal and Summary
A scientist can provide citation links between scholarly papers and the particular executions of a script or model from which they were derived. These links can run from the paper to a published identifier for the execution trace, or from the trace to the paper via, e.g., a DOI. For a scientist using, e.g., R or Matlab, the goal is to link a paper to a given script run in order to document the exact process used to derive published data, figures, or tables. After executing a script and recording provenance information about the run, the researcher can later update the execution's provenance information with a link to a permanent identifier (such as a DOI) of the published document. Other researchers can then discover the papers by following the links from the script executions.
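The workflow above (record a run, attach a paper's DOI later, then discover papers from runs) might look roughly like the following. This is a minimal sketch in Python rather than R or Matlab, using an in-memory dictionary as a stand-in for a real provenance store. All function names (`record_run`, `link_publication`, `papers_for_script`, `runs_for_paper`), the record layout, and the DOI `10.5555/example-paper` are hypothetical; this is not the DataONE or PROV API.

```python
import uuid
from datetime import datetime, timezone


def normalize_doi(doi):
    """Strip common DOI prefixes and lowercase (DOIs are case-insensitive)."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def record_run(script_path, store):
    """Record a minimal provenance entry for one script execution.

    `store` is a plain dict standing in for a provenance repository
    (an assumption made for illustration only).
    """
    run_id = f"urn:uuid:{uuid.uuid4()}"
    store[run_id] = {
        "script": script_path,
        "started_at": datetime.now(timezone.utc).isoformat(),
        "publications": [],  # DOIs of papers derived from this run
    }
    return run_id


def link_publication(run_id, doi, store):
    """Attach a paper's DOI to an existing run; repeated calls are no-ops."""
    doi = normalize_doi(doi)
    publications = store[run_id]["publications"]
    if doi not in publications:
        publications.append(doi)


def papers_for_script(script_path, store):
    """Discovery, run to paper: all DOIs linked to any run of a script."""
    return {
        doi
        for record in store.values()
        if record["script"] == script_path
        for doi in record["publications"]
    }


def runs_for_paper(doi, store):
    """Discovery, paper to run: all run identifiers linked to a DOI."""
    doi = normalize_doi(doi)
    return [run_id for run_id, rec in store.items() if doi in rec["publications"]]


if __name__ == "__main__":
    store = {}
    run_id = record_run("analysis.R", store)
    # Later, once the paper is published (hypothetical DOI):
    link_publication(run_id, "doi:10.5555/example-paper", store)
    print(papers_for_script("analysis.R", store))
    print(runs_for_paper("10.5555/example-paper", store))
```

Storing the DOI on the run record while also offering a reverse lookup covers both link directions mentioned above. A real implementation would presumably express the same relation in a standard such as PROV and persist it in a repository.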
Why is it important and to whom?
It is critical to computational reproducibility
It improves discovery by making it possible to find all papers related to a particular analysis
Why hasn’t it been solved yet?
Repositories for execution traces are rare, and those that exist are not well established
Tools for generating execution traces are not widely adopted
Standards for representing traces have been in flux (e.g., OPM, PROV)
Actionable Outcomes
Additional Information and Links
DataONE Use Case 42A: Link Papers to Script Runs