Open maurolepore opened 6 years ago
This is a great approach. It is of course related to #42, but potentially applies to very different workflows. For projects where I'm not compiling the final output, I like to have an outputs folder which has not only images and tables but also an Rmd or text file output with all the essential quantitative values that make their way into the manuscript. Usually things can be traced back from the filenames there.
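A minimal sketch of that outputs-folder idea, in base R. The file names, the script path, and the use of the built-in `iris` data are all assumptions made for illustration; they are not taken from the comment above.

```r
# Fit a model on a built-in dataset (a stand-in for a real analysis).
fit <- lm(Sepal.Width ~ Petal.Width, data = iris)

# Collect the values that will be quoted in the manuscript,
# each tagged with the script that produced it.
key_values <- data.frame(
  name   = c("n_obs", "slope_petal_width", "r_squared"),
  value  = c(
    nrow(iris),
    unname(coef(fit)["Petal.Width"]),
    summary(fit)$r.squared
  ),
  source = "R/01_fit_model.R"  # hypothetical script path
)

# Write the table next to the figures and tables in outputs/.
dir.create("outputs", showWarnings = FALSE)
write.csv(
  key_values,
  file.path("outputs", "manuscript_values.csv"),
  row.names = FALSE
)
```

Each number in the manuscript can then be looked up by name in `outputs/manuscript_values.csv`, and the `source` column points back to the script that computed it.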
`drake` + literate programming may help a bit. `drake`'s main example has a data analysis workflow with this R Markdown report at the very end. The active code chunk has calls to `loadd(fit)` and `readd(hist)`, which tell `drake` to treat `fit` and `hist` as formal dependencies (so `drake::make()` rebuilds `report.html` if there is a change to `fit` or `hist`). Even if you don't care about Make-like build management, you can still see where these data objects fit into the pipeline. In that sense, using and annotating an artifact are one and the same.
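A rough sketch of what such a plan might look like, assuming the `drake` package is installed. The `create_plot()` helper, the use of `iris`, and the file names are placeholders for illustration, not a copy of the actual example.

```r
library(drake)

# Hypothetical helper standing in for the example's plotting function.
create_plot <- function(data) {
  hist(data$Petal.Width, plot = FALSE)
}

# Each argument is a target; drake infers dependencies from the commands.
plan <- drake_plan(
  data = iris,
  hist = create_plot(data),
  fit = lm(Sepal.Width ~ Petal.Width + Species, data = data),
  # knitr_in() asks drake to scan report.Rmd for loadd()/readd() calls,
  # so `fit` and `hist` become dependencies of the report.
  report = rmarkdown::render(
    knitr_in("report.Rmd"),
    output_file = file_out("report.html"),
    quiet = TRUE
  )
)

# Inside report.Rmd, an active chunk would contain something like:
#   library(drake)
#   loadd(fit)    # load the fitted model into the session
#   print(fit)
#   readd(hist)   # return the stored histogram object

# make(plan)  # builds outdated targets; needs report.Rmd and rmarkdown
```

With a plan like this, changing the command for `fit` or `hist` and calling `make(plan)` again should rebuild those targets and then the report, while up-to-date targets are skipped.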
I am curious to know the views of @gmbecker and @duncantl on the original issue. As I understand it, provenance is a major focus of `trackr`, `RCacheSuite`, and `CodeDepends`.
Edit: as for linking data objects back to the source code, the dependency graph shows the functions that generated `fit` and `hist`. That's an important point I forgot to add. The previous graph excluded functions. See below for the full graph.
Awesome! I'm learning so much and the unconference hasn't even started! Thank you!
It's such a fantastic crowd! I wish I could be at unconf to soak up more knowledge in person.
Summary:
How do you link a result in your manuscript back to its source code? This is fundamental to reproducible research. It seems basic and straightforward but, in the wild world I live in, it is not. Research gets messy quickly: after a few weeks out of touch with a project, wish me luck finding my own stuff; and forget about finding code in a project managed by someone else.
My inelegant solution is this:
Is there a tool or better approach? What general recommendations do you have for researchers across a range of willingness to use version control and RStudio projects?