Closed (ha0ye closed this issue 5 years ago)
Can @juniperlsimonis or @diazrenata clarify the reproducibility properties of this vignette? I can get it mostly working, but it depends in unclear ways on code from @emchristensen and on pre-built data objects.
For the record, my (admittedly limited) view of vignettes is that (1) they demonstrate the utility of the package or document it, and (2) the code contained within them can be run by a user to get the same output.
Right now, the vignette is in a bit of a weird spot: because the analyses take a long time to run and produce large output objects, it loads saved objects and leaves some code chunks unexecuted. This eases the burden of developing the code in the vignette, but doesn't necessarily get to an end product that is fully reproducible for a user.
I've done a bit of cleanup in the paper-comparison-vignette-patch branch, but need some further guidance on how you all envision the vignette being used (and where intermediate objects should be stored - cloning the repo downloads nearly half a gig of files right now).
yeah so i'm not necessarily concerned about a user re-running the code in this vignette. the goal here was primarily to show how the new code differs results-wise from what was in the paper, not to have the code be re-runnable by the average user. perhaps there should be some text to that effect ("this vignette's goal is to show the differences; a typical user should not be concerned with re-running the displayed code") in the document?
I agree that you wouldn't expect the average user to run it, but I think if the goal is to communicate that one can run the code to get qualitatively similar results, then it needs to be reproducible.
A disclaimer about computational time would be good to include, so that it is clear that the target audience is not "try this vignette example to learn about the package" but "here's how you would reproduce this particular set of results and comparisons".
totally with you on that. i think it's just a matter of figuring out how best to pull in the code from the other repository, etc etc. @diazrenata can you handle tackling this with @ha0ye since you were the one deep in it and working on this vignette?
happy to!
the other thing i'm curious about with this vignette is the size of the cached files. is that going to be an issue with CRAN?
edit: nm, hao has this handled in #124
download necessary external files
paper-comparison vignette starts by loading in R data files, but this assumes that the code is being run from within the project folder that has been cloned or downloaded from GitHub. I don't think this will work for folks who are installing from GitHub or from CRAN. These lines can be modified to download the necessary files from e.g. GitHub.
check package dependencies
suggests field of DESCRIPTION
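The suggestion in "download necessary external files" (fetch the pre-built objects rather than load them from a cloned project folder) could look roughly like the sketch below. It is only an illustration: `fetch_cached_rds()` is a hypothetical helper, not part of the package; it assumes the data objects are stored as `.rds` files under a single base URL, and the cache location is an arbitrary choice.

```r
# Hypothetical helper (not part of the package): download a pre-built data
# object once, keep it in a local cache directory, and read it from there.
# Assumption: objects are saved as .rds files under a common base URL.
fetch_cached_rds <- function(file_name, base_url,
                             cache_dir = file.path(tempdir(), "vignette-cache")) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  local_path <- file.path(cache_dir, file_name)

  # Only hit the network when the file is not already cached
  if (!file.exists(local_path)) {
    remote_url <- paste0(base_url, "/", file_name)
    status <- utils::download.file(remote_url, destfile = local_path,
                                   mode = "wb", quiet = TRUE)
    if (status != 0) {
      unlink(local_path)  # do not leave a partial file in the cache
      stop("Download failed for ", remote_url)
    }
  }

  readRDS(local_path)
}
```

In the vignette, a call such as `fetch_cached_rds("some_object.rds", base_url = "<URL of the folder holding the data files>")` would then replace a path-relative load; both the file name and the URL here are placeholders. Because the objects are fetched on demand, they would not need to ship inside the package, which is also relevant to the question above about cached file size.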