weecology / LDATS

Latent Dirichlet Allocation coupled with Bayesian Time Series analyses
https://weecology.github.io/LDATS

paper-comparison vignette comments #120

Closed: ha0ye closed this issue 5 years ago

ha0ye commented 5 years ago

Can @juniperlsimonis or @diazrenata clarify the reproducibility properties of this vignette? I can get it mostly working, but there's a lot of unclear dependency on code from @emchristensen and pre-built data objects.

For the record, my (admittedly limited) view of vignettes is that (1) they demonstrate the utility of the package or serve as documentation for it, and (2) the code contained within them can be run by a user to get the same output.

Right now, the vignette is in a bit of a weird spot: because of how long the analyses take to run and how large the output objects are, it loads saved objects and leaves some code chunks unexecuted. This eases the burden of developing the code in the vignette, but doesn't necessarily get to an end product that is fully reproducible for a user.

I've done a bit of cleanup in the paper-comparison-vignette-patch branch, but need some further guidance on how you all envision the vignette being used (and where intermediate objects should be stored - cloning the repo downloads nearly half a gig of files right now).

juniperlsimonis commented 5 years ago

yeah so i'm not necessarily concerned about a user re-running the code in this vignette. the goal here was primarily to show how the new code differs results-wise from what was in the paper, not to have the code be re-runnable by the average user. perhaps there should be some text to that effect ("this vignette's goal is to show the differences; a typical user should not be concerned with re-running the displayed code") in the document?

ha0ye commented 5 years ago

I agree that you wouldn't expect the average user to run it, but I think if the goal is to communicate that one can run the code to get qualitatively similar results, then it needs to be reproducible.

A disclaimer about computational time is good to include, so that it is clear that the target audience is not "try this vignette example to learn about the package" but "here's how you would reproduce this particular set of results and comparisons".

juniperlsimonis commented 5 years ago

totally with you on that. i think it's just a matter of figuring out how best to pull in the code from the other repository, etc etc. @diazrenata can you handle tackling this with @ha0ye since you were the one deep in it and working on this vignette?

diazrenata commented 5 years ago

happy to!

juniperlsimonis commented 5 years ago

the other thing i'm curious about with this vignette is the size of the cached files. is that going to be an issue with CRAN?

edit: nm, hao has this handled in #124