Open jennybc opened 8 years ago
Thank you @jennybc. This is fantastic feedback. I'm going to implement it. Where do you find these fantastic pics? I love the basmati rice picture. I'm going to "steal" it (casually, while citing fully) :)
Jenny, re: the scripting component. That is definitely a very good point to make. However, it isn't really a part of any of the facets.
Would you just put it up front? Or perhaps this is the low-level component of automation? So maybe I make automation #2?
Thoughts? Leah
Basmati rice photo was from a tweet actually. I have a bunch of credits and links associated with that set of slides:
https://github.com/jennybc/happy-git-and-github-for-the-user
The tweet:
Re: scripting. I'd put it up front. You could work it in w/ rewording this (i.e. there needs to be code, not just mouse clicks):
How to Make Work Reproducible
For research to be reproducible, the research products (data, code) need to be publicly available in a form that people can find and understand them.
As for the basmati rice (how can we talk about this so much 😁?), I don't interpret it as "random". I assume the container did indeed hold rice, when the person was all excited about their new label maker. But the novelty of labelling everything in the kitchen wore off and someone needed to store cookies and was too lazy to change the label. Which is how elaborate comments and READMEs eventually grow out of sync with the stuff they're supposed to document.
Re:
https://github.com/NEON-WorkWithData/slide-shows/blob/gh-pages/intro-reprod-science.md
Mainly: I think they look great! So take or leave this feedback as you see fit.
You never explicitly say that scripting (coding) is almost an absolute requirement for reproducibility, vs. point-and-click. I guess point-and-click workflows can be reproducible. Sort of? But script everything you can. This is already implied but I'd say it loud and clear.
I am not a big fan of explicit documentation. I think if you can make the thing "explain itself", it is better to not write documentation. This holds for R objects, functions, function arguments, datasets, scripts, directory names, repo names, all of it. Because this lowers the risk of the documentation becoming wrong. The "basmati rice versus cookies" problem: https://speakerdeck.com/jennybc/happy-git-and-github-for-the-user?slide=34. Of course, some documentation must be written! But I think novices underestimate how quickly their enthusiasm for this will evaporate. Be a minimalist and get really good at naming things.
Therefore I would swap the order of topics 1 and 2 to reflect that documentation is plan B, i.e. for things that can't be self-explanatory, through awesome organization and naming. Documentation is also for things that are so important they bear repeating and justify the extra vigilance to keep them up-to-date.
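To make the naming point concrete, here is a minimal sketch in Python (the examples in this thread are about R, and every name and threshold below is hypothetical, made up for illustration). The first function relies on a comment that has drifted out of sync with the code, i.e. the "basmati rice" label on the cookie jar. The second puts the same information into the names, so there is no separate label to go stale.

```python
# All names and thresholds here are hypothetical, for illustration only.

# Version 1: the comment is the "basmati rice" label.
# It once described the code, but the threshold changed and the comment did not.
def f(x):
    # keep values above 10
    return [v for v in x if v > 25]


# Version 2: the names carry the meaning, so nothing needs a separate label.
MIN_PLAUSIBLE_HEIGHT_CM = 25


def keep_plausible_heights(heights_cm, min_height_cm=MIN_PLAUSIBLE_HEIGHT_CM):
    # Return only the heights strictly above the minimum.
    return [h for h in heights_cm if h > min_height_cm]
```

Both versions behave identically with the default threshold; only the second tells the reader what the number means without relying on a comment someone has to remember to update.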