ropensci / unconf17

Website for 2017 rOpenSci Unconf
http://unconf17.ropensci.org

Analysis templates #51

Closed kbroman closed 7 years ago

kbroman commented 7 years ago

A common use of package vignettes is as a template for analysis: a user attempts to plug in their own data and try out the various package features. Some package vignettes work better for this than others; sometimes it's especially onerous to modify a vignette for custom data.

I was wondering whether there might be value in a tool to facilitate this, or at least guidelines on making vignettes that are more easily reused.

And actually, what I had in mind was a template for basic data diagnostics or vanilla data analysis for data of a particular form.

That is, in my own research on complex trait genetics in model organisms, there is a standard set of data diagnostics that I always look at, and then there is a standard set of basic analyses that I always perform. But I find myself doing such things interactively, from scratch, every time. Probably I should have some R Markdown template that I plug data into, to get the basic results, which I can then customize according to the particular situation.

As another example, maybe you want to take some cool analysis of the text of Hamlet, but replace Hamlet with Fences. Or you are writing up a cool analysis of Hamlet, and you want to do it in a way that someone else could easily plug in some other text and get some not-completely-unreasonable results.

So like explainr, but for the full analysis report. Maybe @hilaryparker already has a solution for this but I've just not paid proper attention.

noamross commented 7 years ago

You might be able to do this with an elaborate R Markdown document template, which can be nicely wrapped in a package. Pulling up the template gives you an Rmd ready to go, but easy to customize. You could make it a parameterized report, with some parameters set in the YAML header, and/or have comments in the main body for what the user would modify.
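As a rough illustration of the parameterized-report idea, here is a minimal sketch using the `rmarkdown` package. The file names, the parameter names (`data_file`, `response`), and the toy data set are all hypothetical, chosen only for this example; they are not taken from any existing package.

```r
# Sketch only: file names, parameter names, and data are hypothetical.
library(rmarkdown)

fence <- strrep("`", 3)  # code-chunk delimiter used inside the Rmd

# A parameterized report: defaults live under `params:` in the YAML header,
# and the body refers to the data only through `params$...`.
report <- c(
  "---",
  "title: \"Basic data diagnostics\"",
  "output: html_document",
  "params:",
  "  data_file: \"example.csv\"",
  "  response: \"value\"",
  "---",
  "",
  paste0(fence, "{r}"),
  "dat <- read.csv(params$data_file)",
  "summary(dat[[params$response]])",
  "hist(dat[[params$response]], main = params$response)",
  fence
)
rmd <- file.path(tempdir(), "diagnostics.Rmd")
writeLines(report, rmd)

# A small data set standing in for a user's own data.
csv <- file.path(tempdir(), "mydata.csv")
write.csv(data.frame(weight = c(21.3, 19.8, 24.1, 22.7)), csv, row.names = FALSE)

# The declared parameters and their defaults can be read back from the header.
defaults <- yaml_front_matter(rmd)$params

# Rendering with `params =` overrides the defaults without editing the Rmd.
# Rendering needs pandoc, so it is skipped when pandoc is not installed.
if (pandoc_available()) {
  out <- render(rmd,
                params = list(data_file = csv, response = "weight"),
                quiet = TRUE)
}
```

To ship something like this in a package, the same Rmd could be placed as a document template under `inst/rmarkdown/templates/`, so that users can pull up an editable copy with `rmarkdown::draft()`.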

kbroman commented 7 years ago

@noamross Nice! Both great possibilities. Maybe a parameterized report as the primary source, plus a function to convert it into a document template for others to use.

jasdumas commented 7 years ago

I really like the idea of leveraging a parameterized report as a template. I wonder if a similar method with rtutor could be included in a vignette within a package to help reinforce analysis methods?

daroczig commented 7 years ago

I have worked on such a templating engine using markdown and a custom YAML header, and although it's still in production in a bunch of web applications, it could use some love, as it has not received many updates in the past couple of years. Anyway, if it might be interesting for this, I'd love to get back to that project and refresh it to today's standards: http://rapport-package.info

stefaniebutland commented 7 years ago

@kbroman Coming from a similar-ish research background, I think this would be really valuable. Lots of people are probably craving this kind of thing for reproducibility but don't know it's within reach.

elinw commented 7 years ago

I use actual templates rather than vignettes to show work processes in my teaching package. I even have a template called "blank." I know that the content in the default new document is supposed to be helpful, but in my experience teaching, it doesn't really work well for beginners. You can use it once to show them how knit works, but then having a workflow that says "delete all that stuff" is just confusing. I'd much prefer to have blank and samples of different kinds of analysis and reporting formats.
Actually, I do use vignettes, but not to show how to do analyses; I just use them to explain things in a one-way manner.

batpigandme commented 7 years ago

@elinw That's a good point re: confusion around "delete all the stuff". If left un-deleted (assuming that doesn't cause any fatal errors), it can easily lead to "I don't know what this thing does, but I'm scared to delete it."

kbroman commented 7 years ago

Parameterized reports, mentioned by @noamross, basically do what I had in mind, so this seems like a low-priority item for the unconference, and I'm going to close it.