ropensci / auunconf

repository for the Australian rOpenSci unconference 2016!

Optimizing reproducible research with R and related tools #8

Status: Open. Opened by benmarwick 8 years ago

benmarwick commented 8 years ago

One of the great strengths of R is how it enables reproducible research. I'm interested in the use of R packages as research compendia to accompany published articles and reports. I'd love to learn more and see some demos of how people are using R and related tools (such as Docker and make) to simplify the reproducibility of their research, and find out where the pain points are for others.

njtierney commented 8 years ago

I'd love to learn more about using Docker and make to simplify reproducible research.

I can't remember where I read this but someone was saying that each paper or new method should have a shiny app alongside it to demonstrate the use of the tools. What do you think about that?

benmarwick commented 8 years ago

Yes, an accompanying shiny app is an interesting proposal (maybe you saw this paper?). I guess if interaction and modification of plot parameters is a priority, then shiny would be a good option. Have you seen any good real-life examples of this? I guess my attention has been focused on reproducing the plots and numbers in the published article, since that's the most useful thing to me at the moment.

All I've seen are toy examples, such as this box-plot maker (1, 2). Perhaps a more substantial and useful effort might be a shiny app that generates the plots that Weissgerber et al. recommend. These are quite hard to do in Excel, SPSS, etc. I've written R code for generating these kinds of plots.

That suggests a related challenge: a shiny app that does something very useful for non-R users but is very difficult to do well in Excel.

njtierney commented 8 years ago

I can't find the example, but I think it was a blog post or maybe even a tweet saying something like:

"All packages and papers should have an accompanying shiny app with them"

It's great to know that this issue has been written about in journals; that gives a good, strong motivation for working on these projects.

From this I guess I see a few possible options for projects:

  1. Turn your R code reproducing the figures from Weissgerber et al. into a shiny app
  2. A shiny app that does tasks that are hard to do in Excel (box plots, density plots?)

Perhaps we could also work on:

mattwatts commented 8 years ago

I've been using shiny apps to do analysis and produce figures for a paper we're working on and am interested in developing this idea further.

adamhsparks commented 8 years ago

I'm keen to work on this topic (well, one of several). It's rather timely since I'm working with a colleague to bring this idea forward in our field of plant pathology. There are a few of us using R and making our research reproducible, and we'd like to see more people making an effort.

I've not really touched Shiny, but I can certainly see the benefits. Last night I was messing around with Docker to install a Linux instance to test R packages.

ghost commented 8 years ago

Kaggle scripts is one example that demonstrates the potential of Docker. Kaggle runs all of their R scripts in the same Docker image. With their Dockerfile (together with the data and scripts, perhaps shipped separately) one could reproduce any of the results.

Turns out Rocker already has an RStudio container. But I'm guessing different disciplines and use cases would require different setups, packages, and tools. So perhaps one idea is to have a package that writes a Dockerfile?

@benmarwick You mentioned the use of make. Do you have in mind using make to automate the data analysis pipeline, as in here? Do you already know of any (other) examples?

benmarwick commented 8 years ago

Yes, there are some nice examples of make for scientific research workflows by Karl Broman and Carl Boettiger. So far I've not used make myself, preferring to use only knitr for this purpose. That said, I'm quite interested in remake and look forward to trying that.

The dockertest package contains functions for generating Dockerfiles from R packages and other R projects. But I've not had any success with it, and have been writing my Dockerfiles by hand (mine are pretty simple).
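
For readers who have not seen one, a hand-written Dockerfile for a research compendium can be quite short. The following is only a minimal sketch, not a file used by anyone in this thread. It assumes the Rocker project's `rocker/verse` image as the base, and the image tag, the `/compendium` path, and the `analysis/paper.Rmd` file name are all hypothetical placeholders.

```dockerfile
# Assumption: start from a Rocker image that already ships R, devtools,
# rmarkdown and LaTeX. The tag is illustrative; pin it to the R version
# actually used for the analysis.
FROM rocker/verse:4.3.1

# Copy the compendium (structured as an R package) into the image.
# "/compendium" is a hypothetical location.
COPY . /compendium
WORKDIR /compendium

# Install the compendium together with the dependencies declared in its
# DESCRIPTION file.
RUN R -e "devtools::install('.', dependencies = TRUE)"

# Render the manuscript so that a failed build signals a broken analysis.
# "analysis/paper.Rmd" is a hypothetical file name.
RUN R -e "rmarkdown::render('analysis/paper.Rmd')"
```

Under these assumptions, running `docker build -t compendium .` from the root of the compendium would build the image and re-run the analysis in one step.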

jesse-jesse commented 8 years ago

This project got 6 votes at the AuUnconf. People that were interested in continuing discussions around this issue after the Unconf were: Jessie (me), Adam, Peter B, Miles.

mensurationist commented 8 years ago

I'd like to be in that loop, please.

Andrew R.