drivendataorg / cookiecutter-data-science

A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
https://cookiecutter-data-science.drivendata.org/
MIT License
8.19k stars 2.44k forks

How would this structure change for R? #33

Closed dreyco676 closed 8 years ago

dreyco676 commented 8 years ago

I'm working on creating a similar standard for R at my company and was hoping to get some thoughts on whether anything warrants changing to be R-specific.

pjbull commented 8 years ago

Good question! To me, the only obviously Python-specific parts are the .py boilerplate files and the commands in the Makefile. Outside of that, things should be pretty sensible for an R project. That said, a few R pros wrote resources that we link to here.

Your R workflow may dictate additional changes. How do you generally distribute code? Is it in R packages? If so, you may want some package boilerplate. Do you use Rmd and knitr? You may want to keep source in notebooks and output in reports.

Happy to hear other thoughts as well insofar as they generalize to the data science process across tools.
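To make the "which parts are Python-specific" point concrete, here is a rough sketch that lists the files an R team would probably want to swap out in an already generated project. It assumes that the only Python-specific pieces are the .py files and the top-level Makefile; the function name and the `my-project` directory are hypothetical, not part of the template.

```python
from pathlib import Path


def find_python_specific_files(project_root):
    """Return generated files an R team would likely replace or rewrite.

    Assumption: the Python-specific pieces are the ``.py`` boilerplate
    files plus the top-level ``Makefile``; everything else is tool-neutral.
    """
    root = Path(project_root)
    # Every .py boilerplate file, as a path relative to the project root.
    candidates = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.py")
    )
    # The Makefile commands call Python tooling, so flag it as well.
    if (root / "Makefile").is_file():
        candidates.append("Makefile")
    return candidates


if __name__ == "__main__":
    # "my-project" is a hypothetical name for an already generated project.
    for name in find_python_specific_files("my-project"):
        print(name)
```

Anything this prints is a candidate for an R equivalent (an .R script, or Makefile targets that call Rscript); the directory layout itself can stay as it is.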

dreyco676 commented 8 years ago

We have some groups that build Shiny apps for their analytics. I'll need to talk with them to see what that entails and whether it would change the layout.

I know we'd like something super easy to execute like this for R, without the dependency on Python. Is there anything similar to cookiecutter for R that we could use to build a pure R clone?

pjbull commented 8 years ago

The R ProjectTemplate project has put the most thought into an R-only version: http://projecttemplate.net/index.html

Don't know of any cookiecutter clones for R...

isms commented 8 years ago

@dreyco676 just found this - https://github.com/jacobcvt12/cookiecutter-R-package - could be a good option.

dreyco676 commented 8 years ago

@isms that's perfect!

pjbull commented 8 years ago

Since I don't think that we'll be migrating to a cookiecutter tool w/o the Python dependency (not even sure what the options are), and the structure works fine for R projects after it is generated, I'm going to close this issue for now.

Added #49, which might be nice for R users.