Closed by iamciera 10 years ago
@iamciera Nice start on the workflows!
Couple of thoughts on workflows:
I uploaded an example workflow for R + LaTeX here, with a graphical outline, README, and R and LaTeX files.
I just did the outline quickly with Inkscape - so I won't win any design award ;) That's my general workflow that I find useful.
For format and coding, we could also use the TikZ example here
Here's one of my favourite workflow diagrams:
context: http://kieranhealy.org/blog/archives/2014/01/23/plain-text/
@benmarwick Nice, thanks for this!
Looks very interesting and promising!
Great! I will add everything and think about other ways to present them.
Excellent schematics, well done! One question, should the file type for article in the R box of the Healy workflow be rmd rather than CSV/text? And what do the coloured dots signify?
I was going to bring up the same thing as Ben re: csv/txt vs .Rmd.
Also instead of calling out a single authoring tool (Mu/Mou) perhaps we could be more agnostic and outline a few of the tools (Mu/Mou, RStudio, Text Editors, Authorea etc.) in the Intro to Tools section?
Cheers, Jeff
Another possible workflow could simply be RStudio. I have heard (haven't played with it yet myself) that the latest preview version of RStudio includes export of .Rmd to .pdf, .docx, etc. I believe they have rolled Pandoc in. Might not make a compelling figure, but could probably become a common workflow for many.
Maybe we could/should add some discussion of advantages/disadvantages of the workflows? E.g. Having Code and Text in one place may be good for short texts or presentations, but for longer research papers I prefer to have Code and Text separated (as I usually first develop (large) code and then write the paper).
+1 on @EDiLD suggestion.
For each workflow we could have the illustration, short description, advantages/disadvantages, and examples to be downloaded and run.
I agree. I was thinking the same thing. The workflows need a bit of description with them. The colored dots were a way to try to incorporate folder structure, which is keyed in EDiLD's example. Maybe each workflow should have its own page? There are a few more workflows people are giving me, but I think I will work on them later. I am going to deal with the page structure so I can add child pages to the workflow section.
+1 on the separate pages
@jhollist I can confirm the latest version of RStudio works like that, effortless rmd to pdf/html/docx mostly thanks to the new version of the rmarkdown package
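For anyone who wants to try the same conversion outside the RStudio button, here is a minimal sketch from the R console. It assumes the rmarkdown package and pandoc are installed; the file name `paper.Rmd` and its contents are made up for illustration, and the render step is skipped if either dependency is missing.

```r
# Build a tiny, hypothetical .Rmd file in a temporary directory.
fence <- strrep("`", 3)  # three backticks, built here to keep this listing simple
rmd_file <- file.path(tempdir(), "paper.Rmd")
writeLines(c(
  "---",
  "title: \"Example paper\"",
  "---",
  "",
  "Some text, followed by a code chunk.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_file)

# Render to HTML only if rmarkdown and pandoc are both available.
out_file <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_file,
                                output_format = "html_document",
                                quiet = TRUE)
}
```

Swapping `"html_document"` for `"word_document"` or `"pdf_document"` should give .docx or .pdf output; the PDF route also needs a LaTeX installation.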
Also, here's another workflow/file structure diagram from Christopher Gandrud's book Reproducible Research with R and RStudio (very cool that the entire book can be reproduced!):
@EDiLD your suggestion about externalised code is very useful and we should point to some ways that R enables that (sourcing an R script with source, read_chunk, having child files and so on)
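A rough sketch of the simplest of these options, `source()`, in case it helps the discussion. The script name `analysis.R` and the function `summarise_speed` are hypothetical; the script is written to a temporary file here only so the example is self-contained.

```r
# Hypothetical analysis script, kept separate from the manuscript text.
script_file <- file.path(tempdir(), "analysis.R")
writeLines(c(
  "summarise_speed <- function(df) {",
  "  c(mean = mean(df$speed), sd = sd(df$speed))",
  "}"
), script_file)

# In the .Rmd, a single chunk would pull the external code in:
source(script_file)

# The manuscript then only calls the function it needs.
speed_summary <- summarise_speed(cars)
print(round(speed_summary, 2))
```

The paper stays short and readable, and the analysis code can be developed and changed without touching the text.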
@benmarwick , I can also confirm. I grabbed the preview version this morning just to try it out. Pretty cool and might be enough to get a few more point-and-clickers on board...
And I'd add to the externalised code list, creating a separate R package for the code. Similar to source of course, but forces a bit more structure on it. I have two papers I am working on now where we are trying this. All analysis in an R package (including figures) and simply loading the package in a code chunk and then calling the appropriate function in the flow of the manuscript.
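As a starting point for the package route, base R can generate the skeleton. This is only a sketch under assumptions: the package name `mypaper` and the function `make_summary` are placeholders, and a real project would still need the generated DESCRIPTION and help files filled in.

```r
# Hypothetical analysis function that the manuscript will call.
make_summary <- function(x) {
  c(mean = mean(x), sd = stats::sd(x))
}

# Create a package skeleton in a temporary directory (base R, utils package).
pkg_parent <- tempfile("compendium")
dir.create(pkg_parent)
utils::package.skeleton(
  name = "mypaper",             # placeholder package name
  list = "make_summary",        # objects to include in the package
  environment = environment(),  # where those objects are looked up
  path = pkg_parent
)

# Inspect what was generated (DESCRIPTION, NAMESPACE, R/, man/, ...).
pkg_dir <- file.path(pkg_parent, "mypaper")
print(list.files(pkg_dir))
```

Once the package is installed, a chunk in the manuscript would load it with `library(mypaper)` and call `make_summary()` at the point where the result is needed.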
+1 for research-project-as-R-package, an excellent idea that encourages structured documentation of the code, and tests can help ensure the code actually works. Gentleman and Temple Lang make a great case for this here, calling it a compendium.
Christopher Gandrud's workflow is badass!
For a detailed look at folder structure that follows any of the workflows we use, we can also provide Github repo links that use that structure.
I will make another workflow that is super simple using RStudio; in the description I can specify that this is the easiest. The only reason I see that people wouldn't want to use this is if they want specific styling for the PDF/HTML output. Any other reason you guys can think of?
Good plan, I agree about RStudio being the simplest, and probably the most suitable for those getting started with all this (is that our target audience?). Folks passionately committed to other environments (emacs, etc.) would likely have strong views otherwise.
@iamciera Do you have the preview version of RStudio (released yesterday!)? That has all the fun pandoc integration.
And, just thinking out loud here, but I would bet that you can also control the styling of the output for the .docx and .pdf. Since RStudio is using pandoc, they likely have template .docx or .sty files to control the styling.
One downside of RStudio is automation, but if that is a concern for a user, they probably would be using a different workflow to begin with!
In any event, I agree that it is the easiest option and most certainly a good one to include.
No I did not! I will update my RStudio now and mess around with it a bit. There are RStudio people here who will def know if this is possible (the styling), I will ask them at lunch.
@jhollist, you are right though, we should strive to have the simplest workflows because the people who are interested in automation are most likely making their own workflows anyway. That is something we should keep in mind throughout all of this. The people who we really want to reach with this guide is people new to thinking in this way, so the simpler the better, for everything.
@iamciera +100 on "The people who we really want to reach with this guide is people new to thinking in this way"
Oh, and let me know what you find out about the styling. I dug around a bit, but didn't see anything obvious.
Yes, sourcing R scripts would make the .Rmd more readable, and maintenance of the code should also be easy, without touching the paper. Why didn't I think of this?
It would be nice to have the workflows easily generated so in the future people can add their own, ideally with all of them matching in format and color coding. I can generate them fast in Illustrator, but this would not help long term.
Another task will be collecting the workflows.