Booksprint to write rOpenSci documentation

mfenner commented 10 years ago

I would be interested in working on some documentation in book form - ideally using markdown/knitr/jekyll/Github pages. Some examples include:

I am involved in the Opening Science book project and will take part in a booksprint in early March.

karthik commented 10 years ago

I'd be up for this. You might have noticed that we're also writing one on Open Science with R and have yet to settle on a workflow.

Can you tell me more about your booksprint?

mfenner commented 10 years ago

The booksprint is in German around "Open Science" and we will use a variation of Mediawiki. I'm taking part as an author, and I'm not so much involved in the technology side.

I think for your hackathon an "rOpenSci cookbook" might work. Hopefully not too much overlap and maybe easier to get something done in a day or so. We can maybe use the toolchain that Hadley Wickham is using for his Advanced R Programming book - although I recommend creating chapters as "posts" and not "pages" when using Jekyll, since that makes it easier to use things such as tags.

sckott commented 10 years ago

It definitely couldn't hurt to have more documentation. Thoughts @ropensci/owners ?

karthik commented 10 years ago

I agree @sckott It would be good to have some great documentation that we could also roll into the book at some point.

cboettig commented 10 years ago

:+1:

emhart commented 10 years ago

:clap:

mfenner commented 10 years ago

OK, then I can work on this during the hackathon; hopefully a few others are also interested. This may be the best way I can contribute, as my R skills are not on par with you guys.

karthik commented 10 years ago

Thanks @mfenner We'll be on hand to help and @Dtrap has also expressed interest in working on this.

dtrapezoid commented 10 years ago

Assuredly!

mfenner commented 10 years ago

Are you all OK with knitr/pandoc/jekyll/GH pages as the toolset? If yes, I can set this up in advance (repo, basic templates, plugins, etc.) so that we don't lose any time during the hackathon. Something that I haven't seen yet is a Github hook that triggers pandoc to convert the repo; I currently do this on my local machine.

And where should I start a list of questions/problems to cover (and who wants to work on which question/problem)? Again, it would be good to start this before the hackathon so that we can jump right in.

sckott commented 10 years ago

@cboettig @karthik any thoughts on this, seems you two have thought more about these workflows than I have

karthik commented 10 years ago

> Are you all OK with knitr/pandoc/jekyll/GH pages as the toolset?

Yes, absolutely. It makes sense to set up ahead of time too.

cboettig commented 10 years ago

Yup, this toolchain sounds good to me

mfenner commented 10 years ago

Where should I set up the initial repo (which I want to do before the hackathon so we don't lose time)? Should this be at rOpenSci, or could it be anywhere and we worry about this later?

sckott commented 10 years ago

Doesn't matter for now - We can always transfer it later

karthik commented 10 years ago

I just created a repo and will add you to that team.

mfenner commented 10 years ago

Thanks Karthik. I have added a first version of the book to the repo. You can see the book with three sample chapters here. The basic building blocks are:

Pandoc
Jekyll
Bootstrap 3
Github Pages
Travis CI

Chapters are in the _posts folder as .Rmd files. There is otherwise no integration with knitr yet. Travis CI is used to automatically build the Github Pages site when the master branch of the repo is updated - no need to run Pandoc locally (more info here). The layout, CC license, etc. need your approval, but that can easily be changed in the coming weeks.

This repo should be all we need to get started quickly at the hackathon. I will look at knitr integration and ePub output in the coming weeks.

karthik commented 10 years ago

Thanks Martin! Will check with @sckott and others on how to proceed from here and get back to you.

sckott commented 10 years ago

Looks good @mfenner , I'll take a look at it in the morning.

sckott commented 10 years ago

@mfenner

I wonder if executing code on the travis builds is easy enough to do, or if that is better done locally.
Why are posts as .Rmd files, and not .md files if no code is being executed?
Perhaps we could include code blocks that the reader can play with via opencpu API, though maybe better to keep it simple?

ramnathv commented 10 years ago

@sckott The dynamic examples can be designed independent of the travis build. I will try to put together an example based on one of the rOpenSci packages and we can then see what is the best way to integrate it as a part of this.

cboettig commented 10 years ago

Right, it seems to me the ideal workflow would have the author write in .Rmd and upon each build, the .Rmd files are moved to a cache (could be rendered since they are still in valid Markdown format, but probably makes more sense to ignore them) and the resulting .md version created by knitr would be left as the post. This would avoid knitting every post every time, but still automate the knitting as part of the publishing process. Perhaps there are more clever alternatives to this. Seems like such steps could be managed by the Rakefile, though I don't know my way around Rakefiles as easily as Makefiles...
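
A minimal sketch of the make-style check described above, written in Python purely for illustration (the thread itself leans toward a Rakefile). The function name, the directory names in the usage lines, and the `force` flag for an occasional full rebuild are all assumptions, not part of any existing setup:

```python
from pathlib import Path


def stale_sources(src_dir, out_dir, force=False):
    """Return the .Rmd files whose knitted .md output is missing or older.

    src_dir holds the authored .Rmd files; out_dir holds the .md files
    that knitr produced from them (both locations are hypothetical).
    """
    stale = []
    for rmd in sorted(Path(src_dir).glob("*.Rmd")):
        md = Path(out_dir) / (rmd.stem + ".md")
        # Re-knit when forced, when no output exists yet, or when the
        # source was modified after the output was last written.
        if force or not md.exists() or md.stat().st_mtime < rmd.stat().st_mtime:
            stale.append(rmd)
    return stale


if __name__ == "__main__":
    # Hypothetical layout: sources cached in _rmd, rendered posts in _posts.
    for rmd in stale_sources("_rmd", "_posts"):
        print("needs knitting:", rmd)
```

Only the files returned here would be passed to knitr on a given build; calling it with `force=True` returns every source file.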

sckott commented 10 years ago

@ramnathv sounds good

karthik commented 10 years ago

@cboettig Sure, that sounds ideal. So very much like a Makefile. But it would also be nice to have some version of a rebuild all to be run periodically so we can make sure all the documentation still works correctly with all of the packages.

@sckott We will have code though, so we need them as Rmd and not md.

> I wonder if executing code on the travis builds is easy enough to do, or if that is better done locally.

I don't think we should do this locally. If this is to be automated, and we want to avoid inconsistent versions of tutorials/documentation, then we should definitely do this from Travis. We can explore other types of webhooks too (ones we write ourselves on an AWS box).

@ramnathv

> The dynamic examples can be designed independent of the travis build.

Thanks for offering an example. What framework are you thinking? OpenCPU? If so we might not be able to go the gh pages route and instead deploy the whole thing to our server or an aws/heroku box.

karthik commented 10 years ago

Also, @ramnathv do you have that link handy for how you implemented dynamic examples inside slidify? I've never seen those work outside of trivial examples (examples with mtcars or iris).

ramnathv commented 10 years ago

The examples in slidify are outdated due to updates in OpenCPU. What I had in mind for OpenCPU is much simpler as seen at http://rcharts.io/playground and http://rcharts.rtfd.org (navigate to the nvd3 page http://rcharts.readthedocs.org/en/latest/nvd3/create.html). With OpenCPU you cannot access data sets from the file system, which is why all examples you have seen make use of built-in data sets like mtcars or iris.

ramnathv commented 10 years ago

I do plan to resurrect the OpenCPU widget for slidify, which will make it easy to dynamically convert an Rmd document into an interactive deck with executable examples. However, for now the same effect can be achieved by simply adding the playground html as an iframe. The nice thing about the playground approach is that it is easy to use different backends including gists to allow users to contribute arbitrary examples. Later this week, I will be posting an update to http://rcharts.io/viewer that will add an Edit Me button to every reproducible visualization, which when clicked will take the user to an editable version of the code. I am thinking something along those lines for rOpenSci as well.

UPDATE: This is what I was talking about, regarding editable examples http://rcharts.io/viewer/?9474140. See the bright green edit me button on the top and again at the bottom.

mfenner commented 10 years ago

@sckott I have configured Jekyll to understand .Rmd as markdown, but you can of course also use .md files. I was thinking that it would be nice to be able to open up the files directly in R/RStudio.

@hadley has a nice knitr integration in his adv-r book: https://github.com/hadley/adv-r

I would think that we can use Travis for everything needed here, and I can write a Rake task to automate this. I prefer Rake because jekyll is Ruby and it can be easily integrated - I find it easier than a separate shell script.

hadley commented 10 years ago

My advice is not to worry about caching the md output - re-knitting them every time doesn't take that much time and avoids a wide class of caching related bugs. You might want to try my jekyll Rmarkdown plugin (https://github.com/hadley/adv-r/blob/master/_plugins/rmarkdown.rb) which automatically runs knitr to get .md, then pandoc to get .html.
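
For illustration only, the two steps (knitr to get .md, then pandoc to get .html) can be sketched as follows. This is written in Python and is not how the linked Ruby plugin is implemented; the function names and the file path in the usage note are assumptions. It relies only on `knitr::knit(input, output = ...)` and pandoc's `-o` option:

```python
import subprocess
from pathlib import Path


def build_commands(rmd_path):
    """Return the knitr and pandoc command lines for one .Rmd chapter.

    Assumes the path contains no single quotes, since it is embedded
    in a quoted R string.
    """
    rmd = Path(rmd_path)
    md = rmd.with_suffix(".md")
    html = rmd.with_suffix(".html")
    # Step 1: knitr executes the R chunks and writes plain Markdown.
    knit = [
        "Rscript",
        "-e",
        f"knitr::knit('{rmd.as_posix()}', output = '{md.as_posix()}')",
    ]
    # Step 2: pandoc converts the knitted Markdown to HTML.
    pandoc = ["pandoc", md.as_posix(), "-o", html.as_posix()]
    return [knit, pandoc]


def build(rmd_path):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(rmd_path):
        subprocess.run(cmd, check=True)
```

Calling `build("_posts/intro.Rmd")` (a hypothetical chapter) would re-knit that file on every run, which matches the no-caching advice above; it needs both `Rscript` with knitr installed and `pandoc` on the PATH.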

mfenner commented 10 years ago

Thanks @hadley, I will add your Rmarkdown plugin, and I need to add R to the Travis VM.

hadley commented 10 years ago

@mfenner you might also want to copy what I've done to get a binary pandoc - we're building binary for inclusion into rstudio, and it's way faster than building from source (since you have to build haskell + a lot of prereqs)

mfenner commented 10 years ago

Yes I saw that. Haskell is prebuilt, but installing Pandoc from source takes too long. Is it OK to include the Rstudio Pandoc binary in a solution that is potentially used by a number of other people as well? I already have four Github pages sites using this Travis workflow. Or should I rather find a PPA that has a more recent Pandoc?

hadley commented 10 years ago

I think it's fine. We're committed to maintaining it for the long-term, and that url won't go away.

In the long-run, we need language: R for travis that comes with more of this stuff pre-built. I know @craigcitro started this process, but I can't find the relevant travis issue.

craigcitro commented 10 years ago

indeed, i'm still planning on adding language: R support to travis directly, but it's moving slowly (read: Craig has no spare time right now). until that happens, the r-travis code isn't going anywhere.

the travis ticket to watch: https://github.com/travis-ci/travis-ci/issues/1549

mfenner commented 10 years ago

@craigcitro thanks for the pointer. I'm not that good at R, but I'm a Ruby developer and have a lot of experience writing Chef cookbooks. So let me know if there is something I can help with.

craigcitro commented 10 years ago

@mfenner that sounds awesome -- i'm hoping that i'll have some time to start looking at this before too long, and i'll totally come a-knockin' for some chef help. (one of my officemates is another ruby/chef expert, so between the two of you, i should be able to fake it. ;) )

mfenner commented 10 years ago

@craigcitro please send me a message when you need some Chef help. Just like Travis, I use Chef together with Vagrant, which is a great combination.

@hadley Thanks. I have switched to using your binary Pandoc, which makes Travis run faster by about 8 minutes, and the build time is down to 2-4 minutes. The slow part is now installing the required Ruby gems. The best way to speed this up (short of caching, which Travis only supports for private repos) might be to use bundler and vendor the gems (i.e. include the binaries in the repo). This is also the best practice for deploying Ruby applications to production.

ropensci / unconf14

Booksprint to write rOpenSci documentation #11