Code not properly displayed for R lessons using Jekyll

jdblischak commented 10 years ago

Our current strategy for maintaining the R lessons is to have an R Markdown (Rmd) file with the source code and also the Markdown (md) file generated ("knit") from the Rmd file. The YAML header in the Rmd is recreated in the md file so that Jekyll can properly create an html file.

Here is an example of the result of this process. It has multiple issues that detract from the presentation of the material.

Each code line begins with the letter r, e.g. r x <- 1 + 1
Code blocks (and output) get condensed to one line
No syntax highlighting

These problems are all fixed if the html file is generated with knit2html. Unfortunately this strips the YAML header, so the file is no longer recognized by Jekyll.

Does anyone have any simpler solutions to this problem?

I had come up against this problem before when trying to display my research results using a combination of knitr and Jekyll. My hacky solution was to knit all the files with knit2html and then re-add the YAML header in a Makefile. In this case the line would be something like this:

sed -i 1i'---\nlayout: lesson\nroot: ../..\n---\n\n' *html

This is on the right track. However, when the html file is knit, knitr adds some custom CSS which clashes with some of the SWC defaults (e.g. the page is no longer surrounded by the grey/blue border).

So my current proposed workflow is as follows:

Write lesson in Rmd.
Run a Makefile that knits an html file, adds the YAML to the html file, and deletes the md file.
Commit the Rmd and html file to the bc repo.
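The header-prepending step in this workflow can also be written without the one-line `1i` form of `sed -i`, which is a GNU sed extension as far as I know. The following is only a sketch: the header values are the ones from the sed command in this comment, and `add_front_matter` is a hypothetical helper name.

```shell
# Prepend a Jekyll YAML header to a file using only printf and cat.
# `add_front_matter` is a hypothetical name; the header values
# (layout: lesson, root: ../..) are the ones used in this thread.
add_front_matter() {
    file=$1
    tmp=$(mktemp) || return 1
    {
        printf '%s\n' '---' 'layout: lesson' 'root: ../..' '---' ''
        cat "$file"
    } > "$tmp" && cat "$tmp" > "$file"   # overwrite in place, keep permissions
    rm -f "$tmp"
}

# Demonstration on a throwaway file
demo=$(mktemp)
printf '<html><body>hello</body></html>\n' > "$demo"
add_front_matter "$demo"
head -n 4 "$demo"
```

In a Makefile this would be called once per knitted html file, after knit2html has run.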

What do people think? Does anyone have a simpler solution? Also, can someone with more knowledge of CSS figure out which custom CSS options we need to pass to knit2html to make it fit better with the SWC theme?

gvwilson commented 10 years ago

I think your solution is the cleanest we're going to come up with - and it allows us to control the CSS (via Jekyll template expansion), so we have at least a shot at consistent look and feel.

jdblischak commented 10 years ago

I figured out how to solve the problem with the conflicting CSS. The default CSS added by knitr can be avoided by passing the argument stylesheet = "" to knit2html. So the basic workflow for one file would be the following:

Rscript -e 'library("knitr"); knit2html("01-starting-with-data.Rmd", stylesheet = "")'
rm 01-starting-with-data.md
sed -i 1i'---\nlayout: lesson\nroot: ../..\n---\n\n' 01-starting-with-data.html

This can be generalized for all the lessons in the Makefile. @ramnathv, you had volunteered to create the Makefile for the R lessons? Would you be able to incorporate this code?

One more complication is that the root directory specified in the YAML header can change based on the location of the file in the repo. Thus an improvement to the above code would be to read the YAML in the original R Markdown file and add that to the HTML file (the above code assumes each file is located in novice/r).
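One possible way to do this, sketched in plain shell: pull the front matter out of the Rmd file and put it at the top of the html file. The helper names `front_matter` and `copy_front_matter` are hypothetical, and the sketch assumes the Rmd file starts with a line that is exactly `---` and that the header is closed by another such line.

```shell
# Print the YAML front matter of a file, including both --- delimiters.
# Assumption: the front matter starts on line 1 with a line that is exactly "---".
front_matter() {
    awk '
        NR == 1 && $0 != "---" { exit }   # no front matter: print nothing
        { print }
        NR > 1 && $0 == "---" { exit }    # closing delimiter reached
    ' "$1"
}

# Copy the front matter of an Rmd file ($1) to the top of an html file ($2).
copy_front_matter() {
    tmp=$(mktemp) || return 1
    { front_matter "$1"; printf '\n'; cat "$2"; } > "$tmp" && cat "$tmp" > "$2"
    rm -f "$tmp"
}

# Demonstration with throwaway files
work=$(mktemp -d)
printf '%s\n' '---' 'layout: lesson' 'root: ../..' '---' '' \
    'Some *R Markdown* text.' > "$work/lesson.Rmd"
printf '<p>Some <em>R Markdown</em> text.</p>\n' > "$work/lesson.html"
copy_front_matter "$work/lesson.Rmd" "$work/lesson.html"
cat "$work/lesson.html"
```

Because the header is read from each Rmd file, a lesson stored at a different depth in the repo keeps its own root value.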

cboettig commented 10 years ago

Hey @jdblischak Just saw this mentioned when joining the mozillasprint IRC room. What you're doing here is very cool, but also looks like a rather more complicated workflow than necessary?

Ideally I believe one should just be able to put .Rmd files in the _posts directory of a Jekyll blog, add the appropriate plugin, and have everything just work -- R compiles the Rmd to the appropriate markdown and Jekyll uses the desired markdown parser to generate the html with appropriate CSS. (At the ropensci hackathon in April a few of us worked up a proof of principle of this: https://github.com/ropensci/docs)

Short of that, it seems more logical to me to have the rmarkdown/knitr generate pure markdown output and let jekyll crank on that the way it would for other posts. The sed -i hacking the yaml back in feels suboptimal to me?

jdblischak commented 10 years ago

Thanks for the suggestion, @cboettig. I totally agree that my hack is not the ideal situation. I am going to check out your example now.

cboettig commented 10 years ago

Yeah, the solution in https://github.com/ropensci/docs might be a bad fit if you don't want to re-run all the .Rmd files on build. That provides a strong test for reproducibility, but for a big site that could become a real nuisance...

I think a good intermediate solution would be to render the Rmd to markdown directly with knitr using a makefile, and then let Jekyll take over the role of rendering the code. (Just calling knit means that R doesn't mess with the yaml header in the first place, unlike rmarkdown, as you probably know already).

It looks like part of the challenge is that you are using kramdown as the markdown parser, and that knitr generates markdown code blocks that are not kramdown compatible. You might consider switching to redcarpet as the markdown parser to support github-flavored-markdown, or consider using pandoc as the parser (just another jekyll plugin, as we do in https://github.com/ropensci/docs ), which is probably the preferred flavor anyhow.

The only disadvantage of using pandoc is that it's not one of the built-in flavors, so Github can't compile the site automatically. In docs we work around this by just kicking the job over to travis, which installs pandoc, compiles the site and pushes it back to Github. This way the site is still automatically deployed on commit. One could of course just build locally and then push the _site.

jdblischak commented 10 years ago

Right, so our previous solution was knitting to markdown, which retains the yaml, and then letting jekyll render to html. This does not give good results, as the example I linked to in my original post shows.

We use kramdown and not redcarpet to render because it properly renders our input and output blocks from markdown that was generated from IPython notebooks (#183).

I don't know about using travis+pandoc as the parser. @gvwilson, is this an option? Or will that break too many other things?

The current workflow is to have instructors pull the gh-pages from bc, make changes, and then the site is automatically built when they push to GitHub. Building locally and then pushing _site to gh-pages would be one extra step. We could propose this as an alternative solution and see what others think. I don't know how to do it, but I know there is a way to automate pushing to a different branch. So the new workflow would be to pull from bc into the bootcamp master branch, make edits, and then run a Makefile which both builds the site locally and then pushes the rendered site to the gh-pages branch.

cboettig commented 10 years ago

@jdblischak Ah, thanks for the explanations, this does all seem to come down to markdown flavor conflicts.

Yup, I hear you on building locally being a problem. The travis solution mostly alleviates that, since the .travis.yml file would just be grabbed as part of the git clone. You might still need to tick travis on for each repo though(?) which would be a pain.

I see; never realized redcarpet didn't have an option to permit markdown parsing within a div element. (https://github.com/vmg/redcarpet/issues/13)

If you want to stick with kramdown as the parser, why not just tell knitr to use kramdown-compatible syntax for code blocks?

hook.t <- function(x, options) stringr::str_c("\n\n~~~\n", x, "~~~\n\n")
hook.r <- function(x, options) {
  stringr::str_c("\n\n~~~ ", tolower(options$engine), "\n", x, "\n~~~\n\n")
}
knitr::knit_hooks$set(source=hook.r, output=hook.t, warning=hook.t,
                      error=hook.t, message=hook.t)

(Most markdown parsers recognize that syntax anyhow)

jdblischak commented 10 years ago

Great idea, @cboettig! I hadn't used knitr hooks before. Here are my thoughts:

One thing I don't like about this approach is that it breaks up code blocks. For example, instead of displaying as one contiguous code block like below:

x <- 3
y <- 4
z <- x + y

It displays each line as its own code block, i.e.

x <- 3

y <- 4

z <- x + y

Do you know if there is a way to change this behavior?

For the sake of a consistent style, we can change the hooks so that the code in the R lessons renders exactly as for the Python lessons.

hook.in <- function(x, options) stringr::str_c("\n\n<pre class='in'><code>",
                                               x, "</code></pre>\n\n")
hook.out <- function(x, options) { 
  stringr::str_c("\n\n<div class='out'><pre class='out'><code>", x,
                 "</code></pre></div>\n\n")
}
knitr::knit_hooks$set(source=hook.in, output=hook.out, warning=hook.out,
                      error=hook.out, message=hook.out)

I am having an issue that I didn't think was a problem before. When I have the yaml header in both the Rmd file and the md file, jekyll is automatically rendering the Rmd file instead of the md file. I had to delete the yaml in the Rmd file after knitting in order to build the site locally. Is this happening for others as well? I am thoroughly confused because I thought I had this working before.

ramnathv commented 10 years ago

@jdblischak Try adding the following to _config.yml to exclude Rmd files from being built by Jekyll

exclude: ["*.Rmd"]

You can refer to the documentation here

As for getting contiguous code blocks, try setting the chunk option collapse = TRUE globally. Here is an example that illustrates its use.

jdblischak commented 10 years ago

Thanks, @ramnathv. Properly configuring the _config.yml fixed my problem with the Rmd files being rendered.

However, collapse did not work. Each line of code is still getting wrapped in its own tag by the hook. Continuing my example above, the markdown looks like this:

<pre class='in'><code>x <- 3</code></pre>

<pre class='in'><code>y <- 5</code></pre>

<pre class='in'><code>x + y</code></pre>

<div class='out'><pre class='out'><code>[1] 8
</code></pre></div>

ramnathv commented 10 years ago

That is strange. Try replacing the x in the hook code with paste(x, collapse = "\n") for a quick workaround.

cboettig commented 10 years ago

Sorry, that's probably my bad, note that you may need to collapse inside the hook:

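# Output, warnings, errors and messages: plain ~~~ block, lines joined first
hook.t <- function(x, options) paste0("\n\n~~~\n", paste0(x, collapse="\n"), "~~~\n\n")
# Source code: ~~~ block tagged with the engine name (e.g. "r"), lines joined first
hook.r <- function(x, options) {
  paste0("\n\n~~~ ", tolower(options$engine), "\n", paste0(x, collapse="\n"), "\n~~~\n\n")
}
knitr::knit_hooks$set(source=hook.r, output=hook.t, warning=hook.t,
                      error=hook.t, message=hook.t)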

jdblischak commented 10 years ago

Great, using paste0(x, collapse="\n") worked. Thanks for all the help @cboettig and @ramnathv.

jdblischak commented 10 years ago

Before I attempt another hacky solution, is there a good way to handle the fact that we are trying to knit the files from a different directory? The help file for knit specifically warns against trying this:

If the output argument is a file path, it is strongly recommended to be in the current working directory (e.g. ‘foo.tex’ instead of ‘somewhere/foo.tex’), especially when the output has external dependencies such as figure files.

Ideally I would like a solution where you could manually knit the file by running knit("file.Rmd") from the directory that contains the file (e.g. if you are iteratively testing some new changes), but where the file could also be knit by running the Makefile from the root of the bc repo.

The main problem is that the figures get written to a directory called figure in the directory where the command is called. If I set fig.path = "novice/r/figure", then the files are created in the correct place, but they are not rendered because the file is trying to import novice/r/figure instead of figure. I think one option could be to have the Makefile change directories before knitting the file, but I am having difficulty getting that to work. Any ideas?

wking commented 10 years ago

On Wed, Jul 23, 2014 at 05:12:46PM -0700, John Blischak wrote:

I think one option could be to have the Makefile change directories before knitting the file, but I am having difficulty getting that to work. Any ideas?

What about something like:

%.md: %.Rmd
	cd $$(dirname $<) && \
	knit $$(basename $<) > $$(basename $@)

I'm not sure what the knit invocation should actually look like, but the above should invoke it from the source-local directory.
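The "invoke it from the source-local directory" idea can be tried outside of make. This is a minimal sketch in plain shell: `in_source_dir` is a hypothetical helper, and `mkdir -p figure` merely stands in for whatever knitr would write relative to the working directory.

```shell
# Run a command from the directory that contains a source file, inside a
# subshell so that the caller's working directory is left untouched.
# `in_source_dir` is a hypothetical helper name, not part of knitr or make.
in_source_dir() {
    src=$1
    shift
    ( cd "$(dirname "$src")" && "$@" )
}

# Demonstration with a throwaway tree that mimics novice/r in the repo
root=$(mktemp -d)
mkdir -p "$root/novice/r"
: > "$root/novice/r/01-starting-with-data.Rmd"

# A relative path such as figure/ now resolves next to the Rmd file
in_source_dir "$root/novice/r/01-starting-with-data.Rmd" mkdir -p figure
ls "$root/novice/r"
```

In a make recipe each logical line already runs in its own shell, so the `cd` there does not leak into later recipe lines either.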

cboettig commented 10 years ago

All this is possible just by configuring knitr appropriately. See package options like base.dir and base.url http://yihui.name/knitr/options#package_options

jdblischak commented 10 years ago

Thanks for the ideas, @wking and @cboettig. I really appreciate it.

What about something like:

%.md: %.Rmd
	cd $$(dirname $<) && \
	knit $$(basename $<) > $$(basename $@)

This is very similar to what I was initially trying without success. The problem is that knit is not a command line function. It has to be called from within R. This can be achieved by running Rscript with the -e flag to pass it a string. The problem is that the shell variables are not being properly evaluated inside of the string that is passed to R. For example, the code below does not work:

%.md: %.Rmd
	cd $$(dirname $<) && \
	Rscript -e "knitr::knit($$(basename $<), output =$$(basename $@)"

Do I need to do something special to get the basename calls to work when they are inside parentheses?

All this is possible just by configuring knitr appropriately. See package options like base.dir and base.url

This runs into the same problem as I was having with setting fig.path. I can get the figures to be saved in the correct location using either fig.path or base.dir. However, then the Markdown file tries to search for the image in the wrong place, e.g. it is searching for the image in novice/r/figure when it is already in novice/r. Am I not using base.dir and fig.path correctly? I am having trouble seeing how using either of these settings will allow me to knit from the root directory or from novice/r and get the same result.

cboettig commented 10 years ago

Sorry I don't think I quite got what you wanted to do. Using base.dir and base.url etc should be able to handle knitting from a different location, but you'd need different base.url values when knitting from different locations; i.e. Rscript -e 'knitr::opts_knit$set(base.url = path); knitr::knit(...)'.

Weird, I've had no trouble using make file macros in Rscript before; https://github.com/cboettig/template/blob/master/manuscripts/Makefile But my make knowledge is limited...

wking commented 10 years ago

On Wed, Jul 23, 2014 at 07:33:04PM -0700, John Blischak wrote:

The problem is that the shell variables are not being properly evaluated inside of the string that is passed to R. For example, the code below does not work:

%.md: %.Rmd
	cd $$(dirname $<) && \
	Rscript -e "knitr::knit($$(basename $<), output =$$(basename $@)"

Do I need to do something special to get the basename calls to work when they are inside parentheses?

I think you're missing quotes around the filenames and the closing paren for knit(). How about:

%.md: %.Rmd
	cd $$(dirname $<) && \
	Rscript -e "knitr::knit('$$(basename $<)', output='$$(basename $@)')"

jdblischak commented 10 years ago

I think you're missing quotes around the filenames and the closing paren for knit(). How about:

%.md: %.Rmd
	cd $$(dirname $<) && \
	Rscript -e "knitr::knit('$$(basename $<)', output='$$(basename $@)')"

Good eye, @wking. Those were exactly the problems.
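To see why the quotes matter, it helps to look at the string that actually reaches R. The sketch below builds both versions of the expression in plain shell without calling Rscript; the variable names are only illustrative, and in a Makefile each `$` would be doubled as in the recipes quoted in this thread.

```shell
# Build the R expression that the recipe passes to `Rscript -e`.
# Variable names here are illustrative only.
src=novice/r/01-starting-with-data.Rmd
out=${src%.Rmd}.md

# Broken: the file names reach R as bare symbols and the call is never closed
bad="knitr::knit($(basename "$src"), output =$(basename "$out")"

# Fixed: the shell still expands $(basename ...) inside the double quotes,
# while the single quotes pass through and become R string delimiters
expr="knitr::knit('$(basename "$src")', output='$(basename "$out")')"

echo "$bad"
echo "$expr"
# To actually knit: cd "$(dirname "$src")" && Rscript -e "$expr"
```

A file name that itself contains a single quote would still break the expression, so this relies on the lesson files having plain names.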

jdblischak commented 10 years ago

Thanks everyone for your help. This was fixed with #626.

swcarpentry / DEPRECATED-bc

Code not properly displayed for R lessons using Jekyll #524
