soztag / fossos

seminar series on data science, reproducible science and open source by @maxheld83
https://datascience.phil.fau.de/fossos/
Creative Commons Attribution Share Alike 4.0 International

factor out pandoc calls for non-r projects #158

Closed by maxheld83 5 years ago

maxheld83 commented 5 years ago

This workflow is gaining some traction in academia and beyond, using the tools we're using in this class:

This is readily available in all the rmarkdown formats, and elegantly wrapped by several R Markdown-related R packages. But in projects that don't use R, it's always a little burdensome and awkward to carry around R dependencies. (If you do use R in your project, you definitely want R Markdown for the knitr support.)

For non-R technical/academic writing, Pandoc already fits the bill and does everything you need. The problem is just that it's pretty hard to get there. You need to futz around with Pandoc templates and pretty long Pandoc calls to get the same results that, say, rmarkdown::render() gets you out of the box. It might be worthwhile to

  1. document how to do this well (without R)
  2. even factor this out into its own small repository (open source project!) with a set of bash scripts that call Pandoc accordingly, with appropriate templates etc.
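
As a rough sketch of what such a wrapper could look like (written in Python here instead of bash, only so the argument handling is easy to read), the function below assembles a Pandoc call without running it. The function name `pandoc_command`, the file names, and the choice of defaults are made up for illustration and are not what R Markdown actually passes. The flags themselves (`--standalone`, `--number-sections`, `--template`, `--citeproc`, `--bibliography`) are regular Pandoc options; `--citeproc` assumes Pandoc 2.11 or newer.

```python
import shlex
from typing import List, Optional


def pandoc_command(
    source: str,
    output: str,
    bibliography: Optional[str] = None,
    template: Optional[str] = None,
    number_sections: bool = True,
) -> List[str]:
    """Build (but do not run) an argument list for a standalone Pandoc render.

    Hypothetical helper: the defaults here are illustrative assumptions,
    not the defaults R Markdown uses.
    """
    # Pandoc infers the output format from the extension of `output`.
    cmd = ["pandoc", source, "--standalone", "--output", output]
    if number_sections:
        cmd.append("--number-sections")
    if template is not None:
        # Custom Pandoc template, e.g. a LaTeX or HTML template file.
        cmd += ["--template", template]
    if bibliography is not None:
        # Assumes Pandoc >= 2.11; older versions used the pandoc-citeproc filter.
        cmd += ["--citeproc", "--bibliography", bibliography]
    return cmd


if __name__ == "__main__":
    # File names are placeholders.
    cmd = pandoc_command("paper.md", "paper.pdf", bibliography="refs.bib")
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
    # To actually render, one could call: subprocess.run(cmd, check=True)
```

Keeping the command construction separate from its execution makes the defaults easy to inspect and test, and the same argument list could just as well be emitted by a bash script.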

References to get started on this are essentially the Pandoc documentation (a lot of it) and the internals of R Markdown, because you can probably get most of the secret sauce from there; it's just calling Pandoc under the hood.

If we wanted to be really elegant about this, we might even think about a way to factor this out of R Markdown in collaboration with the R Markdown project: the (awesome) R Markdown "pandoc defaults" would move to a separate bash script package, which R Markdown then calls for its defaults. Non-R users could just use the bash scripts directly.

So, this can basically go anywhere from graded-3 to graded-30 :).

maxheld83 commented 5 years ago

Obviously, we'd need to check again to make sure this doesn't already exist in some form. Don't reinvent the wheel!

Also, there is a chance that JUST plain pandoc is the answer. Sometimes things don't need another wrapping layer.

samshaffer97 commented 5 years ago

There already is a bash script that converts .md files to LaTeX PDF documents: https://github.com/FHefner/pandoc-docs. It might be interesting to look at this and work on a bash script that converts to other file types as well, if that doesn't exist already.

samshaffer97 commented 5 years ago

When you google "scientific writing pandoc", there are several websites with pretty good documentation and tips on how to use Pandoc and LaTeX without too much hassle and without having to use any R.

maxheld83 commented 5 years ago

Great, I think we can close this then, right? Feel free to reopen, @samshaffer97 or anyone, if you feel this needs it.