rstudio / rmarkdown

Dynamic Documents for R
https://rmarkdown.rstudio.com
GNU General Public License v3.0

have render(...) infer output_format when output_file specified #1569

Closed pearsonca closed 3 years ago

pearsonca commented 5 years ago

This is a suggestion, to solve a minor annoyance for me and maybe help some other people.

I mostly use R from the command line rather than via RStudio. I am working on integrating some rmarkdown report generation into a make-based workflow:

report-%.pdf: report.Rmd # ... some other dependencies set by % wildcard
    R -e "rmarkdown::render('$<', output_file='$@', output_format='pdf_document')" --args $(filter-out $<,$^)

Translating a bit for those less familiar with make: this specifies how to create a target file named report-X.pdf (or report-Y.pdf, report-something.pdf, etc.) from report.Rmd and other dependencies. $< resolves to report.Rmd and $@ to the name of the target. I was slightly surprised to discover that I had to manually specify output_format=... for this to work. When it is unspecified, I get the error:

> rmarkdown::render('report.Rmd', output_file='report-test.pdf')

processing file: report.Rmd
  |.................................................................| 100%
  ordinary text without R code

output file: report.knit.md

/usr/bin/pandoc +RTS -K512m -RTS report.utf8.md --to html4 --from markdown+autolink_bare_uris+ascii_identifiers+tex_math_single_backslash+smart --output report-test.pdf --email-obfuscation none --self-contained --standalone --section-divs --template /home/carl/libs/Rpackages/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /tmp/RtmppWfeua/rmarkdown-str602f78928780.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --metadata pagetitle=report.utf8.md
cannot produce pdf output from html4
Error: pandoc document conversion failed with error 1
Execution halted
make: *** [Makefile:32: report-test.pdf] Error 1

Obviously, I have the solution to avoid that error. However, I'd like my make rule to be:

report-%.pdf report-%.html report-%.docx: report.Rmd # ... some other dependencies set by % wildcard
    R -e "rmarkdown::render('$<', output_file='$@')" --args $(filter-out $<,$^)

This would let the user build whichever target type they want while keeping the generation logic concise and consistent. It would also keep the inference logic inside the library, where details such as docx corresponding to word_document are more readily known, and it would minimize external changes if rmarkdown's internal definitions are tweaked in the future.

Seems like something akin to

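# ...
# Only infer when the caller gave output_file but no output_format.
if (missing(output_format) && !missing(output_file)) {
  # Extract the extension; doc/docx correspond to word_document.
  ext <- gsub("^.+\\.(pdf|doc.*|html)$", "\\1", output_file)
  output_format <- sprintf("%s_document", gsub("doc.*", "word", ext))
}
# ...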

in the render logic could do the trick, but I haven't dug into the details to see what other conflicts there might be, etc.
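
To make the idea concrete, here is a minimal standalone sketch of the same inference using an explicit lookup table, so that unknown extensions are left alone instead of being turned into a bogus format name. The function name infer_output_format and the set of extensions it covers are assumptions for illustration; nothing like this is confirmed to exist in rmarkdown.

```r
# Hypothetical helper, not part of rmarkdown: map an output file's
# extension to the name of a built-in rmarkdown format.
infer_output_format <- function(output_file) {
  # Lookup table: file extension -> format name (illustrative subset).
  formats <- c(
    pdf  = "pdf_document",
    html = "html_document",
    docx = "word_document"
  )
  ext <- tolower(tools::file_ext(output_file))
  # Return NULL for unknown extensions so the caller can fall back
  # to whatever the default behaviour is.
  if (ext %in% names(formats)) formats[[ext]] else NULL
}

infer_output_format("report-test.pdf")  # "pdf_document"
infer_output_format("report-x.docx")    # "word_document"
infer_output_format("notes.txt")        # NULL
```

Outside the library, a make rule could pass output_format = infer_output_format('$@') to rmarkdown::render; since output_format defaults to NULL there, an unrecognized extension would simply keep the current behaviour.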

cderv commented 5 years ago

By default, rmarkdown::render uses the first format defined in the YAML header if no output_format is specified. I believe that if you want to render a PDF, you would have pdf_document in the YAML header. Is this the case? What is the content of the YAML header in your report.Rmd file?

Your example is not fully reproducible, as we don't have this file. Thanks!

pearsonca commented 5 years ago

In my setup, the YAML header deliberately does not specify an output format. The test file that produced those results (failing when nothing is supplied, and working when pdf_document is supplied) is literally:

---
---

Lorem.

cderv commented 5 years ago

OK, so in this case, as mentioned in the documentation and seen in the code, when no output format is provided in the YAML header, the default is html_document. Hence the error.
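
The fallback order described in this thread can be sketched roughly as follows. This is only an illustration of the described behaviour, not rmarkdown's actual code; resolve_output_format and its yaml_formats argument are hypothetical names.

```r
# Hypothetical sketch of the fallback order described in this thread;
# this is not rmarkdown's actual implementation.
resolve_output_format <- function(output_format = NULL,
                                  yaml_formats = character()) {
  # 1. An explicitly supplied output_format wins.
  if (!is.null(output_format)) return(output_format)
  # 2. Otherwise use the first format listed in the YAML header.
  if (length(yaml_formats) > 0) return(yaml_formats[[1]])
  # 3. With nothing specified anywhere, fall back to HTML.
  "html_document"
}

resolve_output_format(yaml_formats = c("pdf_document", "word_document"))
resolve_output_format()  # empty YAML header, no argument
```

With an empty YAML header and no argument, the third branch applies, which matches the --to html4 pandoc call in the log above even though the output file ends in .pdf.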

I don't know whether anyone relies on this default. Inferring the format from output_file when no format is provided would be nice. I like the idea, but I don't know the potential conflicts either. 🤔

dushoff commented 4 years ago

As far as I can tell, the described behaviour is that when output_file is a pdf type and output_format is not specified, knit produces an html file and feeds it to pandoc, which then chokes.

This doesn't seem like the sort of thing robust code should be relying on.

I would find the suggested change to be convenient, personally, for exactly the reasons described by the suggester.

github-actions[bot] commented 3 years ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.