Closed maelle closed 4 years ago
What do you mean by pandoc vs markdown formatted table?
This made me curious, so I decided to dig into it.
I think what you observe is not as simple as described. Here are the steps of my investigation.
The first thing I tried was to render a very small document with the `hugo_document` format:
---
title: "test"
output: hugodown::hugo_document
---
```{r}
knitr::kable(head(mtcars, 2))
```
I rendered it with `rmarkdown::render("test.Rmd", clean = FALSE)` so that I get the intermediate knitted files.
The pandoc command line used is this one:
```sh
"C:/Users/chris/scoop/apps/rstudio/current/bin/pandoc/pandoc" +RTS -K512m -RTS test.utf8.md --to markdown_strict+pipe_tables+strikeout+autolink_bare_uris+task_lists+backtick_code_blocks+definition_lists+footnotes+smart+tex_math_dollars --from markdown+autolink_bare_uris+tex_math_single_backslash --output test.md --wrap=none
```
If I open the intermediate file (`test.utf8.md`, the one produced by the `knitr::knit()` step), I can see this table:
mpg cyl disp hp drat wt qsec vs am gear carb
-------------- ---- ---- ----- ---- ----- ------ ------ --- --- ----- -----
Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
If I open the resulting file `test.md`, I see a markdown table looking like this:
mpg cyl disp hp drat wt qsec vs am gear carb
-------------- ---- ---- ----- ---- ----- ------ ------ --- --- ----- -----
Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
So the table is not converted, even though the command line has the correct extensions.
I wanted to test whether the post-processing could be the cause, so I ran the previous pandoc command line directly in the folder containing `test.Rmd` and the intermediate files.
You get a document (`test.md`) with this table format (NOTE: this overwrites the previous document):
| | mpg| cyl| disp| hp| drat| wt| qsec| vs| am| gear| carb|
|---------------|----:|----:|-----:|----:|-----:|------:|------:|----:|----:|-----:|-----:|
| Mazda RX4 | 21| 6| 160| 110| 3.9| 2.620| 16.46| 0| 1| 4| 4|
| Mazda RX4 Wag | 21| 6| 160| 110| 3.9| 2.875| 17.02| 0| 1| 4| 4|
It is the correct one. The issue seems to come from the `post_processor` in `hugo_document`.
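The manual pandoc run described above can also be approximated from R. This is only a sketch: it uses temporary stand-in files rather than `test.utf8.md` and `test.md`, a shortened list of extensions, and it assumes a knitr version that supports `format = "simple"`. It shows that pandoc on its own turns a simple table into a pipe table when `pipe_tables` is enabled.

```r
# Sketch: convert a pandoc "simple" table to a pipe table, bypassing any
# output-format post-processor. Both files are temporary stand-ins.
input  <- tempfile(fileext = ".md")
output <- tempfile(fileext = ".md")

# Write a simple table like the one found in the intermediate file.
writeLines(as.character(knitr::kable(head(mtcars, 2), format = "simple")), input)

# Only run the conversion if rmarkdown can find a pandoc binary.
if (rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(
    input,
    from    = "markdown",
    to      = "markdown_strict+pipe_tables",
    output  = output,
    options = "--wrap=none"
  )
  # Print the converted table to compare it with the input.
  writeLines(readLines(output))
}
```

If the printed lines start with `|`, pandoc did its job and any difference seen after `rmarkdown::render()` has to come from a later step.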
When I look at it, I think the issue comes from here https://github.com/r-lib/hugodown/blob/51d365b5b4ad4b7146120dde323ab742d606c97e/R/hugo-format.R#L88-L92
Currently, the `input_file` is read by `brio::read_lines()`, modified, and written back by `brio::write_lines()` into the `output_file`. In the rmarkdown / knitr ecosystem, at the post-processor step, `input_file` is the file passed as input to pandoc, and `output_file` is the file resulting from the pandoc conversion. This gives the post-processor function access to both.
Here, it means that the file from before the pandoc conversion is read, modified, and its body is written into the file produced by the pandoc conversion. This is why the `pipe_tables` format is not there and we get the pandoc format from the input file: the whole body of the output file is replaced by the body of the input file. The body should be the one from the output file from pandoc, right?
If I am right, I am wondering why this has not caused more weird rendering errors. 🤔
As the aim is to preserve the YAML and add some content to it, I think it should be:
meta <- yaml::as.yaml(yaml)
body <- brio::read_lines(output_file)
output_lines <- c("---", meta, "---", "", body)
brio::write_lines(output_lines, output_file)
This is similar to what the `rmarkdown::md_document` post-processor does.
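To make the difference concrete, here is a minimal base R sketch of the two behaviours. It is not the hugodown code: the function names `post_process_from_input()` and `post_process_from_output()` are hypothetical, the files are small hand-written stand-ins, and the front matter is kept as plain lines instead of going through the yaml package.

```r
# Stand-ins for the two files a post-processor receives.
input_file  <- tempfile(fileext = ".utf8.md")  # what pandoc reads
output_file <- tempfile(fileext = ".md")       # what pandoc writes

# Input: front matter plus a pandoc simple table.
writeLines(
  c("---", "title: test", "---", "",
    "  mpg   cyl", "-----  ----", "   21     6"),
  input_file
)
# Output: the same table after conversion to a pipe table.
writeLines(c("|  mpg|  cyl|", "|----:|----:|", "|   21|    6|"), output_file)

meta <- "title: test"  # front matter to restore, as plain lines

# Behaviour described above: the body is taken from the pandoc input.
post_process_from_input <- function(input_file, output_file, meta) {
  lines <- readLines(input_file)
  end   <- which(lines == "---")[2]  # closing line of the front matter
  body  <- lines[-seq_len(end)]      # everything after the front matter
  writeLines(c("---", meta, "---", body), output_file)
}

# Proposed behaviour: the body is taken from the pandoc output.
post_process_from_output <- function(output_file, meta) {
  body <- readLines(output_file)
  writeLines(c("---", meta, "---", "", body), output_file)
}

# Apply each variant to its own copy of the pandoc output.
buggy <- tempfile(fileext = ".md")
fixed <- tempfile(fileext = ".md")
file.copy(output_file, buggy)
file.copy(output_file, fixed)
post_process_from_input(input_file, buggy, meta)
post_process_from_output(fixed, meta)

has_pipe_table <- function(path) any(grepl("^\\|", readLines(path)))
has_pipe_table(buggy)  # FALSE: the simple table from the input came back
has_pipe_table(fixed)  # TRUE: the pipe table written by pandoc is kept
```

Both variants restore the front matter; only the second one keeps what pandoc actually produced.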
Doh, thanks for the investigation!!
@cderv btw it's not too surprising this wasn't causing major issues, since you'd only see problems where goldmark and pandoc disagree on formatting. And they both start from commonmark (which covers the most common syntax), so you'd only expect to see weirdness with more exotic syntax.
Wow, fantastic digging @cderv!
When I use `knitr::kable()` in a hugodown post, in the index.md I get a pandoc-formatted table, not a markdown-formatted table. I'm not sure why. :-)

I looked into how `kable()` defines its default format https://github.com/yihui/knitr/blob/683887b3169104592f3dbabb457e41aaee2af71c/R/table.R#L91-L104, the only switch seems to be a global option.

I see there's https://github.com/r-lib/hugodown/blob/51d365b5b4ad4b7146120dde323ab742d606c97e/R/hugo-format.R#L126 but it doesn't seem to be used? Or am I missing something?
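For reference, a small sketch of the two ways to choose the table format in knitr: the `format` argument of `kable()` and the `knitr.table.format` global option. It assumes a knitr version that accepts `"pipe"` and `"simple"` as format names, and it only illustrates the switch; it says nothing about what hugodown should do.

```r
tbl <- head(mtcars, 2)

# Ask for a specific format explicitly.
pipe_tbl   <- knitr::kable(tbl, format = "pipe")    # markdown pipe table
simple_tbl <- knitr::kable(tbl, format = "simple")  # pandoc simple table
print(pipe_tbl)
print(simple_tbl)

# Or set the global option, which kable() consults when `format` is missing.
old <- options(knitr.table.format = "pipe")
default_tbl <- knitr::kable(tbl)
options(old)  # restore the previous setting
print(default_tbl)
```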