Rapporter / pander

An R Pandoc Writer: Convert arbitrary R objects into markdown
http://rapporter.github.io/pander/
Open Software License 3.0

Let "grid"-styled `pandoc.table`s accept empty lines in cells #357

Open DaniMori opened 2 years ago

DaniMori commented 2 years ago

In the following example

library(pander)

sample_df <- data.frame(
  case  = c(
    "A very large text line that will be split into two lines within a cell",
    "Short text"
  ),
  lines = c("multiple lines\n\nwith line breaks", "more\n\nline breaks")
)

sample_df |> pandoc.table(
    style            = "grid",
    justify          = "ll",
    split.tables     = Inf,
    keep.line.breaks = TRUE
  )
#> 
#> 
#> +------------------------------+------------------+
#> | case                         | lines            |
#> +==============================+==================+
#> | A very large text line that  | multiple lines   |
#> | will be split into two lines | with line breaks |
#> | within a cell                |                  |
#> +------------------------------+------------------+
#> | Short text                   | more             |
#> |                              | line breaks      |
#> +------------------------------+------------------+

Created on 2022-04-19 by the reprex package (v2.0.1)

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.3 (2022-03-10)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Spanish_Spain.1252
#>  ctype    Spanish_Spain.1252
#>  tz       Europe/Paris
#>  date     2022-04-19
#>
#> - Packages -------------------------------------------------------------------
#>  ! package     * version date       lib source
#>  P cli           3.0.1   2021-07-17 [?] CRAN (R 4.1.0)
#>  P digest        0.6.27  2020-10-24 [?] CRAN (R 4.1.0)
#>  P evaluate      0.14    2019-05-28 [?] CRAN (R 4.1.0)
#>  P fastmap       1.1.0   2021-01-25 [?] CRAN (R 4.1.0)
#>  P fs            1.5.0   2020-07-31 [?] CRAN (R 4.1.0)
#>  P glue          1.4.2   2020-08-27 [?] CRAN (R 4.1.0)
#>  P highr         0.9     2021-04-16 [?] CRAN (R 4.1.0)
#>  P htmltools     0.5.2   2021-08-25 [?] CRAN (R 4.1.1)
#>  P knitr         1.33    2021-04-24 [?] CRAN (R 4.1.0)
#>  P magrittr      2.0.1   2020-11-17 [?] CRAN (R 4.1.0)
#>  P pander      * 0.6.5   2022-03-18 [?] CRAN (R 4.1.3)
#>  P Rcpp          1.0.7   2021-07-07 [?] CRAN (R 4.1.0)
#>  P reprex        2.0.1   2021-08-05 [?] CRAN (R 4.1.1)
#>  P rlang         0.4.11  2021-04-30 [?] CRAN (R 4.1.0)
#>  P rmarkdown     2.10    2021-08-06 [?] CRAN (R 4.1.1)
#>  P rstudioapi    0.13    2020-11-12 [?] CRAN (R 4.1.0)
#>  P sessioninfo   1.1.1   2018-11-05 [?] CRAN (R 4.1.0)
#>  P stringi       1.7.4   2021-08-25 [?] CRAN (R 4.1.1)
#>  P stringr       1.4.0   2019-02-10 [?] CRAN (R 4.1.0)
#>  P withr         2.4.2   2021-04-18 [?] CRAN (R 4.1.0)
#>  P xfun          0.25    2021-08-06 [?] CRAN (R 4.1.1)
#>  P yaml          2.2.1   2020-02-01 [?] CRAN (R 4.1.0)
#>
#> [1] C:/Users/Mori.P16/OneDrive - UAM/Workspace/R_bug_reporting_and_help/renv/library/R-4.1/x86_64-w64-mingw32
#> [2] C:/Users/Mori.P16/AppData/Local/Temp/RtmpcTg76S/renv-system-library
#> [3] C:/Program Files/R/R-4.1.3/library
#>
#>  P -- Loaded and on-disk path mismatch.
```

It can be seen that `pandoc.table` strips the empty lines inside a character value (the consecutive newlines), even with `keep.line.breaks = TRUE`. Thus it gives the following result:

+------------------------------+------------------+
| case                         | lines            |
+==============================+==================+
| A very large text line that  | multiple lines   |
| will be split into two lines | with line breaks |
| within a cell                |                  |
+------------------------------+------------------+
| Short text                   | more             |
|                              | line breaks      |
+------------------------------+------------------+

What I would like to get instead is the following:

+------------------------------+------------------+
| case                         | lines            |
+==============================+==================+
| A very large text line that  | multiple lines   |
| will be split into two lines |                  |
| within a cell                | with line breaks |
+------------------------------+------------------+
| Short text                   | more             |
|                              |                  |
|                              | line breaks      |
+------------------------------+------------------+

Would it be possible to make it work like that, or to add a parameter that turns the stripping of extra newlines on or off?
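To make the request concrete, here is a minimal sketch of the row expansion I have in mind, in plain base R. It is only an illustration: `expand_row_keep_blank()` and its arguments are hypothetical names, it does no word wrapping, and it is not how pander builds its rows.

```r
# Hypothetical helper, NOT part of pander: render one grid-table row while
# keeping the empty lines inside each cell.
expand_row_keep_blank <- function(cells, widths) {
  stopifnot(length(cells) == length(widths))
  # Splitting on a single newline keeps interior empty strings,
  # so "a\n\nb" becomes c("a", "", "b").
  lines  <- strsplit(cells, "\n", fixed = TRUE)
  height <- max(lengths(lines), 1L)
  vapply(seq_len(height), function(i) {
    parts <- mapply(function(l, w) {
      # Cells with fewer lines than the tallest cell are filled with blanks.
      txt <- if (i <= length(l)) l[i] else ""
      format(txt, width = w)  # left-justified, padded to the column width
    }, lines, widths)
    paste0("| ", paste(parts, collapse = " | "), " |")
  }, character(1))
}

# Column widths (28 and 16) are taken from the grid table in the example.
rows <- expand_row_keep_blank(
  c("Short text", "more\n\nline breaks"),
  widths = c(28, 16)
)
cat(rows, sep = "\n")
```

The point of the sketch is that the empty string between two consecutive newlines survives the split, so it ends up as a padded, empty line in the cell instead of being dropped.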

daroczig commented 2 years ago

Hm, off the top of my head I'm not exactly sure why it works that way or whether it was intentional, but I wonder if using a hard line break with a backslash might be a good enough workaround?

sample_df <- data.frame(
  case  = c(
    "A very large text line that will be split into two lines within a cell",
    "Short text"
  ),
  lines = c("multiple lines\n\\\nwith line breaks", "more\n\nline breaks"))

sample_df |> pandoc.table(
    style            = "grid",
    justify          = "ll",
    split.tables     = Inf,
    keep.line.breaks = TRUE
  )

Resulting in

+------------------------------+------------------+
| case                         | lines            |
+==============================+==================+
| A very large text line that  | multiple lines   |
| will be split into two lines | \                |
| within a cell                | with line breaks |
+------------------------------+------------------+
| Short text                   | more             |
|                              | line breaks      |
+------------------------------+------------------+

DaniMori commented 2 years ago

That's a nice workaround indeed! It's funny that it doesn't actually separate the text into two paragraphs, but it does insert a line break, which works just as well. And by the way, it works just as well without the first newline character, i.e.:

sample_df <- data.frame(
  case  = c(
    "A very large text line that will be split into two lines within a cell",
    "Short text"
  ),
  lines = c("multiple lines\\\nwith line breaks", "more\n\nline breaks"))

There is just a slight difference in the line spacing when rendering to HTML, as you can see here:

[Image: HTML output compared, showing that the line spacing in the output is smaller than in the expected result]

But that would do for me. I'm not sure whether the maintainers would still want to look into the newline handling, so I'll leave the issue open, just in case.
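In case it helps anyone else, the workaround can be applied to a whole column without editing each string by hand. This is a minimal sketch: `blank_to_hard_break()` is a hypothetical helper name, not a pander function, and it only handles blank lines written as exactly two consecutive newlines.

```r
# Hypothetical helper, NOT part of pander: replace each blank line (two
# consecutive newlines) with a backslash followed by a newline, which pandoc
# reads as a hard line break.
blank_to_hard_break <- function(x) {
  # fixed = TRUE treats both pattern and replacement literally, so the
  # replacement is one backslash character plus one newline.
  gsub("\n\n", "\\\n", x, fixed = TRUE)
}

sample_df <- data.frame(
  case  = c(
    "A very large text line that will be split into two lines within a cell",
    "Short text"
  ),
  lines = c("multiple lines\n\nwith line breaks", "more\n\nline breaks")
)
sample_df$lines <- blank_to_hard_break(sample_df$lines)

# Only render the table if pander is installed.
if (requireNamespace("pander", quietly = TRUE)) {
  pander::pandoc.table(
    sample_df,
    style            = "grid",
    justify          = "ll",
    split.tables     = Inf,
    keep.line.breaks = TRUE
  )
}
```

This produces the variant without the first newline, so the same difference in line spacing applies when rendering to HTML.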

DaniMori commented 2 years ago

For the record, why and when this happens seems rather difficult to track down. I lose track at the call to `tableExpand_cpp`, which, as I understand it, calls compiled C++ code. This function takes a table row as input and returns it already rendered as markdown (with the extra newlines stripped), so the fix probably belongs there.