Rapporter / pander

An R Pandoc Writer: Convert arbitrary R objects into markdown
http://rapporter.github.io/pander/
Open Software License 3.0

Incorrect table cell width with Unicode soft-hyphen character #368

Open billdenney opened 8 months ago


This may be related to #18, #280, or #296.

When a cell contains several Unicode soft-hyphen characters (\u00ad), pandoc.table.return() misaligns the rest of that row: the cells after the soft hyphens are shifted relative to the other rows. In my example, the soft hyphens are in a row name.

library(pander)

d <-
  data.frame(
    a = LETTERS[1:2],
    b = paste(letters, collapse = "")
  )
# insert soft hyphen
rownames(d) <- c("a", "w\u00adx\u00ady\u00adz")

# Second row of table does not align
cat(
  pandoc.table.return(d)
)
#> 
#> ----------------------------------------------
#>    &nbsp;      a               b              
#> ------------- --- ----------------------------
#>     **a**      A   abcdefghijklmnopqrstuvwxyz 
#> 
#>  **w­x­y­z**   B   abcdefghijklmnopqrstuvwxyz 
#> ----------------------------------------------

Created on 2023-12-09 with reprex v2.0.2

nchar() counts the soft-hyphen character as having a width of 1, but it appears to take up a width of zero later in the process (a soft hyphen is normally invisible when rendered). My guess is that this discrepancy is the source of the misalignment.
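
A minimal sketch for checking that guess in base R. It compares the ways nchar() can measure the row name from the example, then counts characters with the soft hyphens removed. The helper nchar_no_shy() is hypothetical and is not part of pander; which nchar() type pander uses internally is not confirmed here, and the "width" result may depend on the R version and platform, so no expected output is shown for it.

```r
# Soft hyphen (U+00AD), the character used in the row name above
shy <- "\u00ad"
label <- paste("w", "x", "y", "z", sep = shy)

# Measure the same string in the three ways base R offers.
# The "width" value is deliberately not annotated: it may vary by
# R version and platform.
widths <- c(
  chars = nchar(label, type = "chars"),
  width = nchar(label, type = "width"),
  bytes = nchar(label, type = "bytes")
)
print(widths)

# Hypothetical helper (NOT part of pander): count characters while
# treating soft hyphens as zero-width by dropping them first.
nchar_no_shy <- function(x) {
  nchar(gsub(shy, "", x, fixed = TRUE), type = "chars")
}
print(nchar_no_shy(label))
```

If the "chars" count is larger than the number of columns the label visibly occupies, padding computed from that count would come up short by one space per soft hyphen, which would match the misaligned row in the output above.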