Closed GegznaV closed 5 years ago
@RomanTsegelskyi any ideas regarding the Rcpp error message when using R 3.4.0?
I can confirm this bug in the Spanish locale, but I don't get an error; the output is just wrongly encoded. The bug disappears in R 3.3.3.
For me it is the opposite: colnames and rownames work fine, but the data is incorrectly encoded in the output.
This only happens on Windows, and it appears to happen in both R 3.4.0 and 3.4.1. It also does not happen if I switch back to 3.3.3.
> df <- cbind("á" = "á", "é" = "é", "ç" = "ç")
> rownames(df) <- "ã"
> pander::pander(df)
-------------------
á é ç
------- --- --- ---
**ã** á é ç
-------------------
I have access to a linux box, and it does not happen in linux (which uses UTF-8).
I tried setting Encoding() and it still comes out wrong (albeit differently).
> Encoding(df)
[1] "latin1" "latin1" "latin1"
> Encoding(df) <- "UTF-8"
> Encoding(df)
[1] "UTF-8" "UTF-8" "UTF-8"
> pander::pander(df)
----------------------
á é ç
------- ---- ---- ----
**ã** <e1> <e9> <e7>
----------------------
Trying enc2native() has no effect (the workaround from issue #280).
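For anyone puzzled by the <e1> <e9> <e7> output above: Encoding<- only changes the declared mark on a string, it does not convert the bytes. The following is a minimal base R sketch of that difference (the variable names are only illustrative, and it does not reproduce the pander bug itself). It likely explains why relabelling latin1 data as "UTF-8" prints raw byte escapes.

```r
# Written with \u escapes so the example does not depend on the file encoding.
x <- "\u00e1"                       # the character "á", stored as UTF-8
x_latin1 <- iconv(x, from = "UTF-8", to = "latin1")

charToRaw(x)                        # two bytes in UTF-8: c3 a1
charToRaw(x_latin1)                 # one byte in latin1: e1

# Relabelling: the byte e1 is kept, but it is now *claimed* to be UTF-8.
relabelled <- x_latin1
Encoding(relabelled) <- "UTF-8"
validUTF8(relabelled)               # FALSE: a lone e1 is not valid UTF-8

# Converting: the bytes are actually rewritten from latin1 to UTF-8.
converted <- iconv(x_latin1, from = "latin1", to = "UTF-8")
validUTF8(converted)                # TRUE
```

So if the data really is latin1, converting it with iconv() or enc2utf8() is the operation to try, not Encoding<-. Whether pander then handles the converted strings correctly is a separate question.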
Below the session info.
> devtools::session_info()
Session info -----------------------------------------------------------------------------
setting value
version R version 3.4.0 (2017-04-21)
system x86_64, mingw32
ui RStudio (1.0.143)
language (EN)
collate Portuguese_Brazil.1252
tz America/Sao_Paulo
date 2017-07-03
Packages ---------------------------------------------------------------------------------
package * version date source
backports 1.1.0 2017-05-22 CRAN (R 3.4.0)
base * 3.4.0 2017-04-21 local
compiler 3.4.0 2017-04-21 local
datasets * 3.4.0 2017-04-21 local
devtools 1.13.2 2017-06-02 CRAN (R 3.4.1)
digest 0.6.12 2017-01-27 CRAN (R 3.4.0)
evaluate 0.10.1 2017-06-24 CRAN (R 3.4.0)
graphics * 3.4.0 2017-04-21 local
grDevices * 3.4.0 2017-04-21 local
htmltools 0.3.6 2017-04-28 CRAN (R 3.4.0)
knitr 1.16 2017-05-18 CRAN (R 3.4.0)
magrittr 1.5 2014-11-22 CRAN (R 3.4.0)
memoise 1.1.0 2017-04-21 CRAN (R 3.4.1)
methods * 3.4.0 2017-04-21 local
pander 0.6.0 2015-11-23 CRAN (R 3.4.0)
Rcpp 0.12.11 2017-05-22 CRAN (R 3.4.0)
rmarkdown 1.6 2017-06-15 CRAN (R 3.4.0)
rprojroot 1.2 2017-01-16 CRAN (R 3.4.0)
rstudioapi 0.6 2016-06-27 CRAN (R 3.4.1)
stats * 3.4.0 2017-04-21 local
stringi 1.1.5 2017-04-07 CRAN (R 3.4.0)
stringr 1.2.0 2017-02-18 CRAN (R 3.4.0)
tools 3.4.0 2017-04-21 local
utils * 3.4.0 2017-04-21 local
withr 1.0.2 2016-06-20 CRAN (R 3.4.1)
yaml 2.1.14 2016-11-12 CRAN (R 3.4.0)
I created a data.frame df in the Lithuanian locale. The object df:
df
# vidurkis PI_apatine_riba PI_virsutine_riba n
# Pagal „z“ formulę 54.9 52.4 57.3 24
# Pagal „t“ formulę 54.9 52.3 57.5 24
Then I ran pander(df):
library(pander)
debugonce(pandoc.table.return)
Sys.setlocale(locale = "Lithuanian")
df <- readRDS("df.Rds")
pander(df)
Before the code breaks in lines 582-583, the object t is created. I saved that object as "t.Rds".
# lines 582-583 in `pandoc.table.return` where the error occurs:
res <- paste0(res, paste(apply(t, 1, function(x) paste0(table.expand(x,
t.width, justify, sep.col), sep.row)), collapse = "\n"))
Other code needed to run these lines:
# Function, defined inside `pander::pandoc.table.return`
table.expand <- function(cells, cols.width, justify, sep.cols) {
.Call("pander_tableExpand_cpp", PACKAGE = "pander",
cells, cols.width, justify, sep.cols, style)
}
# Parameters before calling `table.expand`
t.width <- c(23, 10, 17, 19, 4)
justify <- c("centre", "centre", "centre", "centre", "centre")
sep.col <- c("", " ", "" )
style <- "multiline"
After leaving the debugging mode, I loaded the first row of "t.Rds" and created an analogous line as a character vector t0.
t <- readRDS("t.Rds")[1, ]
the_names <- names(t)
# The contents of `t0` are same contents as in `t`
t0 <- c("**Pagal „z“ formulę**", "54.9", "52.4", "57.3", "24")
names(t0) <- the_names
print(t0)
# t.rownames vidurkis PI_apatine_riba PI_virsutine_riba n
# "**Pagal „z“ formulę**" "54.9" "52.4" "57.3" "24"
print(t)
# t.rownames vidurkis PI_apatine_riba PI_virsutine_riba n
# "**Pagal „z“ formulę**" "54.9" "52.4" "57.3" "24"
sapply(t0, Encoding)
# t.rownames vidurkis PI_apatine_riba PI_virsutine_riba n
# "unknown" "unknown" "unknown" "unknown" "unknown"
sapply(t, Encoding)
# t.rownames vidurkis PI_apatine_riba PI_virsutine_riba n
# "UTF-8" "unknown" "unknown" "unknown" "unknown"
table.expand(t0, t.width, justify, sep.col)
# [1] " **Pagal „z“ formulę** 54.9 52.4 57.3 24 "
table.expand(t, t.width, justify, sep.col)
## Error in table.expand(t, t.width, justify, sep.col) :
## basic_string::_S_create
table.expand works fine with t0 and breaks with t. The only difference between these two objects is the encoding of the element t.rownames. Therefore it seems that the "UTF-8" mark causes the error.
Any ideas why the encoding changes and causes this problem, and how it can be fixed?
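A small diagnostic along the lines of the sapply(t, Encoding) check above may help others compare their own objects. This is only a sketch: describe_encoding is a hypothetical helper, not part of pander, and the sample vector is built with \u escapes (which R marks as "UTF-8") rather than loaded from "t.Rds".

```r
# Hypothetical helper (not part of pander): summarise, per element, the
# declared encoding mark, whether any byte is non-ASCII, and whether the
# bytes form valid UTF-8.
describe_encoding <- function(x) {
  has_non_ascii <- vapply(
    x,
    function(s) any(as.integer(charToRaw(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
  data.frame(
    mark       = Encoding(x),      # Encoding() is vectorised, no sapply needed
    non_ascii  = has_non_ascii,
    valid_utf8 = validUTF8(x),
    stringsAsFactors = FALSE
  )
}

# Stand-in for the first row of t: only the first cell has non-ASCII text.
cells <- c("**Pagal \u201ez\u201c formul\u0119**", "54.9", "52.4")
info <- describe_encoding(cells)
print(info)
```

ASCII-only strings are never given an encoding mark, which is why the numeric cells report "unknown" in both t and t0. Only the mark on the non-ASCII cell can differ between the two objects.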
p.s. This comment could be the "more information" for #280
Objects t and df.zip
Thanks a lot for the detailed info, really helpful!
@RomanTsegelskyi, would you have a chance to look into this?
Sorry I missed all the notifications before, I will try to look into this
Is there any news on this issue?
I've been exploring this issue and have found that the culprit is tableExpand_cpp
Thanks, @lselzer! @RomanTsegelskyi, any chance you might be able to look into this?
Using enc2native inside table.expand fixes this issue, though I don't know how robust this solution is. I know very little about character encoding, and I don't know how this will work with other languages such as Chinese.
I can make a PR if you are willing to accept it.
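To illustrate the idea only (this is not the actual patch): the sketch below wraps a function so that its first argument goes through enc2native() before the call. Both expand_stub and with_native_cells are made-up names, and the stub merely stands in for the compiled pander_tableExpand_cpp routine, which is not called here.

```r
# Stand-in for the compiled routine; NOT pander's implementation.
expand_stub <- function(cells, sep = " | ") {
  paste(cells, collapse = sep)
}

# Hypothetical wrapper: convert the cells to the native encoding first,
# then delegate to the wrapped function with all other arguments unchanged.
with_native_cells <- function(expand_fun) {
  function(cells, ...) {
    expand_fun(enc2native(cells), ...)
  }
}

safe_expand <- with_native_cells(expand_stub)
out <- safe_expand(c("**Pagal \u201ez\u201c formul\u0119**", "54.9", "52.4"))
print(out)
```

One caveat with this approach: enc2native() can only represent characters that exist in the session's native code page, so text outside it (for example Chinese in a 1252 locale) would presumably come out as <U+xxxx> escapes rather than the original characters.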
@lselzer, on your computer, does it solve the original issue of this thread, #296? I installed pander from your repository, but when I use the Lithuanian locale and the provided example, there is no effect on my PC (the code results in the same error).
Yes, it solves the issue. I ran your code and tried to reproduce your error, but I couldn't.
devtools::session_info()
Session info ------------------------------------------------------------------------------------------------
setting value
version R version 3.4.3 (2017-11-30)
system x86_64, mingw32
ui RStudio (1.1.383)
language (EN)
collate Lithuanian_Lithuania.1257
tz America/Buenos_Aires
date 2018-03-12
Packages ----------------------------------------------------------------------------------------------------
package * version date source
base * 3.4.3 2017-11-30 local
compiler 3.4.3 2017-11-30 local
datasets * 3.4.3 2017-11-30 local
devtools 1.13.4 2017-11-09 CRAN (R 3.4.2)
digest 0.6.15 2018-02-12 Github (eddelbuettel/digest@d9f40a9)
graphics * 3.4.3 2017-11-30 local
grDevices * 3.4.3 2017-11-30 local
memoise 1.1.0 2017-04-21 CRAN (R 3.4.0)
methods * 3.4.3 2017-11-30 local
pander 0.6.1 2018-02-14 local
Rcpp 0.12.15.1 2018-02-14 Github (RcppCore/Rcpp@15b3a87)
stats * 3.4.3 2017-11-30 local
tools 3.4.3 2017-11-30 local
utils * 3.4.3 2017-11-30 local
withr 2.1.1.9000 2017-12-22 Github (jimhester/withr@df18523)
yaml 2.1.14 2016-11-12 CRAN (R 3.4.0)
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Lithuanian_Lithuania.1257 LC_CTYPE=Lithuanian_Lithuania.1257
[3] LC_MONETARY=Lithuanian_Lithuania.1257 LC_NUMERIC=C
[5] LC_TIME=Lithuanian_Lithuania.1257
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3 withr_2.1.1.9000 rstudioapi_0.7 yaml_2.1.14 memoise_1.1.0
[7] Rcpp_0.12.15.1 pander_0.6.1 digest_0.6.15 devtools_1.13.4
Similar problems on a German Locale since switching from R3.3 to R3.4 (Windows). I just tried with R3.5, but that didn’t change anything. Seems as if things get encoded wrongly in the rownames if German characters (e.g. “Ä”) are present there, in the colnames if present there, and interestingly only in the rownames if present in rownames and colnames. I downloaded https://github.com/lselzer/pander/archive/06c2f6579740564063af7081373113daa62b1023.zip and tried to install it, but unfortunately couldn’t get it to work, so don’t know if this would change things. It would be great if a solution to this problem could be found. Here’s an example:
library(pander)
x <- data.frame(hö = c("ä", "o", "ü"))
row.names(x) <- c("A", "Ä", "C")
x
#   hö
# A  ä
# Ä  o
# C  ü
pander(x)
hö
A ä
Ä o
Hi there,
I am also experiencing encoding issues on Windows with R 3.5.1 and Pander 0.6.2.
I have been trying to insert unicode for no-break spaces to indent factor levels in my tables. Here are three examples of what I am trying to do. Example 1 uses a normal space that is ignored by Pander(); examples 2 and 3 use the unicode "\u00A0" which appears as  instead of a space.
#using a space (ignored by pander)
example<-rbind("Meals in a Typical Day", " 1", " 2", " 3", " 4 or more")
example<-cbind(example, counts=c("","5","10","25","20"))
example
pander(example)
#using unicode for no-break space
example2<-rbind("Meals in a Typical Day", "\u00A01", "\u00A02", "\u00A03", "\u00A04 or more")
example2<-cbind(example2, counts=c("","5","10","25","20"))
example2
pander(example2)
#using unicode for no-break space
example3<-rbind("Meals in a Typical Day", "\u00A0\u00A01", "\u00A0\u00A02", "\u00A0\u00A03", "\u00A0\u00A04 or more")
example3<-cbind(example3, counts=c("","5","10","25","20"))
example3
pander(example3)
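One way to see why a no-break space can turn into two characters is to look at its bytes. This is a base R sketch and only a plausible explanation, since I have not traced the actual code path inside pander. U+00A0 is two bytes in UTF-8, and if those bytes are later decoded as a single-byte encoding, each byte becomes its own character.

```r
nbsp <- "\u00A0"                    # no-break space, marked as UTF-8 by R
charToRaw(nbsp)                     # c2 a0

# Decode the same two bytes as latin1 (identical to Windows-1252 for these
# two byte values) and re-encode the result as UTF-8 for display.
misread <- iconv(nbsp, from = "latin1", to = "UTF-8")
charToRaw(misread)                  # c3 82 c2 a0: U+00C2 followed by U+00A0
```

In other words, a UTF-8 no-break space that is misread as a 1252-style encoding shows up as an extra accented letter followed by the space itself.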
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] bindrcpp_0.2.2 lubridate_1.7.4 forcats_0.3.0 stringr_1.3.1 dplyr_0.7.6 purrr_0.2.5
[7] readr_1.1.1 tidyr_0.8.1 tibble_1.4.2 ggplot2_3.0.0 tidyverse_1.2.1 VIM_4.7.0
[13] data.table_1.11.4 colorspace_1.3-2 pander_0.6.2 xtable_1.8-2 knitr_1.20 descr_1.1.4
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 xml2_1.2.0 bindr_0.1.1 magrittr_1.5 MASS_7.3-50 hms_0.4.2
[7] rvest_0.3.2 tidyselect_0.2.4 lattice_0.20-35 R6_2.2.2 rlang_0.2.1 broom_0.5.0
[13] laeken_0.4.6 rio_0.5.10 e1071_1.6-8 withr_2.1.2 modelr_0.1.2 class_7.3-14
[19] lmtest_0.9-36 assertthat_0.2.0 abind_1.4-5 digest_0.6.15 curl_3.2 haven_1.1.2
[25] sp_1.3-1 compiler_3.5.1 DEoptimR_1.0-8 cellranger_1.1.0 pillar_1.3.0 scales_0.5.0
[31] backports_1.1.2 boot_1.3-20 jsonlite_1.5 pkgconfig_2.0.1 rstudioapi_0.7 munsell_0.5.0
[37] carData_3.0-1 httr_1.3.1 plyr_1.8.4 car_3.0-0 tools_3.5.1 nnet_7.3-12
[43] vcd_1.4-4 nlme_3.1-137 gtable_0.2.0 cli_1.0.0 readxl_1.1.0 yaml_2.2.0
[49] lazyeval_0.2.1 crayon_1.3.4 zip_1.0.0 glue_1.3.0 robustbase_0.93-1.1 openxlsx_4.1.0
[55] stringi_1.1.7 foreign_0.8-70 zoo_1.8-3
I tested #326 in a Windows VM and it seems to do the trick, but please confirm.
Thanks; I downloaded and installed "pander-table-expand-fallback.zip" today. Unfortunately, for me the result is the same as before.
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252
[3] LC_MONETARY=German_Austria.1252 LC_NUMERIC=C
[5] LC_TIME=German_Austria.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] pander_0.6.2
loaded via a namespace (and not attached):
[1] compiler_3.5.1 tools_3.5.1 Rcpp_0.12.17 digest_0.6.15
Hello, I have been successful with the first commit intended to solve this issue: install_github("Rapporter/pander@06c2f6579740564063af7081373113daa62b1023") but not with the latest one: install_github("Rapporter/pander@66492997bbdc4f9766d7c4573e676fbdb9bd7def")
R version 3.5.1 (2018-07-02) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 17134)
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C
[5] LC_TIME=French_France.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base
I tried on my home computer, and can also confirm success with this: install_github("Rapporter/pander@06c2f65") (not able to test it in the office, as I am not allowed to install packages from github there)
I had an issue trying to print a data frame with cyrillic column names:
Error in table.expand(t.colnames, t.width, justify, sep.col) :
basic_string::_S_create
Installing the patch mentioned in the above comment resolved the issue. (Whereas using colnames(x) <- enc2native(colnames(x)) before the call to pander() didn't help).
I can also confirm that install_github("Rapporter/pander@06c2f65") solved my original issue (I use Windows 10 and R 3.5.2).
@daroczig, will this patch be merged into the main branch of pander? When can one expect it on CRAN?
And maybe #326 is not necessary?
I had the same issue with German, which is solved by devtools::install_github("Rapporter/pander@06c2f65"). Please merge the fix into the master release. Thank you!
@daroczig Do you plan on merging this issue? If not, pls let me know... I am holding off pushing an update of summarytools to CRAN (which will include translations) until the issue is resolved. Thx!
Sorry for the delay, getting this done today.
@daroczig It seems that the current CRAN version of pander is inferior to the GitHub version.
When is the GitHub version of pander (with this encoding bug fixed) going to be released on CRAN?
Is pander going to be updated on CRAN?
I had an issue trying to print a data frame with cyrillic column names:
Error in table.expand(t.colnames, t.width, justify, sep.col) : basic_string::_S_create
Installing the patch mentioned in the above comment resolved the issue. (Whereas using colnames(x) <- enc2native(colnames(x)) before the call to pander() didn't help).
I have a similar issue, but the patch from that comment didn't resolve it:
Error in table.expand(x, t.width, justify, sep.col) :
basic_string::_M_create
pander was updated on CRAN on 2021-06-13, so the CRAN version should include this fix. If you see any similar problems, please open a new ticket with a minimal reproducible example.
While using pander and R version 3.4.0 I faced either an error or an encoding issue:
Error message:
Result:
The problem disappears when I switch back to R 3.3.3.
Is there a way to overcome this bug without switching back to previous versions of R?