ardata-fr / mschart

mschart: office charts from R
https://ardata-fr.github.io/mschart

Error when exporting to Word charts with diacritical marks/accent marks on axis labels. #57

Closed: MaxSerna closed this issue 3 years ago

MaxSerna commented 3 years ago

I'm having an issue when trying to create bar charts whose axis labels have accent marks. I first had the same issue when writing paragraphs with accented words, but I managed to solve it by calling the enc2utf8() function, as shown in the first example: enc2utf8('Téstíng testing aé').

# devtools::install_github("davidgohel/officer")
# devtools::install_github("ardata-fr/mschart")
library(officer)
library(mschart)

# Does work ----------------------------------------------------

my_barchart <- ms_barchart(data = browser_data,
                           x = "browser", y = "value", group = "serie")
my_barchart <- chart_settings( my_barchart, grouping = "stacked",
                               overlap = 100)

doc <- read_docx()
doc <- body_add_chart(doc, chart = my_barchart)
doc <- body_add_par(doc, enc2utf8('Téstíng testing aé'))
print(doc, target = "barchart_example.docx")

This works fine.

Unfortunately, that doesn't work with chart axis labels. In the following example, I replaced "Opera" with "Ópera" in the data, again using enc2utf8(). (It does not work without enc2utf8() either.)

test <- browser_data
test[test$browser=='Opera', 1] <- enc2utf8('Ópera') # add an accent

> head(test)
  browser  serie value
1  Chrome serie1     1
2      IE serie1     2
3 Firefox serie1     3
4  Safari serie1     4
5   Ópera serie1     5
6 Android serie1     6
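
Before building the chart, it can help to confirm how the label itself is encoded. The following is a minimal sketch using only base R; the variable names `label` and `label_latin1` are illustrative, and the `\u00d3` escape is used so that the snippet does not depend on the encoding of the script file.

```r
# Build the accented label with a Unicode escape so the result does not
# depend on how this script file itself is encoded.
label <- enc2utf8("\u00d3pera")

Encoding(label)   # declared encoding of the string
validUTF8(label)  # TRUE if the bytes form valid UTF-8
charToRaw(label)  # UTF-8 stores the accented capital O as two bytes (c3 93)

# The same label converted to latin1 stores that letter as a single byte (d3),
# which is not a valid UTF-8 sequence on its own.
label_latin1 <- iconv(label, from = "UTF-8", to = "latin1")
charToRaw(label_latin1)
validUTF8(label_latin1)
```

If the label is valid UTF-8 at this point, the string passed to ms_barchart() is probably not the problem, and the encoding is more likely lost when the file is written.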

Then I ran the same code. I just omitted the paragraph and changed the data argument in the ms_barchart() function.

# Does not work ------------------------------------------------
my_barchart <- ms_barchart(data = test,
                           x = "browser", y = "value", group = "serie")
my_barchart <- chart_settings( my_barchart, grouping = "stacked",
                               overlap = 100 )

doc <- read_docx()
doc <- body_add_chart(doc, chart = my_barchart)
print(doc, target = "barchart_example2.docx")

But I get this error when I try to open the file:

"We're sorry. We can't open XXX.docx because we found a problem with its contents", and no details available.
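
Since Word gives no details, one way to narrow this down is to unzip the .docx (it is a zip archive of XML parts) and check which parts are not valid UTF-8. The sketch below uses only base R; `invalid_utf8_files()` is a hypothetical helper written for this illustration, not part of officer or mschart, and the demo files stand in for an unzipped document.

```r
# Hypothetical helper: list the XML files under `dir` whose bytes are not
# valid UTF-8.
invalid_utf8_files <- function(dir, pattern = "\\.xml$") {
  files <- list.files(dir, pattern = pattern, recursive = TRUE, full.names = TRUE)
  is_valid <- vapply(files, function(f) {
    bytes <- readBin(f, what = "raw", n = file.size(f))
    validUTF8(rawToChar(bytes))
  }, logical(1))
  files[!is_valid]
}

# Demo on two stand-in files: one with a latin1 byte, one in UTF-8.
demo_dir <- tempfile()
dir.create(demo_dir)
writeBin(as.raw(c(0xd3, 0x70, 0x65, 0x72, 0x61)), file.path(demo_dir, "bad.xml"))
writeBin(as.raw(c(0xc3, 0x93, 0x70, 0x65, 0x72, 0x61)), file.path(demo_dir, "good.xml"))
basename(invalid_utf8_files(demo_dir))

# With a real document, unzip first and then run the check:
# unzipped <- tempfile()
# utils::unzip("barchart_example2.docx", exdir = unzipped)
# invalid_utf8_files(unzipped)
```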

This is my session info

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=Spanish_Mexico.1252  LC_CTYPE=Spanish_Mexico.1252    LC_MONETARY=Spanish_Mexico.1252 LC_NUMERIC=C                   
[5] LC_TIME=Spanish_Mexico.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mschart_0.2.5      officer_0.3.15.005

loaded via a namespace (and not attached):
 [1] compiler_4.0.2    R6_2.4.1          magrittr_1.5      htmltools_0.5.0   tools_4.0.2       cellranger_1.1.0  uuid_0.1-4       
 [8] xml2_1.3.2        writexl_1.3.1     data.table_1.13.0 digest_0.6.25     zip_2.1.1         rlang_0.4.8  

I did check the issues section to find a solution, and found one for body_add_par(), but I wasn't able to find anything regarding charts. Thanks in advance!

MaxSerna commented 3 years ago

Note:

I can open the created chart by calling print() with the preview argument, and check that the word "Ópera" is there. I just cannot export it to Word.

print(my_barchart, preview = TRUE)

sda030 commented 3 years ago

The same issue applies to other non-ASCII characters, such as the Nordic letters æøåÆØÅ.

sda030 commented 3 years ago

Here is my quick-and-dirty post hoc fix for this problem, in case it helps anyone.

# Requires officer (for print() on rdocx objects) and readr.
print2.rdocx <- function(x, target = file.path(getwd(), "tmp.docx")) {
    # Write the document to a temporary .docx, unzip it, recode the chart
    # XML files from latin1 to UTF-8, zip everything again and copy to target.
    current_wd <- getwd()
    on.exit(setwd(current_wd), add = TRUE)  # restore the working directory even on error
    dir.create(tmp_zip_dir <- tempfile())
    tmp_zip <- tempfile(fileext = ".docx")
    print(x, target = tmp_zip)

    out <- utils::unzip(zipfile = tmp_zip, exdir = tmp_zip_dir)
    # Keep only the chart XML parts
    out <- grep(pattern = "charts\\/.*\\.xml$", x = out, value = TRUE)
    lapply(out, function(xml_file) {
            xml_content <- readr::read_file(xml_file)
            xml_content <- iconv(x = xml_content, from = "latin1", to = "UTF-8")
            readr::write_file(x = xml_content, file = xml_file, append = FALSE)
        })

    #### Replace with officer::pack_folder(folder=tmp_zip_dir, target=file.path(path, paste0(file_prefix, ".docx")))
    # utils::zip() stores paths relative to the working directory, hence setwd()
    setwd(tmp_zip_dir)
    out <- list.files(path = tmp_zip_dir, all.files = TRUE, recursive = TRUE, include.dirs = FALSE)
    utils::zip(files = out, zipfile = tmp_zip)
    # Switch back before `target` is evaluated, so its default resolves against the original directory
    setwd(current_wd)
    file.copy(from = tmp_zip, to = target, overwrite = TRUE, copy.date = TRUE)
    target
}

Created on 2021-04-09 by the reprex package (v2.0.0)
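
The workaround above converts every chart file from latin1 unconditionally, so a file that is already UTF-8 would be converted a second time and its accented characters garbled. A possible guard, sketched here with base R only, is to recode a file only when its bytes are not already valid UTF-8. `recode_if_not_utf8()` is a hypothetical helper for illustration, and the default `from = "latin1"` is an assumption carried over from the workaround.

```r
# Hypothetical helper: recode `path` to UTF-8 in place, but only if its
# contents are not already valid UTF-8. Returns TRUE (invisibly) when the
# file was rewritten and FALSE when it was left untouched.
recode_if_not_utf8 <- function(path, from = "latin1") {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  txt <- rawToChar(bytes)
  if (validUTF8(txt)) {
    return(invisible(FALSE))
  }
  converted <- iconv(txt, from = from, to = "UTF-8")
  if (is.na(converted)) {
    stop("Could not convert ", path, " from ", from, " to UTF-8")
  }
  writeBin(charToRaw(converted), path)
  invisible(TRUE)
}

# Demo: a stand-in file that stores an accented capital O as the latin1 byte d3.
demo_file <- tempfile(fileext = ".xml")
writeBin(as.raw(c(0xd3, 0x70, 0x65, 0x72, 0x61)), demo_file)
first_pass  <- recode_if_not_utf8(demo_file)  # file is rewritten as UTF-8
second_pass <- recode_if_not_utf8(demo_file)  # file is already UTF-8, left as is
```

Inside the workaround, such a helper could take the place of the read, iconv() and write steps in the lapply() call, which would also remove the readr dependency.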

davidgohel commented 3 years ago

I am working on the package - there are things to fix and improve first. I will then fix that issue.

davidgohel commented 3 years ago

This is fixed now. Thanks for reporting this issue.