Open madlogos opened 6 years ago
When I try to open a Word document whose path contains non-ASCII characters (e.g., Chinese characters):
wdApp <- COMCreate("Word.Application")
doc <- wdApp[["Documents"]]$Open(file)
(the full path stored in file contains Chinese characters)
RDCOMClient then returns the following error:
<checkErrorInfo> 80020009
No support for InterfaceSupportsErrorInfo
checkErrorInfo -2147352567
Error: Exception occurred.
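When reproducing a failure like this, it can help to first confirm that the path really contains non-ASCII characters and to see how R has marked its encoding. The following is only a minimal diagnostic sketch in base R; the example path and the `has_non_ascii` helper are hypothetical and not part of the original report, and the check does not by itself fix the COM error.

```r
# Hypothetical path for illustration; the report does not give the real one.
file <- "C:/docs/\u6d4b\u8bd5.docx"

# Hypothetical helper: TRUE for each string that contains at least one
# character outside the 7-bit ASCII range.
has_non_ascii <- function(x) {
  vapply(
    x,
    function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

print(has_non_ascii(file))  # does the path contain non-ASCII characters?
print(Encoding(file))       # encoding mark R has attached to the string
```

Including both results alongside the sessionInfo output makes it easier to tell whether the problem depends on the string's encoding mark or on the session locale.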
I also found that RDCOMClient fails to handle non-ASCII characters in many other scenarios.
Suppose I have a PowerPoint presentation with a TextFrame (title) on slide 1. I should be able to extract the text using:
ppt <- COMCreate("Powerpoint.Application")
pres <- ppt[["Presentations"]]$Open(pptfile)
slide <- pres$Slides(1)
tf <- slide$Shapes(1)
print(tf$TextFrame()$TextRange$Text())
If the text is in English, the code above works well. But if it contains any non-ASCII characters, I just get an empty string ("").
Could you please look into it? Many thanks.
sessionInfo:
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] aseshms_0.1-15 RDCOMClient_0.93-0

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.4 slam_0.1-43 NLP_0.1-11 reshape2_1.4.3 purrr_0.2.4 haven_1.1.1
 [7] ggthemes_3.5.0 rJava_0.9-9 XLConnect_0.2-15 tcltk_3.5.0 colorspace_1.3-2 htmltools_0.3.6
[13] yaml_2.1.19 rlang_0.2.0 recharts_0.2 pillar_1.2.2 foreign_0.8-70 glue_1.2.0
[19] withr_2.1.2 RColorBrewer_1.1-2 readxl_1.1.0 bindrcpp_0.2.2 foreach_1.4.4 plyr_1.8.4
[25] bindr_0.1.1 stringr_1.3.1 gWidgets2tcltk_1.0-5 cellranger_1.1.0 munsell_0.4.3 gtable_0.2.0
[31] htmlwidgets_1.2 devtools_1.13.5 codetools_0.2-15 memoise_1.1.0 forcats_0.3.0 rio_0.5.10
[37] knitr_1.20 extrafont_0.17 doParallel_1.0.11 tm_0.7-3 parallel_3.5.0 curl_3.2
[43] Rttf2pt1_1.3.6 gWidgets2_1.0-7 Rcpp_0.12.17 readr_1.1.1 XLConnectJars_0.2-15 scales_0.5.0
[49] DT_0.4 jsonlite_1.5 ggplot2_2.2.1 hms_0.4.2 digest_0.6.15 stringi_1.2.2
[55] openxlsx_4.0.17 dplyr_0.7.5 grid_3.5.0 tools_3.5.0 magrittr_1.5 lazyeval_0.2.1
[61] tibble_1.4.2 extrafontdb_1.0 pkgconfig_2.0.1 RODBC_1.3-15 data.table_1.11.2 xml2_1.2.0
[67] assertthat_0.2.0 iterators_1.0.9 R6_2.2.2 compiler_3.5.0