HenrikBengtsson / R.matlab

R package: R.matlab
https://cran.r-project.org/package=R.matlab

Encoding error when getting a string array using getVariable #49

Open Blockhead-yj opened 3 years ago

Blockhead-yj commented 3 years ago

Hi, everyone! I ran into a problem when retrieving a string array with the getVariable function. Example code is below. My platform is PCWIN64 and my MATLAB version is R2020b. The error is "can only read in bytes in a non-UTF-8 MBCS locale". I tested and found that the problem appears when the array is a string array rather than a char array. Since my MATLAB default encoding is 'GBK', I tried changing it to 'UTF-8' with slCharacterEncoding('UTF-8'), but that didn't help. I would appreciate any help or advice!

library(R.matlab)

Matlab$startServer()
matlab <- Matlab()
open(matlab)
evaluate(matlab,'a=["adsafdsa";"bfdgadfg"];')
a <- getVariable(matlab,'a')
close(matlab)

Blockhead-yj commented 3 years ago

PS: The encoding in RStudio is 'UTF-8', though I also tried 'WINDOWS-1252' as my Windows 10 encoding.

HenrikBengtsson commented 3 years ago

Hi, I have very little time to work on this package, but please provide what traceback() outputs immediately after you get the error, and also your sessionInfo(). This helps narrow down where in the code the problem lies.

Blockhead-yj commented 3 years ago

Sorry, I made a mistake: it was not an error but a warning.

Warning messages:
1: In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale
2: In readMat(filename) :
  strings not representable in native encoding will be translated to UTF-8
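
Because these are warnings rather than errors, traceback() has nothing to report. One way to still see which calls led to a warning is to record the call stack in a warning handler. The following is only a sketch: noisy() is a hypothetical stand-in for the getVariable() call, so that the snippet runs without a MATLAB server.

```r
# Hypothetical stand-in for a call that warns (e.g. getVariable() in this thread)
noisy <- function() {
  warning("can only read in bytes in a non-UTF-8 MBCS locale")
  "result"
}

calls <- NULL
value <- withCallingHandlers(
  noisy(),
  warning = function(w) {
    calls <<- sys.calls()          # call stack at the point of the warning
    invokeRestart("muffleWarning") # continue without printing the warning
  }
)

print(calls)  # the chain of calls that produced the warning
print(value)  # the call still returns its normal value
```

Alternatively, setting options(warn = 2) turns warnings into errors, after which traceback() can be used as usual.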

The problem is that the variable I get back is a byte array. For example,

evaluate(matlab,'a=["a";"b"];')
a <- getVariable(matlab,'a')

a looks like this:

>a
$MCOS
[1] 73 74 72 69 6e 67

[[2]]
           [,1]
[1,] -587202560
[2,]          2
[3,]          1
[4,]          1
[5,]          1
[6,]          1

[[3]]
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
[1,]    0    1   73   77    0    0    0    0   14     0     0     0     8     3     0     0     6     0     0     0     8     0     0     0
     [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47]
[1,]     2     0     0     0     0     0     0     0     5     0     0     0     8     0     0     0     1     0     0     0     1     0     0
     [,48] [,49] [,50] [,51] [,52] [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70]
[1,]     0     1     0     0     0     0     0     0     0     5     0     4     0     5     0     0     0     1     0     0     0     5     0
     [,71] [,72] [,73] [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81] [,82] [,83] [,84] [,85] [,86] [,87] [,88] [,89] [,90] [,91] [,92] [,93]
...(total 936)

attr(,"header")
attr(,"header")$description
[1] "MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Tue Dec 29 11:30:02 2020                                        \b\001"

attr(,"header")$version
[1] "5"

attr(,"header")$endian
[1] "little"
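
For what it is worth, the six bytes printed under $MCOS above are plain ASCII codes. Decoding them in R gives the word "string", which suggests (this is an assumption, not something confirmed in this thread) that the value came back as an undecoded MATLAB object tagged with its class name rather than as text. A minimal check, with the bytes copied by hand from the output above:

```r
# Bytes copied from the $MCOS element printed above
mcos <- as.raw(c(0x73, 0x74, 0x72, 0x69, 0x6e, 0x67))

# Interpret the raw bytes as a character string
class_name <- rawToChar(mcos)
print(class_name)
```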

But there is no problem if the variable is a char array rather than a string array.

evaluate(matlab,"a=['a';'b'];")
a <- getVariable(matlab,'a')

you get:

> evaluate(matlab,"a=['a';'b'];")
> a <- getVariable(matlab,'a')
Warning message:
In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale
> a
$a
     [,1]
[1,] "a" 
[2,] "b" 

attr(,"header")
attr(,"header")$description
[1] "MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Tue Dec 29 11:34:24 2020                                        "

attr(,"header")$version
[1] "5"

attr(,"header")$endian
[1] "little"
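
A note on the readChar() warning that shows up in both runs: as far as I understand (an assumption based on the wording of the message, not on anything stated in this thread), readChar() emits it when asked to read characters in a multibyte locale that is not UTF-8, such as GBK, and then falls back to reading bytes. The sketch below reads from an in-memory connection with useBytes = TRUE, which requests byte-wise reading explicitly. It only illustrates the base R function and is not a change to R.matlab.

```r
# In-memory connection holding two ASCII bytes
con <- rawConnection(charToRaw("ab"))

# Read two bytes explicitly as bytes rather than as locale-dependent characters
x <- readChar(con, nchars = 2, useBytes = TRUE)
close(con)

print(x)
```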

HenrikBengtsson commented 3 years ago

Thanks. So, my MATLAB skills are super rusty - like from 2005-ish.

Let's focus on:

> evaluate(matlab,"a=['a';'b'];")
> a <- getVariable(matlab,'a')
Warning message:
In readChar(con = con, nchars = nbrOfBytes) :
  can only read in bytes in a non-UTF-8 MBCS locale

This warning comes from readMat() reading the results from MATLAB. It would be useful to have that as a MAT file. Can you create that a in MATLAB, and then save it in MAT v6 format, and make it available somewhere for download? Something like:

>> a=['a'; 'b']';
>>  save('issue49-a.mat', '-v6', 'a');

Blockhead-yj commented 3 years ago

issue49.zip

Blockhead-yj commented 3 years ago

Thank you for your kind reply!