HenrikBengtsson / R.matlab

R package: R.matlab
https://cran.r-project.org/package=R.matlab

writeMat / readMat: dot <-> underscore conversion is not bidirectional #36

Closed epiasini closed 8 years ago

epiasini commented 8 years ago

readMat automatically converts any underscores in variable names to dots, which is very helpful and convenient. It would be great if the reverse conversion could also happen when using writeMat.

HenrikBengtsson commented 8 years ago

You're asking for the R name foo.bar to be written as the MATLAB name foo_bar, correct?

Interestingly, MATLAB can read variables with dots / periods in their names. For example, in R:

> R.matlab::writeMat("foo.mat", a.b=1, c_d=2)

and then in MATLAB (7.14.0.739):

>> load foo                    
>> who

Your variables are:

a.b  c_d  

but I'm actually not sure how to access variable a.b in MATLAB (haven't used much MATLAB the last decade or so), e.g. we get:

>> a.b 
Undefined variable "a" or function "a.b".
>> isvarname('a.b')

ans =

     0

>> isvarname('a_b')

ans =

     1

Thus, I guess writeMat() shouldn't really generate such variable names (unless they're actually valid). If so, adding a fixNames=TRUE argument might be reasonable.
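To make "actually valid" concrete on the R side, here is a minimal sketch of a check that approximates MATLAB's isvarname(). The helper name is_matlab_varname is hypothetical and not part of R.matlab. It assumes the usual MATLAB rule (a letter first, then letters, digits, or underscores) and deliberately ignores reserved keywords and the maximum name length.

```r
# Hypothetical helper, not part of R.matlab: approximates MATLAB's isvarname().
# Assumptions: ASCII letters only; reserved keywords and the name-length
# limit are not checked.
is_matlab_varname <- function(names) {
  grepl("^[A-Za-z][A-Za-z0-9_]*$", names)
}

# The names from the example above: only 'a.b' is rejected.
is_matlab_varname(c("a.b", "a_b", "c_d"))
#> [1] FALSE  TRUE  TRUE
```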

BTW, readMat(..., fixNames=FALSE) prevents the renaming from underscores to dots, e.g.

> writeMat("foo.mat", a.b=1, c_d=2)
> readMat("foo.mat") ## Defaults to fixNames=TRUE
$a.b
[1] 1

$c.d
[1] 2

> readMat("foo.mat", fixNames=FALSE)
$a.b
[1] 1

$c_d
[1] 2

The reason fixNames=TRUE is the default is that, in the past, a_1 was equivalent to a <- 1 (back in the terminal days). Since then, the R parser has been updated to allow underscores in variable names, i.e. a_1 <- 2 is now perfectly valid and does what we expect. Because of this, maybe the new default for readMat() should be fixNames=FALSE.

epiasini commented 8 years ago

Thanks for the reply! Yes, that was what I meant. The period is still not among the characters allowed in MATLAB variable names, so yes, I suppose having a fixNames argument in writeMat could make sense.

Regarding the defaults, whatever makes more sense to you. Personally, I'd be partial to keeping the existing behaviour in readMat and mirroring it in writeMat, on the grounds that variable names with periods are omnipresent in R (and even recommended... this is you, right?) but are not supported in MATLAB. This would suggest defaulting to fixNames=TRUE in writeMat and, in turn, to the same value in readMat for symmetry. But I can also see how one could argue that, by default, the library shouldn't meddle with the variable names chosen by the user.

Anyway, thanks a lot again!

HenrikBengtsson commented 8 years ago

Please try with:

source('http://callr.org/install#HenrikBengtsson/R.matlab@develop')

I've added writeMat(..., fixNames=TRUE). This (new) default behavior makes sense since periods are invalid on the MATLAB side for variable names (and I assume also for field names). I won't worry about changing the current default for readMat(..., fixNames=TRUE) for now, although it could be FALSE these days.

Also, it's of course not possible to make mixed cases truly bijective, e.g. a_b.c. That will end up as either a_b_c or a.b.c for a writeMat() -> readMat() sequence.
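The loss of information in the mixed case can be illustrated with plain string substitution. This is only a sketch: the two helpers below are hypothetical stand-ins that assume the renaming amounts to a simple dot/underscore swap, and they are not R.matlab's actual implementation.

```r
# Illustration only: gsub() stands in for the renaming in each direction.
# Both helper names are hypothetical.
to_matlab <- function(names) gsub(".", "_", names, fixed = TRUE)  # writeMat(..., fixNames=TRUE) direction
to_r      <- function(names) gsub("_", ".", names, fixed = TRUE)  # readMat(..., fixNames=TRUE) direction

original  <- c("a.b", "c_d", "a_b.c")
written   <- to_matlab(original)  # "a_b" "c_d" "a_b_c"
read_back <- to_r(written)        # "a.b" "c.d" "a.b.c"

# Neither the written nor the re-read names match the mixed original.
identical(read_back, original)
#> [1] FALSE
```

Once both separators have been collapsed into one, there is no way to tell which underscores in a_b_c came from periods, so the original a_b.c cannot be recovered.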