mrkrd / matlab_wrapper

Easy to use MATLAB wrapper for Python
GNU General Public License v3.0
78 stars 23 forks source link

Conversion of a struct #18

Closed andrescodas closed 6 years ago

andrescodas commented 7 years ago

After converting a struct a numpy.recarray is returned. The shape of this array (in python) is empty if the the struct size is 1x1 (in matlab). Should this struct be converted to a record to avoid the confusions from the arrays?. Otherwise, I believe this array should have dimension (1,)

Thanks!

mrkrd commented 7 years ago

After converting a struct a numpy.recarray is returned. The shape of this array (in python) is empty if the the struct size is 1x1 (in matlab). Should this struct be converted to a record to avoid the confusions from the arrays.

What do you mean by "record"? Do you mean "record array" in contrast to "structured array"?

[1] https://docs.scipy.org/doc/numpy/user/basics.rec.html#record-arrays

Otherwise, I believe this array should have dimension (1,)

I cannot test it now, but could you confirm that it's due to

        out = out.squeeze()

on L.663 in matlab_session.py? Does squeeze() function squeeze all the dimensions from the struct array?

If that's the case, honestly I would prefer not to use squeeze nowadays, but what's more important, I wouldn't like to break the compatibility. So I think squeeze() should stay.

Another option is to add an optional parameter to MatlabSession.get(): squeeze=True.

Would it help in your opinion?

andrescodas commented 7 years ago

What do you mean by "record"? Do you mean "record array" in contrast to "structured array"?

I refer to the type "numpy.record"

I cannot test it now, but could you confirm that it's due to out = out.squeeze()

Yes, if out.shape == (1,1), then after out.squeeze() we get out.shape = ()

Does squeeze() function squeeze all the dimensions from the struct array?

Yes, as the example above.

The issue with squeeze is that you miss the 'rountrip' property. Roughly speaking say you have a variable 'x' in your matlab session with size(x) == [1,3,1] then

x = matlab.get('x') matlab.put('y',x)

will make y != x.

I agree that returning a numpy.record raises an issue of backward compatibility, but consider this example

matlab.eval("s = struct('num',1.1,'log',true,'str','hello friend')")

with the current implementation you get

s.shape == () s.size == 1 s[0] (error!) s[()].str == 'hello friend'

My point is that having s.str working will be more intuitive, but not generally compatible. So, in my opinion the best would be to return an array with shape (1,) or shape (1,1), so the user can access the elements with s[0] or s[0,0]

mrkrd commented 7 years ago

Hi Andres,

Sorry for late answer. ;-)

The issue with squeeze is that you miss the 'rountrip' property. Roughly speaking say you have a variable 'x' in your matlab session with size(x) == [1,3,1] then

x = matlab.get('x') matlab.put('y',x)

will make y != x.

I completely agree with that and would do it the way you suggest nowadays.

I agree that this raises an issue of backward compatibility […]

The backward compatibility is one of the most important properties of libraries. Please think about people (that we don't know) that are using it and their surprise after updating matlab_wrapper. A library update should not break a programs like that.

(Rich Hickey's talk about growing and not breaking software is quite a good one: https://www.youtube.com/watch?v=oyLBGkS5ICk )

My suggestions:

  1. For the current matlab_wrapper (Python 2): add an additional keyword in MatlabSession constructor that will prevent from squeezing of the arrays, e.g., MatlabSession(squeeze=False). You would have to write the extra parameter only once in your script. I will happily integrate patches for that.

  2. For the future matlab_wrapper with a different name (matlab_wrapper3 for Python 3): go with your suggestion of not squeezing the arrays in the first place.

Cheers, Marek