blink1073 / oct2py

Run M Files from Python - GNU Octave to Python bridge
http://blink1073.github.io/oct2py/
MIT License

Serialize in memory #322

Open arnodelorme opened 3 weeks ago

arnodelorme commented 3 weeks ago

The fact that files are saved and loaded every time Octave is called is a big limitation for doing anything of consequence with Oct2py.

I have found this method to serialize in memory:

import io
import scipy.io

buffer = io.BytesIO()
scipy.io.savemat(buffer, data)  # data: dict mapping variable names to values
serialized_data = buffer.getvalue()
buffer.close()

Then serialized_data could be read in Octave with the following (apparently Octave can read from strings):

load('-mat-binary', serialized_data);

Would that work?
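
For reference, the Python half of this idea can be sketched as a round trip through an in-memory buffer. This is only a minimal sketch: the helper names `serialize_to_bytes` and `deserialize_from_bytes` are hypothetical, and it says nothing about whether Octave's `load` can consume the resulting bytes, which is the open question here.

```python
import io

import numpy as np
import scipy.io


def serialize_to_bytes(variables):
    """Write a dict of variables to MAT-file bytes without touching disk."""
    with io.BytesIO() as buffer:
        scipy.io.savemat(buffer, variables)
        return buffer.getvalue()


def deserialize_from_bytes(payload):
    """Read MAT-file bytes back into a dict of numpy arrays."""
    return scipy.io.loadmat(io.BytesIO(payload))


# Hypothetical example data: a small 2x3 float matrix.
example = {"x": np.arange(6, dtype=float).reshape(2, 3)}
payload = serialize_to_bytes(example)
restored = deserialize_from_bytes(payload)
```

Both `scipy.io.savemat` and `scipy.io.loadmat` accept open file-like objects, so no temporary file is needed on the Python side.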

blink1073 commented 3 weeks ago

Hi @arnodelorme, the reason we use files is that for large data we'd run into transmission issues over the pipe.

arnodelorme commented 3 weeks ago

Ah, OK, but the code above is more of a buffer than a pipe. It is subject to out-of-memory errors, though. Still, I think that for large arrays or structures, it could speed up read/write by up to 1000x compared to storing on an HDD (spinning disk).
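
A rough way to check that estimate on a given machine is to time both paths on the Python side. The sketch below is an assumption-laden micro-benchmark, not a measurement of Oct2py itself: the function name `time_roundtrips`, the scratch file name, and the array size are all hypothetical, and the operating system's page cache may hide most of the disk latency.

```python
import io
import os
import tempfile
import time

import numpy as np
import scipy.io


def time_roundtrips(variables, repeats=5):
    """Return the best-of-N seconds for in-memory and on-disk round trips."""
    timings = {"memory": float("inf"), "disk": float("inf")}
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "scratch.mat")  # hypothetical file name
        for _ in range(repeats):
            # Save to and load from a BytesIO buffer.
            start = time.perf_counter()
            buffer = io.BytesIO()
            scipy.io.savemat(buffer, variables)
            buffer.seek(0)
            scipy.io.loadmat(buffer)
            timings["memory"] = min(timings["memory"], time.perf_counter() - start)

            # Save to and load from a real file.
            start = time.perf_counter()
            scipy.io.savemat(path, variables)
            scipy.io.loadmat(path)
            timings["disk"] = min(timings["disk"], time.perf_counter() - start)
    return timings


timings = time_roundtrips({"x": np.random.rand(500, 500)})
print(timings)
```

The numbers only cover serialization in Python; the cost of getting the bytes into Octave would have to be measured separately.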

blink1073 commented 3 weeks ago

You are welcome to explore it; I no longer have the bandwidth to do anything more than light maintenance on oct2py.

arnodelorme commented 3 weeks ago

OK, thank you for at least taking the time to respond. I might do some tests and issue a pull request (would just be an option, not the default, which works quite well).

Another issue I have encountered is that data access through Oct2py differs from accessing the HDF5 data directly. Take an Octave structure such as var.('test')(3).('test2') = 1 (or var.test(3).test2 = 1). If I load the HDF5 file directly and convert mat structures to dictionaries, I get something similar in Python:

var['test'][2]['test2']

However, in Oct2py, I would need to use:

var['test']['test2'][2,0]

Is there a way to use Oct2py with the original scipy.io.matlab._mio5_params.mat_struct object instead of the Struct Oct2py object? I am maintaining a large open-source Octave-compatible MATLAB project (EEG analysis, 300k+ downloads) and would like Python users to be able to call the MATLAB functions. However, I would also like to have Python-optimized routines, and for Python users to access the data structures in a way that is natural in Python. So ideally, the Python data structures would not rely on external tools (only dictionaries, lists, and numpy arrays). They would look more like the first access pattern above than the second.
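
To make the two access patterns concrete, here is a minimal sketch that converts a field-first layout into an element-first list of dictionaries. It uses a plain dict of numpy arrays as a stand-in and does not touch Oct2py's actual Struct class; the helper name `fields_to_records` and the sample values are hypothetical.

```python
import numpy as np


def fields_to_records(fields):
    """Turn {'name': array} (one entry per struct element) into a list of dicts."""
    # Flatten each field and convert to native Python values.
    flat = {name: np.asarray(values).ravel().tolist() for name, values in fields.items()}
    lengths = {len(values) for values in flat.values()}
    if len(lengths) > 1:
        raise ValueError("all fields must hold the same number of elements")
    count = lengths.pop() if lengths else 0
    return [{name: values[i] for name, values in flat.items()} for i in range(count)]


# Stand-in for the field-first layout: one (3, 1) array per field.
field_first = {"test": {"test2": np.array([[0], [0], [1]])}}

# Element-first layout built only from dicts, lists, and native scalars.
var = {"test": fields_to_records(field_first["test"])}

print(field_first["test"]["test2"][2, 0])  # field first, then element index
print(var["test"][2]["test2"])             # element index first, then field
```

A converter like this could run on the Python side after each call, at the cost of copying the data once.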

BTW, of all of the methods I have tried to interface MATLAB with Python, Oct2py is the best. MATLAB code compilation into Python is pathetic (structure conversion is not supported, you can only pass on numerical arrays -- and the MATLAB runtime engine is a beast compared to the Octave one).

blink1073 commented 3 weeks ago

> I might do some tests and issue a pull request (would just be an option, not the default, which works quite well).

Yes, I agree making it an option would be ideal.

> Is there a way to use Oct2py with the original scipy.io.matlab._mio5_params.mat_struct object instead of the Struct Oct2py object?

I believe this could be made an option as well. The marshal/unmarshal logic was especially tricky, particularly around cells/structs.