gitter-lab / SINGE

Gene regulatory network reconstruction from pseudotemporal single-cell gene expression data
MIT License
11 stars 6 forks source link

Support input mat files from Octave #59

Closed agitter closed 3 years ago

agitter commented 4 years ago

Copying the original report from https://github.com/Murali-group/Beeline/issues/33#issuecomment-674139800:

At this line https://github.com/gitter-lab/SINGE/blob/master/code/iLasso_for_SINGE.m#L56 , values of m.fullKp are written into a copy of the input mat file before a variable declaration for fullKp, and MATLAB should infer that fullKp is a struct array. This works as expected when using version 7.3 input mat files written by MATLAB, which support partial writes of each value without loading fullKp into memory https://www.mathworks.com/help/matlab/import_export/load-parts-of-variables-from-mat-files.html . As I understand it, the issue I ran into when using a version 7 input mat file written by Octave, which does not support writing version 7.3 mat files, is that MATLAB first loads the fullKp variable into memory, and because that variable hasn't been declared yet MATLAB initializes it, but at that point MATLAB fails to correctly infer that the type should be a struct array. The actual error I get is that the Kp2 struct cannot be assigned to a value in m.fullKp, that full error message copied below:

Warning: The file '/usr/local/SINGE/TempMat_5.mat' was saved in a format that does not support partial loading. Temporarily loading variable 'fullKp' into memory. To use partial loading efficiently, save MAT-files with the -v7.3 flag.

Error using iLasso_for_SINGE (line 56)
Conversion to double from struct is not possible.

The workaround I implemented in that commit was to initialize the fullKp variable as a struct array in the input mat file, which appears to work but obviously it is not ideal to initialize a specific internal variable that could change in another SINGE version. I think this might be addressed in SINGE to support a version 7 mat file written by Octave by either initializing fullKp as a struct array before that write, loading and saving the input file as a version 7.3 mat file instead of directly copying to a temporary file when parsing the input, or otherwise writing that variable somewhere other than the mat file.

agitter commented 4 years ago

I was able to reproduce this error. I loaded data1/X_SCODE_data.mat into MATLAB and saved it in v7 format using save('X_SCODE_data_v7.mat', '-v7'). Running SINGE v0.4.1 gives the same error reported above:

Warning: The file '/SINGE/TempMat_0.mat' was saved in a format that does not support partial loading. Temporarily loading variable 'fullKp' into memory. To use partial loading efficiently, save MAT-files with the -v7.3 flag.
> In matlab.io.MatFile/inefficientPartialSave (line 157)
  In matlab.io.MatFile/subsasgn (line 527)
  In iLasso_for_SINGE (line 56)
  In run_iLasso_row (line 27)
  In SINGE_GLG_Test (line 62)
Error using iLasso_for_SINGE (line 56)
Conversion to double from struct is not possible.

Error in run_iLasso_row (line 27)

Error in SINGE_GLG_Test (line 62)

MATLAB:invalidConversion

Here is the input file we can use for testing: X_SCODE_data_v7.zip It's in a zip format because GitHub doesn't support .mat attachments.