tbeu / matio

MATLAB MAT File I/O Library
https://matio.sourceforge.io
BSD 2-Clause "Simplified" License

Segfault in Read5 #40

Closed vadimkantorov closed 8 years ago

vadimkantorov commented 8 years ago

Hi, I'm reading a MATLAB file with the Torch wrapper for matio and getting a segfault at https://github.com/tbeu/matio/blob/master/src/mat5.c#L4331.

Program received signal SIGSEGV, Segmentation fault.
Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4331
4331                    fields[i]->internal->fp = mat;
(gdb) bt
#0  Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4331
#1  0x00007fffa425d3a0 in Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4332
#2  0x00007fffa425d3a0 in Read5 (mat=0x722ef0, matvar=<optimized out>) at mat5.c:4332
#3  0x00007fffa4268714 in Mat_VarRead (mat=0x722ef0, name=<optimized out>) at mat.c:1987
#4  0x000000000048ce29 in lj_vm_ffi_call ()
#5  0x000000000045d4e0 in lj_ccall_func ()
#6  0x000000000045f7b6 in lj_cf_ffi_meta___call ()
#7  0x000000000048ae6a in lj_BC_FUNCC ()
#8  0x000000000047a6dd in lua_pcall ()
#9  0x000000000041131f in pmain ()
#10 0x000000000048ae6a in lj_BC_FUNCC ()
#11 0x000000000047a757 in lua_cpcall ()
#12 0x000000000040f234 in main ()

The file I'm loading is a few hundred megabytes, but I could provide it if needed. Thanks!

tbeu commented 8 years ago

Was the MAT file created using MATLAB? Which version of matio did you use? In order to reproduce and debug the problem I'd need the MAT file causing the segfault. It would be nice if you could reduce its size to a minimum. Thanks for your report.

tbeu commented 8 years ago

Any file to share?

vadimkantorov commented 8 years ago

Sorry for the delay. You can get the file (350 MB) from my OneDrive: https://1drv.ms/u/s!Apx8USiTtrYmoJRncwpfATtnCZLvuA

Yes, the file was created with MATLAB. I tried reinstalling the latest matio, but I get the same error.

tbeu commented 8 years ago

Thanks. I can reproduce it and will investigate. Do you have some MCOS objects inside the struct? The binary stream looks like it contains some, but I did not find any when opening the file in MATLAB R14SP3.

vadimkantorov commented 8 years ago

Don't know what MCOS objects are. I produced this file running R2015a with code from this repo: https://github.com/hbilen/WSDDN/blob/master/scripts/prepare_wsddn.m#L18

The saved model reads fine if I run this code as is, and fails if I remove drop6 and drop7 from the list (the objects from the list get saved in the produced file in the end).

That's the code for producing this part of the object to be saved: https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m

tbeu commented 8 years ago

Yes, that is exactly what I mean by MCOS (MATLAB Class Object System). These class objects are currently not supported.

vadimkantorov commented 8 years ago

My understanding is that matconvnet's code converts the DagNN object to a plain struct before saving: https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m#L3

matio reads the saved file OK when it's saved with '-v6'.
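
For context on why '-v6' can behave differently from the default format: in a Level 5 MAT-file, a compressed variable is stored as an miCOMPRESSED element (type 15) wrapping the usual miMATRIX element (type 14), while '-v6' writes miMATRIX elements directly. The following is a minimal sketch in C, assuming a Level 5 file whose first bytes are already in memory. The function name first_element_type is made up for illustration and is not part of matio.

```c
#include <stddef.h>
#include <stdint.h>

/* Data type codes from the MAT-file Level 5 format. */
enum { MI_MATRIX = 14, MI_COMPRESSED = 15 };

/*
 * Hypothetical helper (not a matio function): returns the data type of the
 * first top-level element in a Level 5 MAT-file image, or -1 if the buffer
 * is too short or the endian indicator is not recognized.
 * Layout assumed: 128-byte header, endian indicator in bytes 126..127,
 * first element tag (4-byte type, 4-byte size) starting at offset 128.
 */
long first_element_type(const unsigned char *buf, size_t len)
{
    const unsigned char *t;
    uint32_t type;
    int little;

    if (buf == NULL || len < 132)
        return -1;

    /* The 16-bit value "MI" appears byte-swapped when the writer was little-endian. */
    if (buf[126] == 'I' && buf[127] == 'M')
        little = 1;
    else if (buf[126] == 'M' && buf[127] == 'I')
        little = 0;
    else
        return -1;

    t = buf + 128;
    if (little)
        type = (uint32_t)t[0] | ((uint32_t)t[1] << 8) |
               ((uint32_t)t[2] << 16) | ((uint32_t)t[3] << 24);
    else
        type = (uint32_t)t[3] | ((uint32_t)t[2] << 8) |
               ((uint32_t)t[1] << 16) | ((uint32_t)t[0] << 24);

    return (long)type;
}
```

A result of MI_COMPRESSED would suggest the reader has to inflate the element first, which is the extra code path that '-v6' files do not exercise.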

tbeu commented 8 years ago

But

$ ./matdump -f whos bug_matiov6.mat
Name                       Size           Bytes          Class

net                        1x1           515410578  mxSTRUCT_CLASS
stats                      0x0                   0  mxDOUBLE_CLASS
                           1x88                 88  mxUINT8_CLASS

shows a strange trailing, unnamed 88-byte variable for the MCOS.

tbeu commented 8 years ago

Anyway, I will investigate further. I have done this a few times before. It is a tedious byte-by-byte comparison of the zlib-inflated v7 MAT-file against the v6 MAT-file, which usually hints at where matio's decompression is faulty.
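
The core of such a comparison can be sketched as follows. This is only an illustration, not the tooling actually used here: it assumes both byte streams are already in memory (with the v7 data already inflated), and the names first_mismatch and NO_MISMATCH are hypothetical.

```c
#include <stddef.h>

/* Sentinel returned when the two buffers are identical. */
#define NO_MISMATCH ((size_t)-1)

/*
 * Returns the offset of the first byte at which the two buffers differ.
 * If one buffer is a strict prefix of the other, the length of the shorter
 * buffer is returned. If both are identical, NO_MISMATCH is returned.
 */
size_t first_mismatch(const unsigned char *a, size_t a_len,
                      const unsigned char *b, size_t b_len)
{
    size_t n = a_len < b_len ? a_len : b_len;
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return i;
    }
    return a_len == b_len ? NO_MISMATCH : n;
}
```

The returned offset only says where the streams start to diverge; mapping it back to a particular variable or field still requires walking the element tags by hand.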

vadimkantorov commented 8 years ago

Would you like me to upload the working v6 file? (though I'm not sure it would be bitwise equivalent, the weights in the model could be slightly different)

tbeu commented 8 years ago

Thanks. I already created it myself using old MATLAB R14SP3.

load bug_matio.mat
save bug_matiov6.mat net stats -v6

tbeu commented 8 years ago

The error (or at least one of them) is reproducible if only net.layers is saved, which reduces the file size significantly.

vadimkantorov commented 8 years ago

I guess this loop is to blame: https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m#L40

Yet it goes fine with -v6...

tbeu commented 8 years ago

Oops, there is class(block). Not sure if this is supported. Does not look like plain vanilla.

vadimkantorov commented 8 years ago

"ClassName = class(object) returns a string specifying the class of object." (http://fr.mathworks.com/help/matlab/ref/class.html)

And it's indeed a byte vector when I read the file saved with -v6...

tbeu commented 8 years ago

Right. Nothing to blame.

tbeu commented 8 years ago

net.layers(1,22).block.levels is strange. What type is it?

vadimkantorov commented 8 years ago

It should be just an integer.

tbeu commented 8 years ago

Well, it looks like an empty field.

vadimkantorov commented 8 years ago

When I load the buggy file, it's somehow a gpuArray:

class(a.net.layers(1, 22).block.levels)

ans =

gpuArray

tbeu commented 8 years ago

Yes, confirmed. Where is it saved (in m source)?

tbeu commented 8 years ago

Both the v6 and the v7 MAT-file have MCOS ... gpuArray in the binary stream there.

vadimkantorov commented 8 years ago

Yep, what's happening is that the model contains a custom layer (an MCOS object) that gets serialized at the block.save() call https://github.com/vlfeat/matconvnet/blob/91399d47fcfcd06836f30ee3d88fcc7116ae40e0/matlab/%2Bdagnn/%40DagNN/saveobj.m#L42

What's surprising is that if I keep the drop6 and drop7 objects in the graph, then both the v6 and v7 versions produce a readable file.

tbeu commented 8 years ago

At least it should not crash on such data. Going to try to handle this exception.
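
The backtrace above shows the crash at `fields[i]->internal->fp = mat;`, which is what one would see if a field pointer, or its `internal` member, were NULL for an element the reader could not parse. One defensive pattern is sketched below. The struct definitions and the function attach_reader are hypothetical stand-ins, not matio's real types or its actual fix.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not matio's definitions. */
struct reader { int unused; };
struct var_internal { struct reader *fp; };
struct var { struct var_internal *internal; };

/*
 * Attaches the reader to every field that was successfully parsed and
 * skips entries that are NULL or have no internal data, instead of
 * dereferencing them. Returns the number of fields that were attached.
 */
size_t attach_reader(struct var **fields, size_t nfields, struct reader *r)
{
    size_t attached = 0;
    size_t i;

    if (fields == NULL)
        return 0;

    for (i = 0; i < nfields; i++) {
        if (fields[i] == NULL || fields[i]->internal == NULL)
            continue; /* unsupported or unparsed element: skip it */
        fields[i]->internal->fp = r;
        attached++;
    }
    return attached;
}
```

Skipping the entry trades completeness for robustness: the caller gets a struct with a missing field rather than a crash, so it should be prepared to see fewer fields than the file declares.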

vadimkantorov commented 8 years ago

Thanks!

By the way, can this GitHub project be considered the official matio home?

tbeu commented 8 years ago

This is my mirror of the official matio from sf.net. Once I file a new release I push to the sf.net repo and prepare the release there.

tbeu commented 8 years ago

Could you test whether 3b6dc8f works for you? The unknown MCOS class is now skipped.

vadimkantorov commented 8 years ago

I'll check it later tonight!

vadimkantorov commented 8 years ago

The file doesn't break anymore, LGTM.

tbeu commented 8 years ago

Since reading works, I am going to close this issue and move the existing writing problems to #47.