tbeu / matio

MATLAB MAT File I/O Library
https://matio.sourceforge.io
BSD 2-Clause "Simplified" License
334 stars 97 forks source link

Overflow in Read5 #91

Closed papadop closed 6 years ago

papadop commented 6 years ago

I have been tracking an issue when reading a matlab matrix with matio. The matrix is a full_matrix of size 13446 by 45000. Here is the matvar content: gdb p *matvar $11 = {nbytes = 545592704, rank = 2, data_type = MAT_T_DOUBLE, data_size = 8, class_type = MAT_C_DOUBLE, isComplex = 0, isGlobal = 0, isLogical = 0, dims = 0x64cd20, name = 0x64cd40 "linop", data = 0x0, mem_conserve = 0, compression = MAT_COMPRESSION_ZLIB, internal = 0x643040} gdb> p matvar->dims[0]@2 $8 = {13446, 45000}

In Read5, we have : matvar->nbytes = lenmatvar->data_size; and len is computed as 1344645000.

The problem is that this number is bigger than 2^32 so it overflows, so that nbytes=545592704, which is 13446450008 modulo (2^32) instead of the proper value of 4840560000.

Silently having the overflow is bad. We should either emit an error or increase the size of the nbytes field. Note that matlab reads the matrix correctly, which may mean that such big matrices are allowed by matlab, which hints at correcting the bug by making nbytes a 64 bytes unsigned integer. If this is the proper correction, I can do it, but I'd first prefer getting an agreement that this is the proper solution....

papadop commented 6 years ago

Digging a little bit more, all sizes in matio are of type size_t which is correct. It is the locakl variable len whcih is an int. So the computation of len*matvar->data_size is made in 32 bits mode and then assigned to matvar->nbytes. Changing len to be of type size_t corrected the problem. I wonder whether nBytes should also be made of type size_t, but this is not mandatory for correcting my problem. See PR #92.