Open pascalreinhold opened 1 year ago
Hi @pascalreinhold, thanks for opening the issue!
`matio-cpp` is a C++ interface to the `matio` library, which takes care of dealing with the `mat` file. When opening a `mat` file, `matio` loads its entire content in memory. Hence, reading and writing variables always access the same portion of memory. This can cause concurrency issues, and by extension `matio-cpp` is not thread-safe either.
If your goal is to speed up reading the file, I would suggest splitting it into separate files, or using a format that supports reading in chunks, like `hdf5` (see for example https://docs.hdfgroup.org/hdf5/v1_12/group___h5_d.html#gac1092a63b718ec949d6539590a914b60). Recent `mat` files are compatible with `hdf5`, but unfortunately `mat` files on their own do not support this option.
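
The "separate files" suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses plain `std::ifstream` as a stand-in for whatever reader opens each part, and the helper names `read_whole_file` and `read_files_concurrently` are hypothetical. Whether `matio` itself can safely be used from several threads, even with one handle per file, is not confirmed in this thread.

```cpp
#include <fstream>
#include <future>
#include <iterator>
#include <string>
#include <vector>

// Reads one whole file through a stream owned by the calling task only.
// Stand-in for a real per-file reader; this is not the matio API.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Launches one asynchronous task per file. No stream or buffer is shared
// between tasks, so the tasks do not need any locking among themselves.
std::vector<std::string> read_files_concurrently(
    const std::vector<std::string>& paths) {
    std::vector<std::future<std::string>> tasks;
    for (const auto& path : paths) {
        tasks.push_back(std::async(std::launch::async, read_whole_file, path));
    }
    std::vector<std::string> contents;
    for (auto& task : tasks) {
        contents.push_back(task.get());  // results keep the order of `paths`
    }
    return contents;
}
```

The point of the design is that each task owns its own handle, so nothing mutable is shared; the cost is that the data has to be split into separate files beforehand.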
Hey, thank you for the fast reply.

> Hence, when reading and writing variables [...] there are possible concurrency issues, and by extension, also `matio-cpp` is not thread-safe.

Does this also apply to me, because I'm just reading the file and not writing?

> When opening a `mat` file, `matio` loads its entire content in memory

Not sure, but I think you are mistaken. In `matio` there are the `Mat_VarReadInfo()` and `Mat_VarRead()` functions to avoid loading a variable into memory until you need it.

> > When opening a `mat` file, `matio` loads its entire content in memory
>
> Not sure, but I think you are mistaken. In `matio` there are the `Mat_VarReadInfo()` and `Mat_VarRead()` functions to avoid loading a variable into memory until you need it.

Both of those functions require opening the `mat` file first, i.e. loading it into memory. See:
Btw, `Mat_VarRead` is the exact function that `matio-cpp` uses to read a variable: https://github.com/ami-iit/matio-cpp/blob/a0daf0691d492b2ed50910ea984f97bc2f945b80/src/File.cpp#L322
Note that `Mat_VarRead` requires a non-const pointer to a `mat_t` object. This means that even a read can potentially modify this object. Hence, threads that only read could still end up reading and writing the same object concurrently. So, to answer your question,

> Does this also apply to me, because I'm just reading the file and not writing?

unfortunately, yes.
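
One conservative way to live with a read call that takes a non-const handle is to serialize all reads behind a single mutex. The sketch below is an assumption-laden illustration, not the `matio` or `matio-cpp` API: `StatefulHandle` is a hypothetical stand-in for a handle whose read mutates internal state, and `LockedReader` and `read_all_locked` are made-up names.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a file handle whose read call mutates internal
// state (for example a seek position). It is NOT the matio API.
struct StatefulHandle {
    std::size_t reads = 0;  // modified by every read, so read() is non-const

    std::string read(const std::string& name) {
        ++reads;  // would be a data race if called from two threads unlocked
        return "value of " + name;
    }
};

// Wraps the handle so that only one thread at a time can touch it.
class LockedReader {
public:
    std::string read(const std::string& name) {
        std::lock_guard<std::mutex> guard(mutex_);
        return handle_.read(name);
    }

    std::size_t reads() {
        std::lock_guard<std::mutex> guard(mutex_);
        return handle_.reads;
    }

private:
    std::mutex mutex_;
    StatefulHandle handle_;
};

// Reads every name from its own thread and returns the total read count.
std::size_t read_all_locked(const std::vector<std::string>& names) {
    LockedReader reader;
    std::vector<std::thread> workers;
    for (const auto& name : names) {
        workers.emplace_back([&reader, &name] { reader.read(name); });
    }
    for (auto& worker : workers) {
        worker.join();  // all threads finish before `reader` goes out of scope
    }
    return reader.reads();
}
```

Note that the lock makes the reads safe but also sequential, so this pattern by itself does not speed up loading; it only lets other per-thread work run in parallel around the reads.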
Hello there,
is it possible to read multiple large structs (1-3 GB) from a MAT-file concurrently? I found nothing on this GitHub page regarding thread safety.
If it is not supported out of the box, then how would one go about it?