kaneplusplus / bigmemory


portable matrix file #87

Open mikejiang opened 6 years ago

mikejiang commented 6 years ago

Currently a matrix is saved as a file-backed matrix file plus a descriptor file. I assume the matrix file is written by Boost.Interprocess, which should be portable across languages as long as a language binding for the Interprocess library is available (e.g. https://github.com/ESSS/pyboost_ipc). But the descriptor file is saved as an rds file. Would it be more portable to save it as plain text, so that the on-disk matrix can be loaded into non-R environments? Also, how easy (or difficult) would it be to write an equivalent Python package to exchange the memory-mapped matrix file with bigmemory?

privefl commented 6 years ago

Currently, the default is to write the descriptor as a plain-text .desc file that basically contains the output of dput() on the descriptor. I don't know Python well, but I guess that shouldn't be too hard to parse.
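
As a rough illustration of how such a dput()-style file might be parsed, here is a minimal Python sketch. The sample text and its field names (sharedType, filename, totalRows, totalCols, type) are assumptions made up for the example, not a confirmed copy of what bigmemory writes, and parse_desc is a hypothetical helper. It only picks up scalar `name = value` pairs and skips nested values such as c(...) or list(...).

```python
import re

# Matches `name = "string"` or `name = number` (with an optional R integer suffix L).
FIELD = re.compile(
    r'(\w+)\s*=\s*(?:"([^"]*)"|(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)L?)'
)


def parse_desc(text):
    """Collect scalar fields from dput()-style text into a dict."""
    fields = {}
    for key, string_value, number_value in FIELD.findall(text):
        if number_value:
            # Treat anything with a decimal point or exponent as a float.
            is_float = any(c in number_value for c in ".eE")
            fields[key] = float(number_value) if is_float else int(number_value)
        else:
            fields[key] = string_value
    return fields


# Hypothetical descriptor text, for illustration only.
sample = '''new("big.matrix.descriptor", description = list(sharedType = "FileBacked",
    filename = "x.bin", totalRows = 2L, totalCols = 3L, type = "double"))'''

desc = parse_desc(sample)
```

A regular expression is enough for flat scalar fields; a real descriptor with row or column names would likely need a proper parser for R's c(...) vectors.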

mikejiang commented 6 years ago

Great. So how about the binary matrix file itself? Does it contain bigmemory-specific information, or is it simply the generic Boost.Interprocess format and thus readable on other platforms independently of the bigmemory R package?

privefl commented 6 years ago

The binary file containing the data is just nrow * ncol * sizeof(type) bytes.
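
Given that layout, reading the file from Python could look roughly like the sketch below, which uses only the standard library. Two things here are assumptions rather than facts stated in this thread: that elements are stored column-major (as R matrices are in memory), and the mapping from type names to element sizes in TYPE_CODES. read_element is a hypothetical helper, and the file is assumed to be read on a machine with the same byte order as the one that wrote it.

```python
import mmap
import os
import struct
import tempfile

# Assumed mapping from descriptor type names to struct format codes.
TYPE_CODES = {"double": "d", "float": "f", "integer": "i", "short": "h", "char": "b"}


def read_element(path, nrow, ncol, type_name, i, j):
    """Return element (i, j), 0-based, from a raw nrow x ncol matrix file."""
    fmt = "=" + TYPE_CODES[type_name]  # native byte order, no padding
    size = struct.calcsize(fmt)
    expected = nrow * ncol * size
    if os.path.getsize(path) != expected:
        raise ValueError(f"expected {expected} bytes, found {os.path.getsize(path)}")
    # Column-major assumption: a whole column is stored contiguously.
    offset = (j * nrow + i) * size
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from(fmt, mm, offset)[0]


# Demo: write a 2 x 3 matrix of doubles in column-major order, then read it back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("=6d", 1, 2, 3, 4, 5, 6))  # columns: (1,2), (3,4), (5,6)

value = read_element(tmp.name, 2, 3, "double", 1, 1)  # second row, second column
os.unlink(tmp.name)
```

The size check is the useful part: if nrow * ncol * sizeof(type) does not match the file size, either the dimensions or the assumed type is wrong.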

mikejiang commented 6 years ago

Thanks for the information!