Open farhi opened 4 years ago
Combining with MappedTensor can be great :+1:
What could be done:
- `tempname` is used in function `create_temp_file` (line 1493). One could add an input option 'Dir' that could be `/tmp` (default) or `/dev/shm`, the latter to store data in memory. This last choice is only relevant when coupled with the in-memory compression option below: working in `/dev/shm` without compression is just equivalent to normal Matlab memory management, in a very complex way :-1:
- Compress the data, as `zmat` (https://github.com/fangq/zmat) does. This would allow smaller files or, when stored in `/dev/shm`, working with in-memory compressed data. Only `LZ4` should be supported though, for performance reasons.
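A minimal sketch of the 'Dir' option, assuming a stand-alone helper: the name `create_temp_file_sketch` and its argument list are hypothetical and this is not the actual MappedTensor code. It relies only on the built-in `tempname(folder)` form. `/dev/shm` is normally only available on Linux.

```matlab
function filename = create_temp_file_sketch(nbytes, tmpdir)
% CREATE_TEMP_FILE_SKETCH  Hypothetical helper, not the MappedTensor code.
%   nbytes : size of the zero-filled file to create, in bytes
%   tmpdir : directory holding the file, e.g. '/tmp' or '/dev/shm'
if nargin < 2 || isempty(tmpdir)
    tmpdir = tempdir;                 % system default temporary directory
end
if exist(tmpdir, 'dir') ~= 7
    error('create_temp_file_sketch:noDir', 'Directory not found: %s', tmpdir);
end

filename = tempname(tmpdir);          % unique name inside the chosen directory
fid = fopen(filename, 'w');
if fid < 0
    error('create_temp_file_sketch:open', 'Cannot create %s', filename);
end

% Fill the file with zeros, 1 MiB at a time, to limit memory usage.
chunk = zeros(1, 2^20, 'uint8');
remaining = nbytes;
while remaining > 0
    n = min(remaining, numel(chunk));
    fwrite(fid, chunk(1:n), 'uint8');
    remaining = remaining - n;
end
fclose(fid);
end
```

It could be called as `f = create_temp_file_sketch(1e6, '/dev/shm');` to place the backing file in shared memory, or with a single argument to keep the default location.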
It could be efficient to use either lazy loading (see #193), or in-memory compression with a fast compressor, such as:
The latter works with Matlab 2017. Adapting it to the old MEX functions may be needed for old Matlab versions (e.g. 2010a).
A quick test:
Results are highly dependent on the initial data. Here we use a matrix with mostly zeros. Sparse storage would be a good solution as well.
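A comparison of this kind can be sketched as follows. It assumes `zmat` is compiled and on the Matlab path, and that its calling convention is `[z, info] = zmat(x, 1, method)` to compress and `zmat(z, info)` to restore, as described in its README. The matrix size and the three test cases are arbitrary choices for illustration.

```matlab
% Assumption: zmat (https://github.com/fangq/zmat) is installed and on the path.
n = 1000;

mostly_zeros = zeros(n);
mostly_zeros(1:997:end) = 1;          % a few non-zero entries

cases   = {'mostly zeros', mostly_zeros; ...
           'random',       rand(n); ...
           'magic',        magic(n)};
methods = {'zlib', 'lz4'};
ratio   = zeros(size(cases, 1), numel(methods));

for i = 1:size(cases, 1)
    x = cases{i, 2};
    for j = 1:numel(methods)
        tic;
        [z, info] = zmat(x, 1, methods{j});   % compress
        t = toc;
        y = zmat(z, info);                    % restore type and size
        assert(isequal(x, y));                % lossless round trip
        ratio(i, j) = 8 * numel(x) / numel(z);  % doubles are 8 bytes each
        fprintf('%-12s %-5s ratio %6.1f  time %.3f s\n', ...
                cases{i, 1}, methods{j}, ratio(i, j), t);
    end
end
```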
With random data, the compression ratio is very bad (around 1). With organised data (for instance `magic`), it is pretty good. In all cases, the `lz4` compressor is the fastest, by far.

This could be embedded into `estruct/findfield`. Its cached data can be used to identify large blocks and then compress them dynamically, as an alias or as a new compressed object that should implement the basic methods (`subsref`, `subsasgn`, ...).
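Such a compressed object could look roughly like the sketch below. The class name `CompressedArraySketch` is hypothetical, `zmat` is assumed to be on the path with the `[z, info] = zmat(x, 1, 'lz4')` / `zmat(z, info)` calling convention from its README, and only `()` indexing is handled. Each access decompresses the whole block, so this only illustrates the interface, not an efficient implementation. The class has to live in its own file, `CompressedArraySketch.m`.

```matlab
classdef CompressedArraySketch
    % Hypothetical wrapper keeping an array compressed in memory.
    % Limitation: 'end' inside an index is not overloaded here, so it
    % refers to the size of the wrapper object, not of the stored array.
    properties (Access = private)
        blob   % compressed bytes returned by zmat
        info   % zmat metadata (type, size) needed to restore the array
    end
    methods
        function obj = CompressedArraySketch(data)
            [obj.blob, obj.info] = zmat(data, 1, 'lz4');
        end

        function varargout = subsref(obj, s)
            % obj(i,j,...) : decompress, then index the plain array.
            if ~strcmp(s(1).type, '()')
                error('CompressedArraySketch:subsref', ...
                      'Only () indexing is supported in this sketch.');
            end
            data = zmat(obj.blob, obj.info);
            varargout = {subsref(data, s)};
        end

        function obj = subsasgn(obj, s, value)
            % obj(i,j,...) = value : decompress, assign, compress again.
            if ~strcmp(s(1).type, '()')
                error('CompressedArraySketch:subsasgn', ...
                      'Only () indexing is supported in this sketch.');
            end
            data = zmat(obj.blob, obj.info);
            data = subsasgn(data, s, value);
            [obj.blob, obj.info] = zmat(data, 1, 'lz4');
        end
    end
end
```

Usage would then be `c = CompressedArraySketch(magic(1000)); v = c(2, 3); c(2, 3) = 0;`.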