1) HDF5 is good for fast sequential access.
2) memmap is basically as fast as HDF5 and has better support for thread-safe simultaneous writing, but it is less widely used than HDF5 (which is language independent) and might not handle every datatype (e.g. JSON).
3) databases: we want a fast database with random access. MongoDB might be good enough for this, and Postgres is another option. There are also embedded key-value stores:
-- LevelDB
-- LMDB
In short, we want fast random access to the stored data, so that generating permutations does not require copying the datasets.
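
As a minimal sketch of the random-access idea, assuming fixed-size float64 records in a flat binary file (the record layout and the helper names write_records, read_permuted, and demo are hypothetical, not from these notes), Python's standard mmap module can serve records in a permuted order by shuffling indices rather than data:

```python
import mmap
import os
import random
import struct
import tempfile

# Assumed record layout: one little-endian float64 per record.
RECORD = struct.Struct("<d")


def write_records(path, values):
    """Write values to path as fixed-size binary records."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def read_permuted(path, seed):
    """Return all records in a shuffled order, reading each one by offset."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n = len(mm) // RECORD.size
        order = list(range(n))
        # Permute the indices only; the file on disk is never rewritten.
        random.Random(seed).shuffle(order)
        return [RECORD.unpack_from(mm, i * RECORD.size)[0] for i in order]


def demo():
    """Build a small temporary dataset and read it under two permutations."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, [float(i) for i in range(10)])
        return read_permuted(path, seed=0), read_permuted(path, seed=1)
    finally:
        os.remove(path)
```

Each call to read_permuted reuses the same file, so any number of permutations can be drawn from a single stored copy. A key-value store such as LMDB or LevelDB would presumably play the same role for records that are not fixed-size.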