NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License

iteratively write large dataset #145

Closed: bendichter closed this issue 4 years ago

bendichter commented 5 years ago

Is it possible to write a very large block of data that does not fit in memory all at once?

lawrence-mbf commented 5 years ago

No. This is closely related to #109, which concerns editing existing files, and the obstacle to resolving it is the same: https://github.com/NeurodataWithoutBorders/matnwb/issues/109#issuecomment-461835994

MATLAB doesn't quite have the same language support for automatically streaming data from a file or a group of files, but we can do direct dataset writing, which I hope will alleviate the issue somewhat.
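For reference, here is a minimal sketch of what block-wise direct dataset writing looks like with MATLAB's built-in high-level HDF5 functions. The file name, dataset path, and sizes are illustrative assumptions, not matnwb API:

```matlab
% Sketch: write a dataset too large for memory in blocks, using MATLAB's
% built-in HDF5 functions. File/dataset names and sizes are assumptions.
filename  = 'example.nwb';             % assumed HDF5 file (created if absent)
dsetPath  = '/acquisition/ts/data';    % hypothetical dataset path
totalRows = 1e8;                       % full dataset extent (rows)
blockRows = 1e6;                       % rows written per iteration

% Create a chunked dataset so blocks can be written independently.
h5create(filename, dsetPath, [totalRows 1], ...
    'Datatype', 'double', 'ChunkSize', [blockRows 1]);

for startRow = 1:blockRows:totalRows
    nRows = min(blockRows, totalRows - startRow + 1);
    block = rand(nRows, 1);            % stand-in for data produced on the fly
    h5write(filename, dsetPath, block, [startRow 1], [nRows 1]);
end
```

Only one block is ever held in memory; the `ChunkSize` matches the block size so each write maps cleanly onto HDF5 chunks.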

To add dataset modification behavior, we must first allow data to be "pre-written" to a file regardless of whether the file exists yet. This functionality could be added to DataStub, letting the user add and modify data by index. On nwbExport() we would then leverage the default behavior of copying data from the cached dataset (which I believe does not load the entire set into memory).
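To make the proposed workflow concrete, here is a purely hypothetical sketch of index-based pre-writing through DataStub. None of these calls existed at the time; the indexed-assignment syntax is an assumption about the proposal, shown as comments only:

```matlab
% HYPOTHETICAL workflow for the proposed pre-write behavior; the indexed
% assignment on DataStub shown here did NOT exist at the time of writing.
% stub = nwb.acquisition.get('ts').data;  % DataStub backing an HDF5 dataset
% stub(1:1e6)     = firstBlock;           % stage a block by index (proposed)
% stub(1e6+1:2e6) = secondBlock;          % stage another block (proposed)
% nwbExport(nwb, 'example.nwb');          % would flush staged blocks one at a
%                                         % time, never loading the full set
```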

This would also be a good opportunity to fold in #50, and maybe chunking.

lawrence-mbf commented 4 years ago

With the inclusion of #179, writing a large block of data iteratively should be possible. That said, the API could be a little better and the workflow can certainly be improved. I will close this ticket in favor of the more accurate #184
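For readers landing here: #179 added types.untyped.DataPipe, which is the mechanism for iterative writes. Below is a sketch of the workflow, assuming the DataPipe parameters and append method roughly as documented in the matnwb tutorials of that era; check the exact signature against your matnwb version:

```matlab
% Sketch of iterative writing via types.untyped.DataPipe (added by #179).
% Parameter names follow the matnwb tutorials; treat exact signatures as
% assumptions and verify against your installed matnwb version.
nwb = NwbFile( ...
    'identifier', 'iterative-write-demo', ...
    'session_description', 'demo', ...
    'session_start_time', datetime());

totalRows  = 1e7;
firstBlock = rand(1e5, 1);

pipe = types.untyped.DataPipe( ...
    'maxSize',   [totalRows 1], ...  % final extent of the dataset
    'axis',      1, ...              % dimension along which data is appended
    'chunkSize', [1e5 1], ...        % HDF5 chunking for block-wise I/O
    'data',      firstBlock);        % initial block, written at export

ts = types.core.TimeSeries( ...
    'data', pipe, ...
    'data_unit', 'a.u.', ...
    'starting_time', 0.0, ...
    'starting_time_rate', 1000.0);
nwb.acquisition.set('ts', ts);
nwbExport(nwb, 'iterative_write.nwb');

% Append the remaining data block by block; each append writes to disk,
% so only one block is ever held in memory.
for i = 2:100
    pipe.append(rand(1e5, 1));
end
```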