An HDF5 file format standard will minimize the memory footprint and the processing time for RT-DC data sets. It allows loading only the data that is needed at a given time (slicing of data on disk), which will result in a much more responsive user interface.
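To illustrate what slicing of data on disk means, here is a minimal sketch using h5py and NumPy. The dataset names (`area`, `image`), the frame size, and the helper functions are hypothetical and only serve the example; they are not a proposed layout.

```python
import os
import tempfile

import h5py
import numpy as np


def write_demo_file(path, n_events=1000):
    """Write a small demo file; the dataset names are hypothetical."""
    rng = np.random.default_rng(0)
    with h5py.File(path, "w") as h5:
        h5.create_dataset("area", data=rng.random(n_events))
        # One small grayscale frame per event, chunked per frame so a
        # single frame can be read without touching its neighbours.
        h5.create_dataset(
            "image",
            data=rng.integers(0, 255, size=(n_events, 8, 32), dtype=np.uint8),
            chunks=(1, 8, 32),
        )


def read_slice(path, name, start, stop):
    """Read only events [start, stop) of one dataset from disk."""
    with h5py.File(path, "r") as h5:
        # Indexing an h5py dataset reads just the requested region.
        return h5[name][start:stop]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.rtdc")
        write_demo_file(path)
        frames = read_slice(path, "image", 10, 20)
        print(frames.shape)
```

Only the ten requested frames are loaded into memory; the rest of the video stays on disk until it is needed.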
I propose the following workflow:
1. Set up a new repository for this file format
2. Define the structure of the HDF5-based ".rtdc" file format.
   - which data is stored and how it is stored (video, traces, columns)
   - define a standard for metadata (name, unit, value)
   - the file format must be compatible with point 5 (real-time writing)
3. Write an .rtdc file writer and reader
4. Write a .tdms to .rtdc conversion utility in Python. This utility can be used in ShapeOut to convert .tdms data sets as a transitional solution.
5. Provide an .rtdc writer in C for real-time data acquisition
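As a starting point for discussing the file structure, the following Python sketch shows one way that metadata (name, unit, value) and real-time writing could fit together in HDF5. The group names `meta` and `events`, the `unit` attribute, and both helper functions are assumptions made for illustration. The actual writer for data acquisition would be written in C, where the same HDF5 features (unlimited dimensions, chunking) are available.

```python
import os
import tempfile

import h5py
import numpy as np


def create_file(path, metadata):
    """Create an empty file with metadata and a growable column.

    `metadata` maps a name to a (value, unit) pair. The group and
    dataset names are hypothetical, not a proposed standard.
    """
    with h5py.File(path, "w") as h5:
        meta = h5.create_group("meta")
        for name, (value, unit) in metadata.items():
            entry = meta.create_dataset(name, data=value)
            entry.attrs["unit"] = unit
        events = h5.create_group("events")
        # maxshape=(None,) makes the first axis unlimited, so events can
        # be appended while the measurement is still running.
        events.create_dataset(
            "area", shape=(0,), maxshape=(None,), dtype="f8", chunks=(1024,)
        )


def append_events(path, name, values):
    """Append a batch of values to one column; return the new length."""
    values = np.asarray(values, dtype="f8")
    with h5py.File(path, "a") as h5:
        dset = h5["events"][name]
        old = dset.shape[0]
        dset.resize(old + values.shape[0], axis=0)
        dset[old:] = values
        return dset.shape[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.rtdc")
        create_file(path, {"flow_rate": (0.04, "uL/s")})
        append_events(path, "area", [35.2, 41.7])
        print(append_events(path, "area", [38.9]))
```

Keeping the unit as an attribute next to the value means a reader never has to guess units from the name alone.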
Speed (e.g. no waiting after pressing "Analyze" or anywhere else)
Drop the dependency on OpenCV (will fix issue #129)
Memory maps of videos/traces can simply be sliced for hierarchy children (will fix issue #100)
Only one file per measurement
Some refactoring required in dclab. In the end, we could drop support for tdms in dclab entirely and only rely on .rtdc, which would be cleaner.
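The point about slicing videos and traces for hierarchy children could look roughly like the sketch below. `ChildView` is a hypothetical helper, not existing dclab API, and the `image` dataset name is an assumption. The idea is that a child only stores the indices of the parent events that pass the filter and reads individual frames from disk on demand.

```python
import os
import tempfile

import h5py
import numpy as np


class ChildView:
    """Hypothetical lazy view of the parent events that pass a filter."""

    def __init__(self, dataset, keep):
        self.dataset = dataset
        # Sorted indices of the parent events kept by the filter.
        self.index = np.flatnonzero(keep)

    def __len__(self):
        return len(self.index)

    def __getitem__(self, i):
        # Map the child's event number to the parent's, then read
        # only that event from disk.
        return self.dataset[int(self.index[i])]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.rtdc")
        with h5py.File(path, "w") as h5:
            h5.create_dataset("image", data=np.zeros((5, 8, 32), dtype=np.uint8))
        with h5py.File(path, "r") as h5:
            keep = np.array([False, True, False, True, True])
            child = ChildView(h5["image"], keep)
            print(len(child), child[0].shape)
```

No video data is copied when the child is created, so the memory cost of a hierarchy child is one index array regardless of the video size.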