An HDF5-based file format standard will minimize the memory footprint and the processing time for RT-DC data sets. It allows loading only the data that is needed at a given time (slicing of data on disk), which will result in a much more responsive user interface.
I propose the following workflow:
1. Set up a new repository for this file format
2. Define the structure of the HDF5-based ".rtdc" file format.
   - which data is stored and how it is stored (video, traces, columns)
   - define a standard for metadata (name, unit, value)
   - the file format must be compatible with point 5 (real-time writing)
3. Write an .rtdc file writer and reader
4. Write a .tdms to .rtdc conversion utility in Python. This utility can be used in ShapeOut to convert .tdms data sets as a transitional solution.
5. Provide an .rtdc writer in C for real-time data acquisition
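To make the discussion of the file structure more concrete, here is a minimal Python sketch using h5py. Everything in it is an assumption for illustration, not a proposed standard: the dataset paths `columns/area` and `video`, the attribute names, the units, and the helper names `create_rtdc` and `append_event` are all hypothetical. It only shows that resizable datasets allow event-by-event (real-time) writing and that metadata can be stored as attributes.

```python
import h5py
import numpy as np


def create_rtdc(path, frame_shape=(80, 250)):
    """Create an empty, appendable file with a hypothetical layout."""
    with h5py.File(path, "w") as h5:
        # File-level metadata stored as attributes (hypothetical keys)
        h5.attrs["flow_rate"] = 0.04
        h5.attrs["flow_rate_unit"] = "ul/s"
        # maxshape=None along axis 0 lets a dataset grow as events arrive
        area = h5.create_dataset("columns/area", shape=(0,),
                                 maxshape=(None,), dtype="f8",
                                 chunks=(1024,))
        area.attrs["unit"] = "um^2"
        h5.create_dataset("video", shape=(0,) + frame_shape,
                          maxshape=(None,) + frame_shape, dtype="u1",
                          chunks=(1,) + frame_shape)


def append_event(path, area, frame):
    """Append one event (scalar column value and image) to the file."""
    with h5py.File(path, "a") as h5:
        for name, value in (("columns/area", area), ("video", frame)):
            ds = h5[name]
            n = ds.shape[0]
            ds.resize(n + 1, axis=0)  # grow by one event
            ds[n] = value
```

A real-time writer would keep the file open instead of reopening it for every event; reopening is only done here to keep the sketch short.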
Advantages:
Speed (e.g. no waiting after pressing "Analyze", or anywhere else)
Drop the dependency on OpenCV (will fix issue #129)
Memory maps of videos/traces can simply be sliced for hierarchy children (will fix issue #100)
Only one file per measurement
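As a rough illustration of the slicing advantage, the following sketch reads only the video frames that belong to a filtered subset of events (such as a hierarchy child). It assumes the same hypothetical layout as before (`columns/area` and `video`); the function names `child_frames` and `write_demo_file` and the area threshold are made up for this example.

```python
import h5py
import numpy as np


def write_demo_file(path):
    """Write a tiny file with four events (hypothetical layout)."""
    rng = np.random.default_rng(0)
    with h5py.File(path, "w") as h5:
        h5.create_dataset("columns/area",
                          data=np.array([10.0, 55.0, 20.0, 80.0]))
        h5.create_dataset("video",
                          data=rng.integers(0, 255, size=(4, 8, 16),
                                            dtype=np.uint8))


def child_frames(path, min_area):
    """Return indices and frames of events with area >= min_area."""
    with h5py.File(path, "r") as h5:
        # The scalar column is small, so it is read completely
        area = h5["columns/area"][:]
        idx = np.flatnonzero(area >= min_area)
        # Only the selected frames are read from disk; h5py expects
        # the index list to be in increasing order, which
        # flatnonzero guarantees
        frames = h5["video"][idx.tolist()]
    return idx, frames
```

The full video never has to be held in memory; a child data set only needs to keep its index array.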
Disadvantages:
Some refactoring is required in dclab. In the end, we could drop .tdms support in dclab entirely and rely only on .rtdc, which would be cleaner.