QutEcoacoustics / audio-analysis

The audio analysis code (AnalysisPrograms.exe) for the QUT Ecoacoustics Research Group
https://ap.qut.ecoacoustics.info/
Apache License 2.0

Investigate binary encoding serialization for data files #82

Open atruskie opened 8 years ago

atruskie commented 8 years ago

Add a global option to enable binary encoding of output data.

In particular, CSV files can be very inefficient in terms of storage space.

This is probably best implemented as a wrapper around the current CSV serializer that intercepts serialization and deserialization calls and applies binary encoding when it is enabled.
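To make the wrapper idea concrete, here is a minimal sketch in Python (not the project's own language). Everything in it is an assumption for illustration: the `USE_BINARY` switch, the `MAGIC` marker, and the `serialize`/`deserialize` names are hypothetical, and the binary layout is a toy format, not HDF5 or any format the project has chosen.

```python
import csv
import io
import struct

# Hypothetical global switch (assumption): stands in for a program-wide option.
USE_BINARY = False

# Hypothetical 4-byte marker (assumption) that lets the reader detect binary data.
MAGIC = b"BIN1"


def serialize(rows, binary=None):
    """Encode a list of equal-length rows of floats as CSV text or as binary."""
    binary = USE_BINARY if binary is None else binary
    if not binary:
        # Existing behaviour: plain CSV.
        buf = io.StringIO()
        csv.writer(buf, lineterminator="\n").writerows(rows)
        return buf.getvalue().encode("utf-8")
    # Binary path: marker, row count, column count, then little-endian doubles.
    n_cols = len(rows[0]) if rows else 0
    header = MAGIC + struct.pack("<II", len(rows), n_cols)
    body = b"".join(struct.pack(f"<{n_cols}d", *row) for row in rows)
    return header + body


def deserialize(data):
    """Decode bytes produced by serialize(), detecting the encoding from the marker."""
    if data[:4] == MAGIC:
        n_rows, n_cols = struct.unpack_from("<II", data, 4)
        offset = 12  # 4 bytes of marker + two 4-byte unsigned integers
        rows = []
        for _ in range(n_rows):
            rows.append(list(struct.unpack_from(f"<{n_cols}d", data, offset)))
            offset += 8 * n_cols
        return rows
    # Fall back to CSV for anything without the marker.
    text = data.decode("utf-8")
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text))]
```

Callers only ever see `serialize` and `deserialize`, so the encoding can change behind one switch without touching the code that produces or consumes the data.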

It should also be possible to enable this via a global switch to the program.

Also investigate truncating precision of values (almost all of our data barely even needs single precision, let alone double).
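As a rough illustration of the storage trade-off, the Python sketch below compares the size of a few values stored as CSV text, as double precision, and as single precision, and measures the relative error introduced by truncating to single precision. The sample values are arbitrary and made up for the example; they are not real analysis output.

```python
import struct

# Arbitrary illustrative values (assumption), not real analysis output.
values = [0.123456789012345, 12.3456789012345, 0.000987654321098765]
n = len(values)

# Size as one CSV line, using Python's shortest round-trip float formatting.
csv_bytes = len(",".join(repr(v) for v in values).encode("utf-8"))

# Size as packed little-endian doubles (8 bytes each) and singles (4 bytes each).
double_bytes = len(struct.pack(f"<{n}d", *values))
single_bytes = len(struct.pack(f"<{n}f", *values))

# Round-trip through single precision to see how much accuracy is lost.
roundtrip = struct.unpack(f"<{n}f", struct.pack(f"<{n}f", *values))
max_rel_error = max(abs(a - b) / abs(a) for a, b in zip(values, roundtrip))

print(csv_bytes, double_bytes, single_bytes, max_rel_error)
```

Single precision halves the binary size relative to double precision, at the cost of keeping roughly seven significant decimal digits.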

Update: we're strongly leaning towards HDF5 as the chosen format because it has strong support in other sciences as well as on all the platforms we care about.

atruskie commented 5 years ago

The Arrow format has recently become viable and will be a lot nicer to work with than HDF5.