Open frankenjoe opened 1 year ago
If we manage to store all metadata and all possible features inside the binary file (e.g. HDF5), we would need to make only minor changes to audformat. We only take the file format into account when calling audformat.Database.files_duration(), audformat.utils.duration(), audformat.utils.to_filewise_index(), or audformat.utils.to_segmented_index().
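Since duration is where the file format matters, here is a minimal sketch of how a duration could be derived for a feature file from its frame count and the sampling rate stored as metadata. The helper `feature_duration` is hypothetical and not part of audformat, and it ignores the analysis window length, so the result is only an approximation.

```python
def feature_duration(num_frames: int, sampling_rate: float) -> float:
    """Approximate duration in seconds of a frame-based feature array.

    Hypothetical helper, not part of audformat.
    ``sampling_rate`` is the number of feature frames per second.
    """
    if num_frames < 0:
        raise ValueError("num_frames must not be negative")
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_frames / sampling_rate


# 250 frames at 100 frames per second
print(feature_duration(250, 100.0))
```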
It would indeed be great to choose a format that is also fast to read from (at least comparable to WAV), as we could then avoid caching the feature files as pickle in audb.
Storing all features in a single file has the disadvantage that individual files could become quite large, but at the moment I would still vote for it.
BTW, audformat.utils.to_filewise_index() does not work with all files supported by audformat anyway, as it uses audiofile.write(), which will not work for videos or MP3 files.
We currently target annotation of audio and video files, but sometimes it might not be possible to keep the raw media. In that case it would be great to support other representations, e.g. spectrograms.
To add such support we have to decide on a binary format that is able to store floating point arrays and meta information like feature name and sampling rate, and comes with libraries for all common programming languages. HDF5 might be a good candidate, but we should also have a look at newer formats like Arrow or Parquet, which might provide faster access. It should also be possible to store several feature representations in the same file, as audformat supports only a single file column.
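To make the HDF5 option concrete, here is a minimal sketch using h5py and NumPy that stores two feature representations in one file, each with its own metadata attached as attributes. The helper names `write_features` and `read_feature`, the dataset names, and the attribute keys are assumptions chosen for illustration; none of this is existing audformat or audb functionality.

```python
import os
import tempfile

import h5py
import numpy as np


def write_features(path, features):
    """Store several named feature arrays with metadata in one HDF5 file.

    ``features`` maps a dataset name to a ``(values, meta)`` tuple
    (hypothetical layout, chosen for this sketch).
    """
    with h5py.File(path, "w") as f:
        for name, (values, meta) in features.items():
            dset = f.create_dataset(name, data=values)
            # Meta information is attached as HDF5 attributes
            for key, value in meta.items():
                dset.attrs[key] = value


def read_feature(path, name):
    """Read one feature array and its metadata back from the file."""
    with h5py.File(path, "r") as f:
        dset = f[name]
        values = dset[...]  # load the full array into memory
        meta = dict(dset.attrs.items())
    return values, meta


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file1.h5")
    write_features(
        path,
        {
            "spectrogram": (
                np.zeros((100, 64), dtype=np.float32),
                {"feature_name": "log-mel", "sampling_rate": 100},
            ),
            "energy": (
                np.ones((100, 1), dtype=np.float32),
                {"feature_name": "rms", "sampling_rate": 100},
            ),
        },
    )
    values, meta = read_feature(path, "spectrogram")
    print(values.shape, values.dtype, dict(meta))
```

With this layout the single file column in audformat would point at one file per media item, and the representation would be selected by dataset name when reading. Whether read speed is comparable to WAV would still need to be measured.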