audeering / audformat

Format to store media files and annotations
https://audeering.github.io/audformat/

Feature support #321

Open frankenjoe opened 1 year ago

frankenjoe commented 1 year ago

We currently target annotation of audio and video files, but sometimes it might not be possible to keep the raw media. In that case it would be great to support other representations, e.g. spectrograms.

To add such support, we have to decide on a binary format that can store floating-point arrays and meta information such as feature name and sampling rate, and that comes with libraries for all common programming languages. HDF5 might be a good candidate, but we should also look at newer formats like Arrow or Parquet, which might provide faster access. It should also be possible to store several feature representations in the same file, as audformat supports only a single file column.
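As a rough illustration of the HDF5 option, here is a minimal sketch that stores several feature representations in one file, each with its own name and sampling rate as attributes. It assumes the third-party packages h5py and numpy; the helper names (`write_features`, `read_feature`, `feature_duration`), the attribute keys, and the "first axis is time" layout are hypothetical choices for this example, not anything audformat defines.

```python
import os
import tempfile

import h5py
import numpy as np


def write_features(path, features):
    """Store several feature arrays in one HDF5 file.

    `features` maps a feature name to a (values, sampling_rate) pair.
    Assumption: the first axis of `values` is the time (frame) axis.
    """
    with h5py.File(path, "w") as f:
        for name, (values, sampling_rate) in features.items():
            dataset = f.create_dataset(name, data=values.astype(np.float32))
            # Hypothetical attribute keys chosen for this sketch
            dataset.attrs["feature_name"] = name
            dataset.attrs["sampling_rate"] = sampling_rate


def read_feature(path, name):
    """Return the array and sampling rate of a single feature."""
    with h5py.File(path, "r") as f:
        dataset = f[name]
        return dataset[...], float(dataset.attrs["sampling_rate"])


def feature_duration(path, name):
    """Duration in seconds, derived from frame count and sampling rate."""
    with h5py.File(path, "r") as f:
        dataset = f[name]
        return dataset.shape[0] / float(dataset.attrs["sampling_rate"])


# Example: two representations of the same one-second recording
path = os.path.join(tempfile.mkdtemp(), "features.h5")
write_features(
    path,
    {
        "spectrogram": (np.random.rand(100, 64), 100),  # 100 frames per second
        "mfcc": (np.random.rand(50, 13), 50),  # 50 frames per second
    },
)
values, rate = read_feature(path, "mfcc")
print(values.shape, rate, feature_duration(path, "spectrogram"))
```

Keeping each representation in its own dataset means a single feature can be read without loading the others, which matters if the combined file grows large.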

hagenw commented 1 year ago

If we manage to store all metadata and all possible features inside the binary file (e.g. HDF5), we would need to make only minor changes to audformat. We only take the file format into account when calling audformat.Database.files_duration(), audformat.utils.duration(), audformat.utils.to_filewise_index() or audformat.utils.to_segmented_index().
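To make the "only duration needs to know the format" point concrete, here is a small sketch of a duration lookup that dispatches on the file extension. It is not how audformat implements these functions; the registry and the names `DURATION_READERS`, `register_duration_reader`, and `duration` are hypothetical, and it uses only the Python standard library, with WAV as the one built-in reader. A reader for a feature format would be registered the same way.

```python
import os
import tempfile
import wave

# Hypothetical registry mapping a lower-case file extension to a function
# that returns the duration in seconds. Not part of audformat.
DURATION_READERS = {}


def register_duration_reader(extension, reader):
    """Register a duration function for files with the given extension."""
    DURATION_READERS[extension.lower()] = reader


def wav_duration(path):
    """Duration of a WAV file from its frame count and frame rate."""
    with wave.open(path, "rb") as f:
        return f.getnframes() / f.getframerate()


def duration(path):
    """Look up the reader for the file extension and return the duration."""
    extension = os.path.splitext(path)[1].lower()
    try:
        reader = DURATION_READERS[extension]
    except KeyError:
        raise ValueError(
            f"No duration reader registered for '{extension}'"
        ) from None
    return reader(path)


register_duration_reader(".wav", wav_duration)

# Example: write one second of silence (8000 frames at 8000 Hz) and measure it
wav_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(wav_path, "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)  # 16-bit samples
    f.setframerate(8000)
    f.writeframes(b"\x00\x00" * 8000)
print(duration(wav_path))
```

With such a layout, supporting a feature file format would mean registering one more reader, while the rest of the code path stays untouched.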

It would indeed be great to choose a format that is also fast to read from (at least comparable to WAV), so that we can avoid caching the feature files as pickle in audb.

Storing all features in a single file has the disadvantage that individual files could become quite large, but at the moment I would also vote for it.

hagenw commented 1 year ago

BTW, audformat.utils.to_filewise_index() does not work with all file types supported by audformat anyway, as it uses audiofile.write(), which will not work for videos or MP3 files.