Open marcenacp opened 1 year ago
I think we should look into adding support for https://schema.org/VideoObject and plain binary files at some point too.
I had a look at some audio libraries; here are my thoughts. In short: I'm in favor of using `librosa`.
Things in common:
- `0.*` version.
- `soundfile` or `audioread`.

It looks to me that `librosa` and `pydub` are the two most used Python libraries for audio processing. `pydub` was last released in 2021, while `librosa` has been steadily updated. Given that `librosa` also has better documentation, I'd recommend using it.
I also had a look at the most popular audio datasets from Hugging Face and Papers With Code. They all use either FLAC or WAV audio formats. The only exception is Common Voice which uses MP3.
Hey, I notice that in #242, one of the attributes we look into is the bitrate. What do we do if there are multiple bitrates because there are multiple MP3 files?
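One possible way to approach the question above, offered only as a sketch and not as a decision made in this thread, is to record the bitrate per file and expose a summary instead of a single value. The helper name `summarize_bitrates` and its input (a mapping from file name to bitrate in bits per second, obtained by whatever audio library ends up being used) are hypothetical.

```python
from collections import Counter
from typing import Dict


def summarize_bitrates(bitrates_by_file: Dict[str, int]) -> dict:
    """Summarize per-file bitrates (bits per second) for a set of audio files.

    Hypothetical helper: the mapping is assumed to be produced elsewhere,
    for example while decoding each MP3 file.
    """
    if not bitrates_by_file:
        raise ValueError("no files given")
    # Count how many files use each distinct bitrate.
    counts = Counter(bitrates_by_file.values())
    return {
        "uniform": len(counts) == 1,  # True when every file shares one bitrate
        "min": min(counts),
        "max": max(counts),
        "files_per_bitrate": dict(sorted(counts.items())),
    }


summary = summarize_bitrates(
    {"a.mp3": 128_000, "b.mp3": 128_000, "c.mp3": 64_000}
)
```

With a summary like this, a single bitrate could still be reported when `uniform` is true, and a range or a per-file value otherwise.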
Proposal:
We propose to handle audio features using https://schema.org/AudioObject.
Technical strategy:
- `pass-mini`.
- `sc:AudioObject` in `_src/core/constants.py`.
- `_src/operation_graph/operations/field.py`. We have to choose a library to handle audio. We recommend choosing between `librosa`, `sounddevice`, and `pydub`. Before choosing the library, list the pros and cons of each candidate and publish them here to get validation from a maintainer.
- Known supported data types:

This can be split into several PRs.
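To make the strategy above more concrete, here is a minimal sketch of what decoding an audio field could return. It uses the standard library's `wave` module purely as a stand-in, since the thread has not yet settled on `librosa`, `sounddevice`, or `pydub`. The names `DecodedAudio` and `read_wav` are hypothetical and are not part of `_src/operation_graph/operations/field.py`.

```python
import struct
import wave
from dataclasses import dataclass
from typing import List


@dataclass
class DecodedAudio:
    """Hypothetical container for a decoded sc:AudioObject field."""

    samples: List[float]  # mono samples scaled to [-1.0, 1.0)
    sample_rate: int  # in Hz


def read_wav(path: str) -> DecodedAudio:
    """Decode a 16-bit mono PCM WAV file using only the standard library."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Each sample is a little-endian signed 16-bit integer.
    ints = struct.unpack(f"<{len(raw) // 2}h", raw)
    return DecodedAudio([i / 32768.0 for i in ints], sample_rate)
```

A real implementation would also need to cover FLAC and MP3, which `wave` cannot read; that is one reason a dedicated audio library is being evaluated.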