audeering / opensmile

The Munich Open-Source Large-Scale Multimedia Feature Extractor
https://audeering.github.io/opensmile/

What is the valid output range of feature values? #9

Closed pinakin9526 closed 3 years ago

pinakin9526 commented 3 years ago

I used opensmile to extract features from audio with the "ComParE2016" feature set, which returned 6373 features. I converted them to a NumPy array and plotted them. Are these values correct, or are they outliers? I ask because some values are multiples of 10^13, while many others are just below 5000.

chausner-audeering commented 3 years ago

Hello pinakin9526,

the feature vector of ComParE2016 functionals consists of different kinds of features, each with its own value range. It is expected that some values get very large and others very small. For this reason, it is usually a good idea to normalize them before training a model.

If you want to plot features over time, you can either compute ComParE2016 functionals on smaller windows and then plot individual features over time, or extract ComParE2016 low-level descriptors which give you a feature sequence that you can plot.
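
As a rough illustration of the two options, here is a minimal sketch that assumes the `opensmile` Python package is being used (the thread does not say which interface was used). The helper name `extract_compare2016` and its `level` argument are made up for this example; the `Smile`, `FeatureSet.ComParE_2016`, `FeatureLevel` and `process_file` names come from the Python package.

```python
def extract_compare2016(path, level="functionals"):
    """Extract ComParE2016 features from one audio file.

    Hypothetical helper for illustration only.
    level="functionals" gives one row per file (6373 columns);
    level="lld" gives one row per frame, i.e. a sequence over time.
    """
    valid_levels = ("functionals", "lld")
    if level not in valid_levels:
        raise ValueError(f"level must be one of {valid_levels}, got {level!r}")

    # Imported lazily so the module loads without the package installed.
    # Assumption: the `opensmile` Python package (pip install opensmile).
    import opensmile

    feature_level = {
        "functionals": opensmile.FeatureLevel.Functionals,
        "lld": opensmile.FeatureLevel.LowLevelDescriptors,
    }[level]

    smile = opensmile.Smile(
        feature_set=opensmile.FeatureSet.ComParE_2016,
        feature_level=feature_level,
    )
    # Returns a pandas DataFrame indexed by (file, start, end).
    return smile.process_file(path)
```

With `level="lld"`, each column of the returned DataFrame is one descriptor over time and can be plotted against the `start` index level. With `level="functionals"`, the single row can be turned into a NumPy vector with `.to_numpy()`.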

pinakin9526 commented 3 years ago

Thanks @chausner-audeering, the problem is resolved now. I was looking at the features of a single audio file without normalization. The output was as follows:

[Image: plot of the 6373 features of a single audio file, without normalization]

But then I collected the feature vectors of multiple audio files into one array and normalized across all of those files. The plot of the 6373 features then looked as follows:

[Image: plot of the 6373 features after normalization across multiple audio files]

So each feature is normalized with respect to the same feature in the other files. In other words, we have to look at a particular feature across multiple files.

I actually ran into this confusion while designing the following pipeline:

audiofile.wav -> audio processing -> opensmile Feature extraction -> Normalization

So I got confused about the normalization step: how do I normalize the features of a single audio file?

But now I am using the following pipeline:

multiple audio files -> audio processing -> opensmile Feature extraction -> feature vectors of multiple audio files -> normalize
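
The normalization step of this second pipeline can be sketched as follows. This is only one possible choice (per-feature z-scoring with NumPy), since the thread does not say which normalization was used. The function name `zscore_across_files` and the synthetic data are made up for illustration.

```python
import numpy as np


def zscore_across_files(features, eps=1e-12):
    """Standardize each feature using statistics computed across files.

    features: array of shape (n_files, n_features), one row per audio file.
    Returns the normalized array plus the per-feature mean and std.
    """
    features = np.asarray(features, dtype=np.float64)
    if features.ndim != 2 or features.shape[0] < 2:
        # A single file has no spread to normalize against.
        raise ValueError("need a 2-D array with at least two files (rows)")

    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # Constant features have zero spread; map them to 0 instead of dividing by 0.
    safe_std = np.where(std < eps, 1.0, std)
    return (features - mean) / safe_std, mean, std


# Synthetic stand-in for the feature vectors of 5 files with 4 features
# on very different scales (real ComParE2016 vectors have 6373 features).
rng = np.random.default_rng(0)
scales = np.array([1e13, 5e3, 1.0, 1e-3])
X = rng.normal(size=(5, 4)) * scales

X_norm, mean, std = zscore_across_files(X)
```

The returned `mean` and `std` can be stored and reused to normalize the feature vector of a new single file, which is what makes the single-file case workable.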

Thanks again @chausner-audeering, I am closing this issue.