declare-lab / MELD

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation

Using pre-trained models with our own audio files. #12

Closed: vsakkas96 closed this issue 5 years ago

vsakkas96 commented 5 years ago

Hi, thanks for making this open source project.

I'm trying to use the pre-trained models you provide on my own audio files in order to extract emotion and sentiment labels, but baseline.py does not seem to provide a way to use my own files.

Moreover, baseline.py loads .pkl files rather than .wav or .mp4 files. How would I go about processing my own files and generating similar .pkl files so they can be used with the pre-trained models?

Thanks.
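
For context, here is how I am inspecting the provided pickles to see what shape of features the models expect; the file name below is just a placeholder for whichever .pkl baseline.py loads:

```python
# Minimal sketch for inspecting a provided feature pickle;
# "features.pkl" is a placeholder file name.
import pickle

import numpy as np

with open("features.pkl", "rb") as f:
    data = pickle.load(f)

# Print the container type and, if it is a dict, a few per-key array
# shapes, to see the feature dimensionality my own files must match.
print(type(data))
if isinstance(data, dict):
    for key, value in list(data.items())[:5]:
        print(key, np.asarray(value).shape)
```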

sanzgiri commented 5 years ago

I am also interested in doing this. Would you be willing to share the code that extracts the text embeddings (GloVe + dimensionality reduction) and the audio embeddings (openSMILE + dimensionality reduction)? Thanks!

devamanyu commented 5 years ago

Hi,

We encourage users to explore novel feature-extraction methods that fit our framework. To follow the basic process described in the paper, you can use open-source tools such as scikit-learn for feature selection and dimensionality reduction, and openSMILE for extracting audio functionals; rough sketches of both steps are below.
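
For the audio side, a minimal sketch of what this could look like follows. This is not the exact pipeline used for the released models: it uses the opensmile Python package (pip install opensmile) as a convenient wrapper around the openSMILE functional extractors, and the emobase feature set, the example file names, and the PCA dimensionality are all illustrative choices.

```python
# A minimal sketch of audio feature extraction, not the exact pipeline
# used for the released models. The emobase feature set, the file names,
# and the PCA size below are illustrative choices.
import pickle

import numpy as np
import opensmile
from sklearn.decomposition import PCA

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.emobase,
    feature_level=opensmile.FeatureLevel.Functionals,
)

# Hypothetical utterance-level .wav files from your own conversations.
wav_files = ["utt_0.wav", "utt_1.wav", "utt_2.wav"]

# process_file returns one row of functionals per file as a DataFrame.
features = np.vstack([smile.process_file(p).values for p in wav_files])

# Reduce the high-dimensional functionals to a smaller size; match this
# to the dimensionality the pre-trained model expects.
reduced = PCA(n_components=min(len(wav_files), 100)).fit_transform(features)

# Save in a .pkl file analogous to the ones baseline.py loads.
with open("my_audio_features.pkl", "wb") as f:
    pickle.dump(dict(zip(wav_files, reduced)), f)
```

The text side can be sketched under the same caveats: average the GloVe vectors of each utterance's words, then reduce the result with scikit-learn. The glove.6B.300d.txt path is a placeholder; download the vectors from the GloVe project page.

```python
# A sketch of utterance-level GloVe features, again not the exact
# pipeline used for the released models.
import pickle

import numpy as np
from sklearn.decomposition import PCA

def load_glove(path):
    """Parse a GloVe .txt file into a word -> vector dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

glove = load_glove("glove.6B.300d.txt")  # placeholder path
utterances = ["i am so happy today", "this is terrible"]  # your transcripts

def embed(utterance):
    """Average the GloVe vectors of the in-vocabulary words."""
    words = [glove[w] for w in utterance.lower().split() if w in glove]
    return np.mean(words, axis=0) if words else np.zeros(300, dtype=np.float32)

text_features = np.stack([embed(u) for u in utterances])

# Reduce and pickle, analogous to the audio sketch above.
reduced = PCA(n_components=min(len(utterances), 100)).fit_transform(text_features)
with open("my_text_features.pkl", "wb") as f:
    pickle.dump(reduced, f)
```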

Thanks!