Open Luyizhe opened 2 years ago
Hello, I want to try data augmentation, but I don’t have consistent features. From your paper "Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos", I see that you transform the 6373-dimensional features to 100 dimensions using an FC layer, but I can't obtain appropriate weights for that layer. Can you share the weights? Thank you!
We used openSMILE and then fed that to an FC network with 100-dim output. This FC network can be trained using your training dataset's labels. Alternatively you can use other audio features as shown here: https://github.com/soujanyaporia/MUStARD
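For anyone trying to reproduce this step, here is a minimal sketch of the idea described above: train a small fully connected classifier on the utterance-level openSMILE vectors using the training labels, then keep the 100-dim layer's activations as the reduced features. The thread does not specify the architecture, activation, or hyperparameters, so the single ReLU layer, the softmax classifier on top, the learning rate, and the function names `train_fc_reducer` and `reduce_features` are all assumptions made for illustration, not the authors' actual setup.

```python
import numpy as np

def train_fc_reducer(X, y, n_classes, hidden_dim=100, epochs=200, lr=0.01, seed=0):
    """Train input -> FC(hidden_dim) -> ReLU -> FC(n_classes) with softmax
    cross-entropy and full-batch gradient descent.
    X: (n_samples, n_features) float array, e.g. 6373-dim openSMILE vectors.
    y: (n_samples,) integer class labels in [0, n_classes).
    Returns the parameters needed to compute the hidden_dim features.
    All hyperparameters here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Standardize each input dimension; openSMILE functionals vary widely in scale.
    mean = X.mean(axis=0)
    std = X.std(axis=0) + 1e-8
    Xn = (X - mean) / std

    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden_dim), size=(hidden_dim, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets

    for _ in range(epochs):
        # Forward pass
        H = np.maximum(Xn @ W1 + b1, 0.0)
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)

        # Backward pass for the mean cross-entropy loss
        dlogits = (P - Y) / n
        dW2 = H.T @ dlogits
        db2 = dlogits.sum(axis=0)
        dH = dlogits @ W2.T
        dH[H <= 0.0] = 0.0  # ReLU gradient
        dW1 = Xn.T @ dH
        db1 = dH.sum(axis=0)

        # Gradient descent update
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2

    return {"mean": mean, "std": std, "W1": W1, "b1": b1}

def reduce_features(X, params):
    """Map raw features to the learned hidden_dim representation."""
    Xn = (X - params["mean"]) / params["std"]
    return np.maximum(Xn @ params["W1"] + params["b1"], 0.0)
```

With `X_train` of shape `(n_utterances, 6373)` and integer labels `y_train`, calling `params = train_fc_reducer(X_train, y_train, n_classes)` followed by `reduce_features(X_any, params)` would give `(n, 100)` features for any split. Because the reducer is fit on your own labels, the resulting weights are dataset-specific, which is likely why pretrained weights do not transfer directly.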
Hi, thanks for your clarification. Could you please share the scripts for the dimension-reduction process? I am trying to replicate the feature extraction but am having trouble with the FC network settings for dimension reduction.
BTW, may I know why the librosa features have a different size for each audio utterance? Thank you!
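A likely reason, assuming the features in question are frame-level (for example MFCCs): librosa computes one feature column per short-time frame, so the number of columns grows with the duration of the clip. A common way to get a fixed-size utterance vector is to pool over the time axis. The sketch below uses mean and standard deviation pooling; the helper name `pool_frames` and the choice of statistics are assumptions for illustration, not necessarily what the repository does.

```python
import numpy as np

def pool_frames(frames):
    """Collapse a (n_features, n_frames) matrix into a fixed-length vector
    by concatenating the per-feature mean and standard deviation over time.
    The input layout matches what frame-level extractors such as
    librosa.feature.mfcc return: one column per frame."""
    frames = np.asarray(frames, dtype=float)
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

# Two simulated utterances with the same 13 features but different durations.
short_utt = np.random.default_rng(0).normal(size=(13, 40))
long_utt = np.random.default_rng(1).normal(size=(13, 95))

short_vec = pool_frames(short_utt)
long_vec = pool_frames(long_utt)
```

Both `short_vec` and `long_vec` have length 26 regardless of how many frames each utterance had, so they can be stacked into one matrix and fed to a downstream model.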
Hello, can you share how you extracted the audio features in the work "Multi-level Multiple Attentions for Contextual Multimodal Sentiment Analysis"? I don't know how to extract 100-dimensional sentence-level audio features. Thank you!