How to use other data sets to train the model

Sreyan88 / MMER

Code for the InterSpeech 2023 paper: MMER: Multimodal Multi-task learning for Speech Emotion Recognition

How to use other data sets to train the model #10

Closed zeyan-liu closed 3 months ago

zeyan-liu commented 5 months ago

Dear author, hello! While using the code, I found that it relies entirely on the downloaded, pre-extracted features. I would like to train the model on other datasets, but I don't know how to extract the features (such as tokenized_word, bert_text, bert_output, batch_label, roberta). How can I extract these features from another dataset?

Sreyan88 commented 5 months ago

Thank you for your interest!

I think only the RoBERTa embeddings are required! You can easily generate them using Hugging Face! Just pass the text through RoBERTa and save the embeddings of the last layer! Let me know if you face any problems or if you require a script for the same!
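
For readers following the suggestion above, here is a minimal sketch of passing text through RoBERTa with the Hugging Face transformers library and saving the last-layer embeddings. It is not the repository's own script. The checkpoint name `roberta-base`, the output path, the `.pt` format, and the function names `extract_last_layer` and `save_roberta_embeddings` are all assumptions made for illustration; the thread does not say which checkpoint or file layout the MMER data loaders expect, so check that against the downloaded features before training.

```python
import torch
from transformers import AutoModel, AutoTokenizer


def extract_last_layer(texts, tokenizer, model, device="cpu"):
    """Return one (num_tokens, hidden_size) tensor per input text."""
    model = model.to(device).eval()
    # Pad to the longest text in the batch; truncate to the model's max length.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    batch = {k: v.to(device) for k, v in batch.items()}
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_size)
    # Drop padding positions so each tensor covers only that text's real tokens.
    mask = batch["attention_mask"].bool()
    return [hidden[i, mask[i]].cpu() for i in range(hidden.size(0))]


def save_roberta_embeddings(texts, out_path, model_name="roberta-base"):
    """Hypothetical helper: download the checkpoint, embed the texts, save to disk."""
    # Assumption: "roberta-base" stands in for whichever checkpoint the repo used.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    embeddings = extract_last_layer(texts, tokenizer, model)
    torch.save(embeddings, out_path)  # a list of tensors, one per utterance
    return embeddings
```

A call such as `save_roberta_embeddings(["I am so happy today!", "Why would you say that?"], "roberta_embeddings.pt")` would write one tensor per utterance. Keeping per-token vectors rather than a pooled sentence vector is a guess at what the model consumes; pool them (for example, take the mean over tokens) if the downloaded features turn out to be one vector per utterance.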

Sreyan88 commented 3 months ago

Closing the issue due to inactivity! Please also see issue #11 and feel free to reopen if any help is required.