ECNU-Cross-Innovation-Lab / ShiftSER

[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
https://www.researchgate.net/publication/371101522
MIT License

Question about Validation #1

Open ASolitaryMan opened 10 months ago

ASolitaryMan commented 10 months ago

Hi guys, your work is very innovative and interesting. I read your code and found that there is no validation set. What is the reason for this?

Trert111 commented 9 months ago

Thank you for your interest in our work. The IEMOCAP dataset is the mainstream dataset for speech emotion recognition. Since this dataset is small and consists of 5 sessions, the common evaluation protocol is 5-fold cross-validation (or leave-one-session-out). For each fold, four sessions are used for training and the remaining session is used for testing. The final result is the average over the 5 folds. You can also use 3 sessions for training, 1 session for validation, and 1 session for testing. You can refer to this paper for other reasonable validation choices on IEMOCAP.
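
To make the two protocols above concrete, here is a minimal sketch of how the session-level folds could be generated. It is not taken from the ShiftSER code: the names `leave_one_session_out`, `cross_validate`, and `evaluate_fold` are hypothetical, and picking the "next" session as the validation session is just one assumed convention among several reasonable ones.

```python
from statistics import mean

# IEMOCAP has 5 sessions; they are identified here simply as 1..5.
SESSIONS = [1, 2, 3, 4, 5]


def leave_one_session_out(sessions=SESSIONS, use_validation=False):
    """Yield (train, val, test) session lists, one tuple per fold.

    With use_validation=False: 4 sessions train, 1 session test.
    With use_validation=True:  3 sessions train, 1 validation, 1 test.
    """
    n = len(sessions)
    for i, test in enumerate(sessions):
        rest = [s for s in sessions if s != test]
        if use_validation:
            # Assumption: use the session following the test session
            # (wrapping around) as the validation session.
            val = sessions[(i + 1) % n]
            train = [s for s in rest if s != val]
            yield train, [val], [test]
        else:
            yield rest, [], [test]


def cross_validate(evaluate_fold, use_validation=False):
    """Average the per-fold scores returned by evaluate_fold.

    evaluate_fold is a hypothetical callback that trains on the given
    sessions and returns a single test score (e.g. accuracy).
    """
    scores = [
        evaluate_fold(train, val, test)
        for train, val, test in leave_one_session_out(use_validation=use_validation)
    ]
    return mean(scores)
```

In practice `evaluate_fold` would train the model on the `train` sessions, optionally select a checkpoint on `val`, and report the metric on `test`; the reported number is then the mean across the 5 folds.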