Question about Validation

ECNU-Cross-Innovation-Lab / ShiftSER

[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations

MIT License


Thank you for your interest in our work. The IEMOCAP dataset is the mainstream dataset for speech emotion recognition. Since this dataset is small and consists of 5 sessions, the common evaluation protocol is 5-fold cross-validation (also called leave-one-session-out). For each fold, four sessions are used for training and the remaining session is used for testing. The final result is the average over the 5 folds. Alternatively, you can use 3 sessions for training, 1 session for validation, and 1 session for testing. You can refer to this paper for other reasonable validation choices on IEMOCAP.
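
For reference, here is a minimal sketch of how the session-level splits described above could be generated. It is not the code from this repository: the names `leave_one_session_out`, `cross_validate`, and `evaluate_fold` are hypothetical, and the rule for picking the validation session (the session that follows the test session, wrapping around) is only an assumption, since the reply does not say which session to hold out.

```python
from statistics import mean

SESSIONS = [1, 2, 3, 4, 5]  # IEMOCAP consists of 5 sessions


def leave_one_session_out(sessions=SESSIONS, use_validation=False):
    """Yield (train, val, test) session lists, one tuple per fold."""
    n = len(sessions)
    for i, test in enumerate(sessions):
        rest = [s for s in sessions if s != test]
        if use_validation:
            # Assumption: the session after the test session (cyclically)
            # is used for validation; the remaining 3 are used for training.
            val = sessions[(i + 1) % n]
            train = [s for s in rest if s != val]
            yield train, [val], [test]
        else:
            # Standard protocol: 4 sessions for training, 1 for testing.
            yield rest, [], [test]


def cross_validate(evaluate_fold, use_validation=False):
    """Average a per-fold score over all 5 folds.

    `evaluate_fold` is a hypothetical callback that trains on `train`,
    optionally tunes on `val`, and returns a score measured on `test`.
    """
    scores = [
        evaluate_fold(train, val, test)
        for train, val, test in leave_one_session_out(use_validation=use_validation)
    ]
    return mean(scores)
```

With `use_validation=False` each fold trains on four sessions and tests on the fifth. With `use_validation=True` each fold uses a 3/1/1 split. In both cases the reported number is the mean over the five folds.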

Question about Validation #1