How to process the raw audio on IEMOCAP?

FriedaSmith commented 3 years ago

Hi. How to process the raw audio in IEMOCAP? Can you upload the code for processing IEMOCAP?

TideDancer commented 3 years ago

Hello, thanks for your interest.

No special audio processing is needed. Just keep the *.wav files from the dataset as they are; the code reads them with the 'soundfile' package.
Unfortunately, I arranged the files manually in several steps, so I don't have a script that automates this. It should not be hard to do, though:
(1) Once you have the dataset, you can find all wav files under /IEMOCAP_path/Session/sentences/wav/, and the emotion labels and transcripts under /IEMOCAP_path/Session/dialog/EmoEvaluations and transcriptions.
(2) Build 10 sets of csv files, as explained in this repo's README. In each file, put the wav path, the emotion label, and the transcript of each item. For example, iemocap_01F.test.csv contains the items in Ses01F*, and iemocap_01F.train.csv contains all the other items.
NOTE: (a) exc and hap are merged into a single label. (b) Only the hap+exc, sad, ang, and neu labels and their corresponding wavs are used.
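
The label merging and the 10-way split described above could be sketched roughly as follows. This is only an illustration, not the script used for the paper: the record layout (utterance ID, wav path, raw label, transcript), the csv column names, the use of "hap" as the name of the merged hap+exc label, and all function names are assumptions, and the actual csv format should be taken from the README. Collecting the records from the label and transcript files is left out.

```python
import csv
from pathlib import Path

# Labels kept after merging; "hap" as the merged hap+exc name is an assumption.
KEEP = {"hap", "sad", "ang", "neu"}
# One fold per session/prefix pair: 01F, 01M, ..., 05M (10 folds in total).
FOLDS = [f"{s:02d}{g}" for s in range(1, 6) for g in "FM"]


def normalize_label(label):
    """Merge 'exc' into 'hap'; return None for labels that are dropped."""
    label = "hap" if label == "exc" else label
    return label if label in KEEP else None


def filter_records(records):
    """Keep only the four target classes.

    Each record is a tuple (utt_id, wav_path, raw_label, transcript);
    this layout is hypothetical.
    """
    kept = []
    for utt_id, wav_path, raw_label, transcript in records:
        label = normalize_label(raw_label)
        if label is not None:
            kept.append((utt_id, wav_path, label, transcript))
    return kept


def split_fold(records, fold):
    """Test set: utterance IDs starting with 'Ses<fold>'; train set: the rest."""
    prefix = f"Ses{fold}"
    test = [r for r in records if r[0].startswith(prefix)]
    train = [r for r in records if not r[0].startswith(prefix)]
    return train, test


def write_csv(path, records):
    """Write wav path, label, and transcript (column names are hypothetical)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "emotion", "text"])
        for _, wav_path, label, transcript in records:
            writer.writerow([wav_path, label, transcript])


def build_all(records, out_dir):
    """Write iemocap_<fold>.train.csv and iemocap_<fold>.test.csv for each fold."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = filter_records(records)
    for fold in FOLDS:
        train, test = split_fold(records, fold)
        write_csv(out_dir / f"iemocap_{fold}.train.csv", train)
        write_csv(out_dir / f"iemocap_{fold}.test.csv", test)
```

Calling build_all(records, "csv_out") with a list of such records would produce 20 files, a train/test pair for each of the 10 folds.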

Let me know if this is clear and if you need any other help with this.

FriedaSmith commented 3 years ago

Thank you very much. Your paper has not been published yet. Could you upload it? It would give me a better understanding of your repository.

Speech Emotion Recognition with Multi-task Learning, X. Cai et al., INTERSPEECH 2021

TideDancer commented 3 years ago

Just uploaded it to ./paper/. Please let me know if there is anything else I can help with.

FriedaSmith commented 3 years ago

Thank you very much for your help

TideDancer / interspeech21_emotion