HappyColor / SpeechFormer

Official implementation of SpeechFormer written in Python (PyTorch).

Audio Segmentation: Rules and Procedure #4

Closed by kingformatty 1 year ago

kingformatty commented 1 year ago

Hi, thanks for sharing the codebase.

For DAIC-WOZ, the example metadata CSV file has a mid-tag in each audio name, which I assume marks different segments. Would it be possible to share the audio segmentation scripts or procedure? For reproduction purposes, the statistics of the segments fed into the model are really crucial.

HappyColor commented 1 year ago

For DAIC-WOZ, XXX_sY_AUDIO.wav denotes the Y-th utterance of participant XXX. I'm sorry, but I did the data processing a long time ago and can no longer find the corresponding code. To make up for this, I have documented in the ./metadata directory how each .csv file is generated and what it means.
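
For anyone rebuilding segment statistics from the metadata, the naming convention described above can be parsed with a short helper. This is only a sketch, not the author's original processing code: the function and class names are hypothetical, the example file names are made up, and it assumes that both XXX and Y are plain decimal digits.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple

# Matches names of the form XXX_sY_AUDIO.wav.
# Assumption: participant ID (XXX) and utterance index (Y) are decimal digits.
SEGMENT_NAME = re.compile(r"^(?P<participant>\d+)_s(?P<utterance>\d+)_AUDIO\.wav$")


class Segment(NamedTuple):
    participant: str  # kept as a string so any leading zeros survive
    utterance: int


def parse_segment_name(name: str) -> Segment:
    """Split a segment file name into participant ID and utterance index."""
    match = SEGMENT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a segment file name: {name!r}")
    return Segment(match.group("participant"), int(match.group("utterance")))


def utterances_per_participant(names: Iterable[str]) -> Counter:
    """Count how many utterance segments each participant has."""
    return Counter(parse_segment_name(n).participant for n in names)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = ["300_s0_AUDIO.wav", "300_s1_AUDIO.wav", "301_s0_AUDIO.wav"]
    print(utterances_per_participant(example))
```

Applied to the name column of a metadata .csv file, the per-participant counts give a quick check on whether a re-segmentation yields a similar number of utterances.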