m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

emotion detection #384

Open MyraBaba opened 1 year ago

MyraBaba commented 1 year ago

Hi

Is there any way to detect the speaker's emotion or stress level within the Whisper domain?

Best

code-switched commented 1 year ago

The whisper.cpp GitHub repo has a feature called "Confidence color-coding", enabled with the --print-colors flag.

It highlights words according to how confident the model is in them (high or low). That is not emotion detection, but it's the closest thing I've seen.

sorgfresser commented 1 year ago

Another way would be to run sentiment analysis on the transcript as a post-processing step. Maybe have a look at BERT or newer text-classification models for this. Note that this only sees the words, not the tone of voice.
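A minimal sketch of that post-processing idea, under a few assumptions: the transcript is available as a list of segment dicts with "start", "end" and "text" keys (an assumed shape, check it against your own output), and `toy_sentiment` is a hypothetical keyword-based stand-in for a real model, included only so the example runs without extra dependencies. The helper names `toy_sentiment` and `label_segments` are made up for illustration.

```python
from typing import Callable, Dict, List

# Toy word lists standing in for a real sentiment model (illustrative only).
POSITIVE = {"great", "good", "happy", "love", "thanks"}
NEGATIVE = {"bad", "angry", "hate", "terrible", "worried"}


def toy_sentiment(text: str) -> str:
    """Hypothetical placeholder classifier: counts keyword hits, not a real model."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_segments(
    segments: List[Dict],
    classify: Callable[[str], str] = toy_sentiment,
) -> List[Dict]:
    """Return copies of the segments with a 'sentiment' key added."""
    return [{**seg, "sentiment": classify(seg["text"])} for seg in segments]


# Assumed segment shape: dicts with "start", "end" and "text".
segments = [
    {"start": 0.0, "end": 2.1, "text": "Thanks, this is great!"},
    {"start": 2.1, "end": 4.8, "text": "I am worried about the deadline."},
    {"start": 4.8, "end": 6.0, "text": "See you on Monday."},
]

labeled = label_segments(segments)
for seg in labeled:
    print(f'{seg["start"]:.1f}-{seg["end"]:.1f}s {seg["sentiment"]}: {seg["text"]}')
```

Because the classifier is passed in as a plain callable, the toy function could be swapped for a wrapper around a BERT-style text-classification model without changing `label_segments`; the timestamps stay attached to each segment either way.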

chenrq2005 commented 1 year ago

https://arxiv.org/pdf/2306.12991.pdf

https://huggingface.co/speechbrain/emotion-diarization-wavlm-large