Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Misclassified speakers when using Batch Transcription #2409

Open egyptiangodraw opened 3 weeks ago

egyptiangodraw commented 3 weeks ago

I can generate transcripts for audio files stored in a blob container using Batch Transcription with SAS URLs. Word recognition accuracy is very high, but the speakers are often misclassified. For example, if speaker 2 speaks immediately after speaker 1, the "text" of speaker 2 gets appended to what speaker 1 said.

Another issue: when one speaker asks a question and another answers it without a long enough pause in between, the answer gets appended to the speaker who asked the question.
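To make the question/answer case easier to spot in a finished transcript, here is a rough Python sketch that flags any phrase whose text keeps going after a question mark. It is only a heuristic for finding candidates to review, not a fix for the diarization itself. The input shape is an assumption: a simplified list of dicts with hypothetical "speaker" and "text" keys, which would have to be extracted from the actual transcription result. The function name is made up for illustration.

```python
import re


def flag_question_then_answer(phrases):
    """Return phrases whose text continues after a question mark.

    `phrases` is an assumed, simplified shape: a list of dicts with
    hypothetical keys "speaker" and "text" (not the raw service output).
    """
    flagged = []
    for phrase in phrases:
        # Split on whitespace that directly follows a "?".
        parts = re.split(r"(?<=\?)\s+", phrase["text"].strip())
        if len(parts) > 1:
            flagged.append(
                {
                    "speaker": phrase["speaker"],
                    "question": parts[0],
                    "rest": " ".join(parts[1:]),
                }
            )
    return flagged


# Made-up sample data, for illustration only.
sample = [
    {"speaker": 1, "text": "Did you get the invoice? Yes, it came in yesterday."},
    {"speaker": 2, "text": "Great, thanks."},
]

for hit in flag_question_then_answer(sample):
    print(hit["speaker"], "|", hit["question"], "|", hit["rest"])
```

This will also flag a single speaker who asks a question and then keeps talking, so the hits still need a manual check.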

Can someone please assist with this? Thanks

github-actions[bot] commented 6 days ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.