Misclassified speakers when using Batch Transcription

I am able to generate transcripts for audio files stored in a blob container using the batch transcription SAS URL method. Word recognition accuracy is very high, but the speakers are often misclassified. For example, if speaker 2 speaks immediately after speaker 1, the “text” of speaker 2 gets appended to what speaker 1 said.

Another issue is that when one speaker asks a question and another answers it without a long enough pause in between, the answer gets attributed to the speaker who asked the question.

Can someone please assist with this? Thanks

Azure-Samples / cognitive-services-speech-sdk, issue #2409