microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: result.text property set to '.' on recognized event when performing speech translation with Arabic languages. #792

Status: Closed (stevebrisson-librestream closed 2 months ago)

stevebrisson-librestream commented 4 months ago

What happened?

When using the Speech SDK Translation Recognizer in continuous mode, the final result.text that is returned in the recognized callback is '.', instead of the recognized phrase. This occurs when I am translating from some Arabic dialects to English. This does not happen with other recognition languages that I have tried (French, English, Russian, Chinese) but I've not done exhaustive testing.

Console output of speech translation from 'ar-ae' to 'en':

Recognizing event, result.text= تق , translation= Taq
Recognizing event, result.text= تقع القاهرة , translation= Cairo is located
Recognizing event, result.text= تقع القاهرة على , translation= Cairo is located on
Recognizing event, result.text= تقع القاهرة على جوان , translation= Cairo is located on June
Recognizing event, result.text= تقع القاهرة على جوانب جز , translation= Cairo is located on the sides of the
Recognizing event, result.text= تقع القاهرة على جوانب جزر نهر , translation= Cairo is located on the sides of the River Islands
Recognizing event, result.text= تقع القاهرة على جوانب جزر نهر النيل , translation= Cairo is located on the sides of the Nile River Islands
Recognizing event, result.text= تقع القاهرة على جوانب جزر نهر النيل في , translation= Cairo is located on the sides of the Nile River Islands in
Recognizing event, result.text= تقع القاهرة على جوانب جزر نهر النيل في شمال مصر , translation= Cairo is located on the sides of the Nile River Islands in northern Egypt
Recognized event, result.text= . , translation= Cairo is located on the sides of the Nile River Islands in northern Egypt

This has reproduced 100% of the time that I've tried it, and I've also confirmed that the same issue occurs with the .NET Microsoft.CognitiveServices.Speech NuGet package.
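For readers trying to reproduce this, the log above could have come from handlers along the following lines. This is only a sketch, not the reporter's actual code: `attachTranslationLogging` is a hypothetical helper name, and it assumes a recognizer object that exposes `recognizing` and `recognized` callback properties whose event argument carries `result.text` and a `result.translations` collection with a `get(language)` method, as the SDK's TranslationRecognizer does.

```javascript
// Hypothetical helper (not part of the Speech SDK): wires logging handlers onto
// any object that exposes `recognizing` and `recognized` callback properties.
function attachTranslationLogging(recognizer, targetLanguage, log = console.log) {
  // Builds one log line in the same shape as the console output in this issue.
  const format = (eventName, result) => {
    // `translations` may be absent on results that carry no translation.
    const translation = result.translations
      ? result.translations.get(targetLanguage)
      : undefined;
    return `${eventName} event, result.text= ${result.text} , translation= ${translation}`;
  };

  // Interim hypotheses arrive here while the speaker is still talking.
  recognizer.recognizing = (_sender, event) => log(format("Recognizing", event.result));

  // The final result for a phrase arrives here; this is where '.' shows up.
  recognizer.recognized = (_sender, event) => log(format("Recognized", event.result));

  return recognizer;
}
```

With the real SDK, the recognizer passed in would presumably be a TranslationRecognizer built from a SpeechTranslationConfig with `speechRecognitionLanguage` set to 'ar-AE' and 'en' added through `addTargetLanguage`, then started with `startContinuousRecognitionAsync()`. The subscription key, region, and audio source are left out here because they depend on the setup.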

Version

1.34.0 (Latest)

What browser/platform are you seeing the problem on?

Microsoft Edge

Relevant log output

No response

glharper commented 4 months ago

@stevebrisson-librestream Thank you for using the Speech SDK and writing this issue up. Testing on EastUS2 with the attached wav file (synth.wav.zip), I was not able to reproduce this. Screenshot attached.

[Image: Screenshot 2024-02-08 at 8 42 14 AM]

stevebrisson-librestream commented 4 months ago

@glharper Thanks for the fast response.

I've retested with Azure Speech service instances in both East US and East US2 and I can always reproduce this. I tried microphone capture of the audio file you provided and I can reproduce this. I tried using the audio file instead of microphone and I can reproduce this.

What language have you set as the recognition language? Sorry, I should have emphasized this more: the bug only happens with specific dialects (e.g. ar-AE has the bug, but ar-EG does not).

Can you retest with the recognition language set to 'ar-ae'?

glharper commented 4 months ago

@stevebrisson-librestream Weird, it's happening now for me as well for ar-AE. The info from the service has "." in the Text field, so this appears to be a service issue. I'll ping the translation team and let you know once I have an ETA for a fix.
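Since the "." comes from the service rather than the SDK, one possible client-side stopgap is to remember the last interim text and fall back to it when the final text is only punctuation. The sketch below is an assumption-laden workaround, not something proposed in this thread: `createFinalTextResolver`, `onRecognizing`, and `onRecognized` are hypothetical names, and the caller is expected to feed them `result.text` from the corresponding recognizer events.

```javascript
// Matches strings made up only of punctuation and whitespace (Unicode-aware),
// so it catches "." as well as Arabic punctuation such as "؟".
const PUNCTUATION_ONLY = /^[\p{P}\s]*$/u;

// Hypothetical helper: tracks the latest interim text for the current phrase
// and substitutes it when the final text carries no actual words.
function createFinalTextResolver() {
  let lastInterimText = "";

  return {
    // Call with result.text from each `recognizing` event.
    onRecognizing(text) {
      if (text && !PUNCTUATION_ONLY.test(text)) {
        lastInterimText = text;
      }
    },

    // Call with result.text from the `recognized` event; returns the text to use.
    onRecognized(text) {
      const usable = Boolean(text) && !PUNCTUATION_ONLY.test(text);
      const resolved = usable ? text : lastInterimText;
      lastInterimText = ""; // The next phrase starts with a clean slate.
      return resolved;
    },
  };
}
```

Note the trade-off: the last interim hypothesis is not guaranteed to match what the service would have finalized, and interim text typically lacks punctuation, so this only papers over the symptom until the service returns the right text.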

stevebrisson-librestream commented 4 months ago

@glharper Thanks!

glharper commented 4 months ago

Checking in on this: the service contact could not reproduce it, but has a connection ID to find out what's happening on their end.

glharper commented 4 months ago

@stevebrisson-librestream Is this still reproducing for you? I can ping the service team again and ask for an update if so.

stevebrisson-librestream commented 3 months ago

@glharper There still seems to be something wrong.

When retesting today with the recognition language set to 'ar-AE' and the target language set to 'en', the text is no longer disappearing, but the punctuation is now missing from both the recognized Arabic text and the translated English text.

If I change the recognition language to 'ar-EG' then the punctuation is present, so it seems like there is still an issue with certain Arabic dialects.

glharper commented 2 months ago

@stevebrisson-librestream The JS Speech SDK does not have a way to correct the wrong punctuation from the service for specific Arabic dialects, so I'm closing this issue. If this is still affecting you, please email me at (at)microsoft(dot)com and I'll put you in touch with my service contacts.