microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: AutoDetectSourceLanguageConfig with TranslationRecognizer does not seem to work #742

Closed Richou closed 10 months ago

Richou commented 1 year ago

What happened?

I'm trying to use the TranslationRecognizer with the AutoDetectSourceLanguage feature. Here is how I initialize the recognizer:

const v2EndpointInString = `wss://${config.region}.stt.speech.microsoft.com/speech/universal/v2`
const v2EndpointUrl = new URL(v2EndpointInString)
const speechConfig = SpeechTranslationConfig.fromEndpoint(v2EndpointUrl, config.key)
// Or with
// const speechConfig = SpeechTranslationConfig.fromSubscription(config.key, config.region)
speechConfig.speechRecognitionLanguage = 'fr-FR'

const autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.fromLanguages(['fr-FR', 'en-US'])
autoDetectSourceLanguageConfig.mode = LanguageIdMode.Continuous

speechConfig.addTargetLanguage('fr')
speechConfig.addTargetLanguage('en')

// audioconfig is read from an audio stream
const audioConfig = AudioConfig.fromStreamInput(audioPushStream)
const speechRecognizer = TranslationRecognizer.FromConfig(speechConfig, autoDetectSourceLanguageConfig, audioConfig)

speechRecognizer.recognizing = (sender: Recognizer, event: TranslationRecognitionEventArgs) => {
  // The translations object in event.result is undefined
} 

speechRecognizer.startContinuousRecognitionAsync()

Please note that in the code above I set speechConfig.speechRecognitionLanguage = 'fr-FR'. If I don't set this field, the error throwIfNullOrUndefined:SpeechServiceConnection_RecoLanguage is thrown.

My problem is that event.result.translations, the object that should contain the translations, is undefined.

But if I don't use the autoDetectSourceLanguageConfig:

const speechConfig = SpeechTranslationConfig.fromSubscription(config.key, config.region)
speechConfig.speechRecognitionLanguage = 'fr-FR'

speechConfig.addTargetLanguage('fr')
speechConfig.addTargetLanguage('en')

// audioconfig is read from an audio stream
const audioConfig = AudioConfig.fromStreamInput(audioPushStream)
// No AutoDetectSourceLanguageConfig here: the source language is fixed to 'fr-FR'
const speechRecognizer = new TranslationRecognizer(speechConfig, audioConfig)

speechRecognizer.recognizing = (sender: Recognizer, event: TranslationRecognitionEventArgs) => {
  // I have the translations object with correct translated values
} 

speechRecognizer.startContinuousRecognitionAsync()

Am I doing something wrong? Is this feature available in the JavaScript SDK?

I based this on the method TranslationWithMultiLingualFileAsync_withLanguageDetectionEnabled in https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/translation_samples.cs, ported to TypeScript/JavaScript.
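
Until the recognizing event reliably carries translations, a handler can guard against event.result.translations being undefined instead of failing on it. The following is only a minimal sketch of that guard, not the SDK's own code: TranslationsLike, TranslationResultLike and readTranslations are hypothetical names, and the stand-in interfaces assume that the translations object exposes a get(language) accessor.

```typescript
// Hypothetical structural stand-ins for the SDK types, so that this sketch
// runs without the SDK installed. Assumption: `event.result.translations`
// offers a `get(language)` accessor.
interface TranslationsLike {
  get(language: string): string | undefined;
}

interface TranslationResultLike {
  text?: string;
  translations?: TranslationsLike;
}

// Hypothetical helper: returns one entry per target language that actually
// has a translation, or an empty object when `translations` is missing.
function readTranslations(
  result: TranslationResultLike,
  targetLanguages: string[],
): Record<string, string> {
  const found: Record<string, string> = {};
  if (result.translations === undefined) {
    return found;
  }
  for (const language of targetLanguages) {
    const translated = result.translations.get(language);
    if (translated !== undefined) {
      found[language] = translated;
    }
  }
  return found;
}
```

Inside the recognizing handler this would be called as readTranslations(event.result, ['fr', 'en']); an empty object then indicates that the event arrived without translations.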

Platform, Operating System, and Programming Language

Version

1.31.0 (Edge)

What browser/platform are you seeing the problem on?

Node

Relevant log output

No response

glharper commented 1 year ago

@Richou Thank you for using JS Speech SDK, and writing this issue up. There are a few translation bugs with continuous language id that I've put time into fixing this iteration, and one is the recognizing callback. Our next release, 1.33, due at the end of October, should fix this issue (the merged PR is here.)

If you'd like to give the fix a go before then, either clone the repo and run "npm install && npm pack" to create the binary, or send me an email at (at)microsoft(dot)com and I can send you a tar of the current master.

Richou commented 1 year ago

@glharper Thank you for your answer. I can wait until the end of October. I will try the new version and report back on whether it fixes my issue.

Thank you!

glharper commented 10 months ago

Fixed as of 1.34.0, if not before
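
To confirm that a project is on a release containing the fix, the installed version string can be compared against 1.34.0. This is a rough sketch only: isAtLeast is a hypothetical helper, it compares purely numeric dotted versions, and it assumes the version string is obtained separately (for example from the installed package's package.json).

```typescript
// Hypothetical helper: true when `version` is greater than or equal to
// `minimum`. Each dotted part is compared numerically; missing parts count
// as 0, and pre-release suffixes such as "-beta" are ignored by parseInt.
function isAtLeast(version: string, minimum: string): boolean {
  const left = version.split(".").map((part) => parseInt(part, 10));
  const right = minimum.split(".").map((part) => parseInt(part, 10));
  const length = Math.max(left.length, right.length);
  for (let i = 0; i < length; i++) {
    const a = left[i] ?? 0;
    const b = right[i] ?? 0;
    if (a !== b) {
      return a > b;
    }
  }
  return true;
}
```

For the version reported in this issue, isAtLeast('1.31.0', '1.34.0') is false, so an upgrade would be needed before retesting.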