Azure-Samples / AzureSpeechReactSample

This sample shows how to integrate the Azure Speech service into a React application. It demonstrates design patterns for authentication token exchange and management, as well as for capturing audio from a microphone or file for speech-to-text conversion.

Problem with custom endpoint when auto detecting the language #2

Open adriensas opened 3 years ago

adriensas commented 3 years ago

This issue is for a:

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I'm trying to use an auto-detect source language config with a custom model endpoint, so I've modified sttFromMic in src/App.js as follows:

```javascript
async sttFromMic() {
    const tokenObj = await getTokenOrRefresh();
    const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(tokenObj.authToken, tokenObj.region);

    // Source language configs: fr-FR is bound to my custom model's endpoint ID.
    const enLanguageConfig = speechsdk.SourceLanguageConfig.fromLanguage("en-US");
    const frLanguageConfig = speechsdk.SourceLanguageConfig.fromLanguage("fr-FR", "b9a605f6-0a51-4ffa-9bda-c9ca9e951cb2");
    const autoDetectConfig = speechsdk.AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs([enLanguageConfig, frLanguageConfig]);

    const audioConfig = speechsdk.AudioConfig.fromDefaultMicrophoneInput();
    const recognizer = speechsdk.SpeechRecognizer.FromConfig(speechConfig, autoDetectConfig, audioConfig);

    this.setState({
        displayText: 'speak into your microphone...'
    });

    recognizer.recognizeOnceAsync(result => {
        let displayText;
        if (result.reason === ResultReason.RecognizedSpeech) {
            displayText = `RECOGNIZED: Text=${result.text}`;
        } else {
            displayText = 'ERROR: Speech was cancelled or could not be recognized. Ensure your microphone is working properly.';
        }

        this.setState({
            displayText: displayText
        });
    });
}
```
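
As an aside, the result can also be queried to confirm which language auto-detect actually resolved. This is only a small diagnostic sketch (not part of the sample itself), using the SDK's AutoDetectSourceLanguageResult helper inside the same recognizeOnceAsync callback:

```javascript
// Diagnostic sketch: log which source language auto-detect picked.
recognizer.recognizeOnceAsync(result => {
    if (result.reason === speechsdk.ResultReason.RecognizedSpeech) {
        // AutoDetectSourceLanguageResult exposes the detected language, e.g. "fr-FR".
        const languageResult = speechsdk.AutoDetectSourceLanguageResult.fromResult(result);
        console.log(`Detected language: ${languageResult.language}, text: ${result.text}`);
    }
});
```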

However, the recognizer does not seem to use my custom model for French: I get the standard transcription rather than the one from my custom model, and no corresponding requests show up in my custom model's logs. I tried the same thing in Python with the same auth token and it works (I get the correct transcription):

```python
import azure.cognitiveservices.speech as speechsdk

# token and service_region are set elsewhere (same auth token and region as the React app)

def from_mic():
    # fr-FR is bound to the same custom model endpoint ID as in the JavaScript version
    en_language_config = speechsdk.languageconfig.SourceLanguageConfig("en-US")
    fr_language_config = speechsdk.languageconfig.SourceLanguageConfig("fr-FR", 'b9a605f6-0a51-4ffa-9bda-c9ca9e951cb2')
    auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(sourceLanguageConfigs=[en_language_config, fr_language_config])
    # speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    speech_config = speechsdk.SpeechConfig(auth_token=token, region=service_region)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, auto_detect_source_language_config=auto_detect_source_language_config)

    print("Speak into your microphone.")
    result = speech_recognizer.recognize_once_async().get()
    print(result.text)

from_mic()
```
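
For completeness, one way to check whether the custom model itself is reachable from JavaScript is to drop auto-detect and bind the endpoint ID directly on the SpeechConfig. This is only a fallback sketch (single language, same endpoint ID and token flow as above), not the behavior I'm after:

```javascript
// Fallback sketch (no auto-detect): bind the custom fr-FR model directly
// on the SpeechConfig. tokenObj comes from getTokenOrRefresh() as in sttFromMic.
const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(tokenObj.authToken, tokenObj.region);
speechConfig.speechRecognitionLanguage = "fr-FR";
speechConfig.endpointId = "b9a605f6-0a51-4ffa-9bda-c9ca9e951cb2";

const audioConfig = speechsdk.AudioConfig.fromDefaultMicrophoneInput();
const recognizer = new speechsdk.SpeechRecognizer(speechConfig, audioConfig);
```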

Any log messages given by the failure

I don't get any logs or errors.
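
For reference, error details can normally be pulled out of the result itself via CancellationDetails; a small sketch of doing that in the same recognizeOnceAsync callback (assuming the request is actually cancelled rather than silently falling back to the base model):

```javascript
// Sketch: surface any cancellation/error details instead of a generic message.
recognizer.recognizeOnceAsync(result => {
    if (result.reason === speechsdk.ResultReason.Canceled) {
        const cancellation = speechsdk.CancellationDetails.fromResult(result);
        // errorDetails is populated when cancellation.reason === CancellationReason.Error
        console.log(`CANCELED: ${cancellation.reason}, details: ${cancellation.errorDetails}`);
    }
});
```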

Expected/desired behavior

When French is detected, it should use my custom model, as described in https://docs.microsoft.com/fr-fr/azure/cognitive-services/speech-service/how-to-automatic-language-detection?pivots=programming-language-javascript

OS and Version?

Mac OS Big Sur

Versions

"microsoft-cognitiveservices-speech-sdk": "^1.17.0"