Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Text to Speech SDK is not working after deployed in to Azure #1643

Closed kumarsenkan closed 2 years ago

kumarsenkan commented 2 years ago

I am new to Python as well as the Azure platform. Below is my code for text-to-speech in Japanese. It works fine in my local environment, but after deploying to Azure it returns a success result and the audio does not play. Do I need to set up anything for audio on the Azure static web app platform?

get_synthesis()

    self.speech_config = speechsdk.SpeechConfig(subscription=config.get('key'), region=config.get('region'))
    self.audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    # Set either the `SpeechSynthesisVoiceName` or `SpeechSynthesisLanguage`.
    self.speech_config.speech_synthesis_language = config.get('language')
    # The voice that speaks.
    self.speech_config.speech_synthesis_voice_name = config.get('voiceName')
    self.speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=self.speech_config, audio_config=self.audio_config)
    self.speech_synthesis_result = self.speech_synthesizer.speak_text_async(text).get()

    # Check the result of the audio synthesis
    if self.speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        logger.info(self.speech_synthesis_result.reason)
        result = "success"
    elif self.speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = self.speech_synthesis_result.cancellation_details
        logger.info("Speech synthesis canceled: {}".format(cancellation_details.reason))
        result = "canceled"
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            if cancellation_details.error_details:
                logger.warning("Error details: {}".format(cancellation_details.error_details))
                logger.warning("Did you set the speech resource key and region values?")
                result = cancellation_details.error_details

jhakulin commented 2 years ago

@kumarsenkan Could you please provide more information about what is not working? What detailed error message do you receive in the cancellation event? When you say deployed to Azure, do you mean that you intend to run this as an Azure Function?

kumarsenkan commented 2 years ago

Thanks for the reply. The audio is not playing after deploying to Azure.

jhakulin commented 2 years ago

@kumarsenkan Could you please clarify what kind of user scenario you are targeting? When SDK code runs in Azure, there is no speaker you can listen to, and audio drivers may not even be installed. That can lead to a cancellation error in your sample, because it uses the default speaker audio configuration.

You can instead send the synthesis output to a stream, for example by following this sample: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_synthesis_sample.py#L210
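As a rough illustration of avoiding the default speaker (a minimal sketch, not necessarily the same approach as the linked sample), passing `audio_config=None` keeps the synthesized audio in the result object so it can be returned as bytes. The function name `synthesize_to_bytes` and its parameters are hypothetical; the SDK import is done inside the function only so the module can load where the SDK is not installed.

```python
def synthesize_to_bytes(text, key, region, voice_name):
    """Synthesize `text` and return the audio as bytes, without using a speaker.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    # Imported lazily so this module loads even where the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_synthesis_voice_name = voice_name

    # audio_config=None: no audio device is opened; the audio stays in the result.
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=None
    )
    result = synthesizer.speak_text_async(text).get()

    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        return result.audio_data  # raw audio bytes in the configured output format

    if result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        raise RuntimeError(
            f"Speech synthesis canceled: {details.reason}; {details.error_details}"
        )
    raise RuntimeError(f"Unexpected synthesis result: {result.reason}")
```

A server-side caller could then hand the returned bytes to whatever sends the response to the browser, for example `audio = synthesize_to_bytes(text, key, region, "ja-JP-NanamiNeural")` (the voice name here is only an example), and let the web page play them on the user's own device.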

jhakulin commented 2 years ago

Closing the issue as answered. Please let us know if more information is needed.

kumarsenkan commented 2 years ago

Our user scenario: we are developing a React web app that translates the user's English speech into Japanese audio output. We have our own system for transcribing the English speech and translating it into Japanese text. That Japanese text needs to be output as audio using the MS Speech SDK. It works fine in the local environment. After deploying to an Azure static web app, I am unable to hear the voice output. The problem still exists.

jhakulin commented 2 years ago

@kumarsenkan If you need to provide the synthesized audio as input to some component in Azure, you need to send the synthesis output to a stream or a file, because Azure machines do not have the loudspeakers that are available in your local environment. Currently your script uses AudioOutputConfig(use_default_speaker=True), which should be changed to stream or file output. Please check the earlier comment, which links to a sample that provides synthesis as stream output. Could you please try that?
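For the file-output option mentioned above, a minimal sketch might look like the following, assuming the output path is writable on the server. The helper name `synthesize_to_file` and its parameters are hypothetical and not part of the original script; the only change relative to the default-speaker setup is `AudioOutputConfig(filename=...)`.

```python
def synthesize_to_file(text, key, region, voice_name, path):
    """Synthesize `text` into an audio file at `path` instead of a speaker.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    # Imported lazily so this module loads even where the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_synthesis_voice_name = voice_name

    # Write to a file rather than use_default_speaker=True.
    audio_config = speechsdk.audio.AudioOutputConfig(filename=path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )
    result = synthesizer.speak_text_async(text).get()

    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Speech synthesis did not complete: {result.reason}")
    return path
```

The resulting file still has to be delivered to the browser for playback, since the server itself has no audio device.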