Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License
Creating SpeechSynthesizer hangs forever if rpyc/websockets is imported #2530

Closed l-jc closed 2 months ago

l-jc commented 3 months ago

Describe the bug

speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None) hangs forever if I import rpyc or websockets in the code.

To Reproduce

The following code works:

import azure.cognitiveservices.speech as speechsdk
# import rpyc
# import websockets.sync.server

speech_config = speechsdk.SpeechConfig(endpoint="ws://localhost:5000/cognitiveservices/websocket/v2")

# Set the voice name, refer to https://aka.ms/speech/voices/neural for full list.
speech_config.speech_synthesis_voice_name = "en-US-AvaNeural"
properties = dict()
properties["SpeechSynthesis_FrameTimeoutInterval"] = "100000"
properties["SpeechSynthesis_RtfTimeoutThreshold"] = "10"
speech_config.set_properties_by_name(properties)

# Creates a speech synthesizer. With audio_config=None nothing is played on a speaker;
# the audio is delivered through the synthesizing events and the final result.
print("Creating speech synthesizer")
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
print("Created speech synthesizer")

def OnReceiveAudio(evt):
    result = evt.result
    print("Audio Length: {}".format(len(result.audio_data)))

speech_synthesizer.synthesizing.connect(OnReceiveAudio)

request = speechsdk.SpeechSynthesisRequest(
    input_type=speechsdk.SpeechSynthesisRequestInputType.TextStream
)
speak_task = speech_synthesizer.speak_async(request)
textArray = [
    "Hello, how are you doing?",
    "I'm doing well, thank you.",
    "Goodbye!",
]  # The text array to be synthesized.
for text in textArray:
    request.input_stream.write(text)

request.input_stream.close()
result = speak_task.get()
print(result)

The output is:

Creating speech synthesizer
Created speech synthesizer
Audio Length: 39864
Audio Length: 38264
Audio Length: 40996
Audio Length: 38266
Audio Length: 5506
Audio Length: 3380
SpeechSynthesisResult(result_id=6c4df6b942e34914a8dd67145cbcad26, reason=ResultReason.SynthesizingAudioCompleted, audio_length=166046)

If I uncomment either import rpyc or import websockets.sync.server in the code above, I get the following output and the program hangs there forever:

Creating speech synthesizer
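
To narrow down where a hang like this occurs, one option is to run the reproduction script in a child interpreter with a faulthandler watchdog that dumps the stack of every thread after a timeout. The sketch below is only an illustration: the helper name run_with_watchdog and the timeout values are assumptions, not part of the Speech SDK or of the original report. If the hang is inside native SDK code, the dump would likely show only the Python line that called into it.

```python
import subprocess
import sys

# Prefix injected into the child process: if the script is still running after
# `timeout` seconds, faulthandler dumps all thread stacks to stderr and exits.
_WATCHDOG = (
    "import faulthandler; "
    "faulthandler.dump_traceback_later({timeout}, exit=True)\n"
)


def run_with_watchdog(script: str, timeout: float = 10.0):
    """Run `script` in a fresh interpreter (hypothetical helper).

    Returns (hung, stdout, stderr). `hung` is True when the watchdog fired.
    """
    child_code = _WATCHDOG.format(timeout=timeout) + script
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        timeout=timeout + 30,  # hard backstop in case the watchdog itself fails
    )
    # faulthandler starts its dump with a line of the form "Timeout (H:MM:SS)!"
    hung = "Timeout (" in proc.stderr
    return hung, proc.stdout, proc.stderr
```

Passing the reproduction script as a string, once with the rpyc or websockets import commented out and once with it enabled, would show whether the watchdog fires only in the second case, and the captured stderr could then be attached to the issue.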

Expected behavior

The above code should also work when rpyc or websockets is imported.

Version of the Cognitive Services Speech SDK

1.38.0

Platform, Operating System, and Programming Language

Additional context

yulin-li commented 3 months ago

I cannot reproduce the bug. My steps:

Could you share an environment that could repro this issue?
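
For sharing an environment, a short script that prints the interpreter, platform, and relevant package versions may be a reasonable starting point. This is a minimal sketch, not an official diagnostic tool: the function name environment_report is hypothetical, and the package list assumes the PyPI distribution names azure-cognitiveservices-speech, rpyc, and websockets.

```python
import platform
import sys
from importlib import metadata

# Distribution names as published on PyPI (assumed relevant to this issue).
PACKAGES = ("azure-cognitiveservices-speech", "rpyc", "websockets")


def environment_report(packages=PACKAGES) -> str:
    """Return a plain-text summary of the Python environment (hypothetical helper)."""
    lines = [
        f"python: {sys.version.split()[0]} ({platform.python_implementation()})",
        f"platform: {platform.platform()}",
        f"machine: {platform.machine()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

The printed text can be pasted into the issue under "Platform, Operating System, and Programming Language".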

pankopon commented 2 months ago

Closed since the issue could not be reproduced and no more information was received. Please re-open if there are further updates.