Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK

Threads left hanging after synthesizer is deleted #2545

Closed kormang closed 1 month ago

kormang commented 1 month ago

The Speech SDK starts some native threads when speech synthesis is requested. These threads are kept alive even after the synthesizer is deleted and garbage collected.

To Reproduce

Steps to reproduce the behavior:

  1. Run the following code:
import os
import subprocess
import azure.cognitiveservices.speech as speechsdk
import threading

def print_thread_count(msg: str = ""):
    # `ps -T` prints one row per OS thread plus a header line, so only the
    # relative change in this number matters.
    sys_th_count = len(subprocess.check_output(["ps", "-T", "-p", str(os.getpid())]).splitlines())
    print(msg, "- active python threads:", threading.active_count(), "; system thread count:", sys_th_count)

# Audio sink that discards everything; only the thread behavior matters here.
class WriterCallback(speechsdk.audio.PushAudioOutputStreamCallback):
    def write(self, buffer: memoryview) -> int:
        return len(buffer)

    def close(self) -> None:
        pass

def create_synthesizer():
    compl_evt = threading.Event()

    def on_completed(eargs: speechsdk.SpeechSynthesisEventArgs):
        compl_evt.set()

    # This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
    speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
    audio_config = speechsdk.audio.AudioOutputConfig(stream=speechsdk.audio.PushAudioOutputStream(WriterCallback()))
    # Alternatively, play through the default speaker instead of the push stream:
    # audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)

    # The neural multilingual voice can speak different languages based on the input text.
    speech_config.speech_synthesis_voice_name='en-US-AvaMultilingualNeural'

    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)

    speech_synthesizer.synthesis_completed.connect(on_completed)

    return speech_synthesizer, compl_evt

speech_synthesizer, compl_evt = create_synthesizer()
def do_speech() -> bool:
    print_thread_count("before input")
    print("Enter some text that you want to speak >")
    text = input()

    if text == "":
        print("Exit")
        return False

    print_thread_count("before speak")
    result_future = speech_synthesizer.speak_text_async(text)
    print_thread_count("after speak")

    compl_evt.wait()
    compl_evt.clear()
    print_thread_count("after wait")

    # Uncomment this to fix the issue.
    # result_future.get()

    return True

while do_speech():
    print_thread_count("loop...")

print_thread_count("Cleanup...")
del speech_synthesizer
print_thread_count("After del")
# At this point the number of native threads will be roughly 6 higher than before
# the while loop, unless result_future.get() is called for each result_future.
  2. Observe that the number of native threads does not decrease at the end of the script (just press Enter to exit the script).
  3. Uncomment result_future.get() and run again (see the sketch below these steps).
  4. Observe that the number of native threads now decreases back to the previous value at the end of the script.
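
The fix the script comment refers to is to consume the future returned by speak_text_async after each request. A minimal sketch of the adjusted loop body, reusing the names from the script above (that consuming the future is what lets the SDK release its worker threads is the behavior observed here, not something taken from the documentation):

result_future = speech_synthesizer.speak_text_async(text)

# Block until synthesis finishes and the result is handed back.
# With this call in place the extra native threads go away after
# `del speech_synthesizer`; without it they linger.
result = result_future.get()

if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Synthesized", len(result.audio_data), "bytes of audio")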

A similar problem can be observed with the recognizer as well (see the sketch below).
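
For reference, a minimal sketch of the same pattern with a recognizer (illustrative only: it assumes the default microphone and a one-shot recognition; that the recognizer leaks threads in the same way is the observation above, not verified here):

import os
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# The async call spawns native worker threads inside the SDK.
result_future = recognizer.recognize_once_async()

# Without this .get(), those threads can remain alive even after the
# recognizer is deleted and garbage collected.
result = result_future.get()
print(result.text)

del recognizer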

pankopon commented 1 month ago

Duplicate of https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2506