Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Armbian aarch64/arm64 #1969

Closed dzianisv closed 1 year ago

dzianisv commented 1 year ago

Describe the bug

I am trying to use the Python Cognitive Services Speech SDK on Armbian (Ubuntu-based) on an Orange Pi 4 LTS. This is a bare-metal Ubuntu image. aplay and arecord show the input and output audio devices. When I run my script on an x86_64 MacBook it works well, but the Speech SDK library crashes during initialization on Ubuntu on the Orange Pi 4.

Traceback (most recent call last):
  File "/opt/AssistantPlato/./src/test.py", line 80, in <module>
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output_config)
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/speech.py", line 2149, in __init__
    _call_hr_fn(fn=_sdk_lib.synthesizer_create_speech_synthesizer_from_config, *[
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code: 
[CALL STACK BEGIN]

/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d764c) [0xffff80b6764c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eb9a8) [0xffff80b7b9a8]
/lib/aarch64-linux-gnu/libc.so.6(+0x825d4) [0xffff817625d4]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ec920) [0xffff80b7c920]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ac2a4) [0xffff80b3c2a4]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1abb78) [0xffff80b3bb78]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff80b69288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1c3644) [0xffff80b53644]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1be070) [0xffff80b4e070]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xeb7a8) [0xffff80a7b7a8]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff80b69288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b759c) [0xffff80b4759c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x204814) [0xffff80b94814]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(synthesizer_create_speech_synthesizer_from_config+0x10c) [0xffff80a4a77c]
/lib/aarch64-linux-gnu/libffi.so.8(+0x6e10) [0xffff81506e10]
/lib/aarch64-linux-gnu/libffi.so.8(+0x3a94) [0xffff81503a94]
/usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so(+0x12b10) [0xffff81532b10]
[CALL STACK END]

To Reproduce

#!/usr/bin/env python3
import os
import azure.cognitiveservices.speech as speechsdk
import openai

# Speech Services
speech_key = os.environ.get("AZURE_SPEECH_KEY")
speech_region = os.environ.get("AZURE_REGION")
language = "" #"en-US"
voice = "" #"en-US-JennyMultilingualNeural"

# Open Ai
openai.api_key = os.environ.get("OPENAI_KEY")

# Prompt
base_message = [{"role":"system","content":"You are an senior expert voice assistant who can answer all  related questions. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to  products or ."}]

#######################
###### Functions ######
#######################
def ask_openai(prompt):
    base_message.append({"role": "user", "content": prompt})

    response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",
    messages = base_message,
    temperature=0.24,
    max_tokens=50,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None)

    text = response['choices'][0]['message']['content'].replace('\n', ' ').replace(' .', '.').strip()
    print('Azure OpenAI response:' + text)
    base_message.append({"role": "assistant", "content": text})
    speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

    if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Speech synthesized to speaker for text [{}]".format(text))
    elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
        cancellation_details = speech_synthesis_result.cancellation_details
        print("Speech synthesis canceled: {}".format(cancellation_details.reason))
        if cancellation_details.reason == speechsdk.CancellationReason.Error:
            print("Error details: {}".format(cancellation_details.error_details))

def chat_with_open_ai():
    while True:
        print("Azure OpenAI is listening. Say 'Stop' or press Ctrl-Z to end the conversation.")
        try:
            speech_recognition_result = speech_recognizer.recognize_once_async().get()
            if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
                text = speech_recognition_result.text
                if text == "Stop.":
                    print("Conversation ended.")
                    break
                if text == "Reset.":
                    print("Reset")
                    # Reset the shared history in place; a plain assignment here would only bind a local name.
                    base_message[:] = [{"role":"system","content":"You are an AI voice assistant that helps to answer questions."}]
                if "Hey" in text:
                    print("Recognized  speech: {}".format(speech_recognition_result.text))
                    ask_openai(speech_recognition_result.text)
            elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
                print("No speech could be recognized: {}".format(speech_recognition_result.no_match_details))
                break
            elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
                cancellation_details = speech_recognition_result.cancellation_details
                print("Speech Recognition canceled: {}".format(cancellation_details.reason))
                if cancellation_details.reason == speechsdk.CancellationReason.Error:
                    print("Error details: {}".format(cancellation_details.error_details))
        except EOFError:
            break

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
audio_output_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
speech_config.speech_recognition_language=language
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
speech_config.speech_synthesis_voice_name=voice
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output_config)

try:
    chat_with_open_ai()
except Exception as err:
    print("Encountered exception. {}".format(err))

Expected behavior

I would expect that SpeechRecognizer could be initialized.

Version of the Cognitive Services Speech SDK

From Pipfile.lock:

        "azure-cognitiveservices-speech": {
            "hashes": [
                "sha256:1beda741dd6e49564e9d970c606f1ea7e007969069842b19327abce119175774",
                "sha256:7cdc010e4a4586e78d3a265931520ca0422eed8145d37a7dd1e448e437945e08",
                "sha256:b189c4248b4adf69fc563abe30780c8f09bad6200414bb2b765abe2d44ed8c96",
                "sha256:cdba9d39fd9e7ee7cf49c54e8a3ba2b217469219b3a35cb3c993f32ca00d217c",
                "sha256:cea4984bd3fa582c41ba8ec79cd37f55380bb386e09ca155ae2aac343c8bdac5",
                "sha256:e9474e3ea19ed44f80c2aa7bc1c5c5e9711c5dbdf0662cb2eec31ff7821dbfa8"
            ],
            "index": "pypi",
            "version": "==1.29.0"
        },

Platform, Operating System, and Programming Language

Additional context

root@orangepi4-lts:/media/root# aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: rockchipes8316c [rockchip-es8316c], device 0: ff880000.i2s-ES8316 HiFi ES8316 HiFi-0 [ff880000.i2s-ES8316 HiFi ES8316 HiFi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: hdmisound [hdmi-sound], device 0: ff8a0000.i2s-i2s-hifi i2s-hifi-0 [ff8a0000.i2s-i2s-hifi i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
root@orangepi4-lts:/media/root# arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: rockchipes8316c [rockchip-es8316c], device 0: ff880000.i2s-ES8316 HiFi ES8316 HiFi-0 [ff880000.i2s-ES8316 HiFi ES8316 HiFi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: hdmisound [hdmi-sound], device 0: ff8a0000.i2s-i2s-hifi i2s-hifi-0 [ff8a0000.i2s-i2s-hifi i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

yulin-li commented 1 year ago

libasound is required. Did you install it (see this)?

You can verify it by running:

ldd /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.extension.audio.sys.so
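The same check can be scripted. The following is a minimal sketch, not part of the SDK: the helper names `missing_dependencies` and `check_library` are made up for illustration, and it assumes `ldd` is on the PATH and marks unresolved libraries with `=> not found`.

```python
import subprocess


def missing_dependencies(ldd_output: str) -> list[str]:
    """Return the names of libraries that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            # The library name is everything before the '=>' arrow.
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path: str) -> list[str]:
    """Run ldd on a shared library and return its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return missing_dependencies(result.stdout)
```

Calling `check_library()` with the path to libMicrosoft.CognitiveServices.Speech.extension.audio.sys.so should return an empty list when libasound and the other load-time dependencies are resolved.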

dzianisv commented 1 year ago

root@orangepi4-lts:~# ldd /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.extension.audio.sys.so
    linux-vdso.so.1 (0x0000ffff95964000)
    libasound.so.2 => /lib/aarch64-linux-gnu/libasound.so.2 (0x0000ffff957e0000)
    libMicrosoft.CognitiveServices.Speech.core.so => /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so (0x0000ffff952e0000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff952c0000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff952a0000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff95070000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff95040000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff94e90000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff9592b000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff94df0000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffff94dd0000)

libasound is installed. According to the backtrace, the crash happened at /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d764c) [0xffff80b6764c] while /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(synthesizer_create_speech_synthesizer_from_config+0x10c) [0xffff80a4a77c] was being called. I don't have debug symbols, so I can't debug it further.
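Even without debug symbols, the native frames in the error message can be pulled apart into library, exported symbol (if any), and offset, which makes two traces easier to compare. This is only a sketch: the regular expression assumes the `path(symbol+0xOFFSET) [0xADDRESS]` frame layout seen in the traces in this thread, and `parse_frames` and `frames_per_library` are hypothetical helper names.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Matches frames such as:
#   /path/lib.so(+0x1d764c) [0xffff80b6764c]
#   /path/lib.so(some_symbol+0x10c) [0xffff80a4a77c]
FRAME_RE = re.compile(
    r"^(?P<lib>\S+)\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_frames(message: str) -> list[dict]:
    """Extract native stack frames from an SDK error message."""
    frames = []
    for line in message.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append({
                "library": PurePosixPath(match["lib"]).name,
                "symbol": match["symbol"] or None,  # None when only an offset is given
                "offset": int(match["offset"], 16),
            })
    return frames


def frames_per_library(frames: list[dict]) -> Counter:
    """Count how many frames belong to each shared library."""
    return Counter(frame["library"] for frame in frames)
```

Passing `str(err)` from the caught RuntimeError to `parse_frames()` gives a list whose first entry is the innermost frame. The offsets do not depend on where the library was loaded, so they stay the same between runs even though the absolute addresses change.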

yulin-li commented 1 year ago

I reproduced a crash yesterday on a NanoPC-T4 board with Armbian, and that was a libasound2 issue. I installed libasound2 and then it worked fine.

So it looks like yours is a different issue. Could you try disabling the audio output by setting audio_config to None and see if it works?

dzianisv commented 1 year ago

The same issue:

root@orangepi4-lts:/opt/AssistantPlato#  . /root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/bin/activate
(AssistantPlato) root@orangepi4-lts:/opt/AssistantPlato# ls src
main.py  test.py
(AssistantPlato) root@orangepi4-lts:/opt/AssistantPlato# ./src/test.py
bash: ./src/test.py: Permission denied
(AssistantPlato) root@orangepi4-lts:/opt/AssistantPlato# python3 ./src/test.py
Traceback (most recent call last):
  File "/opt/AssistantPlato/./src/test.py", line 82, in <module>
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/speech.py", line 2149, in __init__
    _call_hr_fn(fn=_sdk_lib.synthesizer_create_speech_synthesizer_from_config, *[
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code: 
[CALL STACK BEGIN]

/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d764c) [0xffff8580764c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eb9a8) [0xffff8581b9a8]
/lib/aarch64-linux-gnu/libc.so.6(+0x825d4) [0xffff863f25d4]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ec920) [0xffff8581c920]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1ac2a4) [0xffff857dc2a4]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1abb78) [0xffff857dbb78]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff85809288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1c3644) [0xffff857f3644]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1be070) [0xffff857ee070]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xeb7a8) [0xffff8571b7a8]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff85809288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1b759c) [0xffff857e759c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x204814) [0xffff85834814]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(synthesizer_create_speech_synthesizer_from_config+0x10c) [0xffff856ea77c]
/lib/aarch64-linux-gnu/libffi.so.8(+0x6e10) [0xffff85ca6e10]
/lib/aarch64-linux-gnu/libffi.so.8(+0x3a94) [0xffff85ca3a94]
/usr/lib/python3.10/lib-dynload/_ctypes.cpython-310-aarch64-linux-gnu.so(+0x12b10) [0xffff85cd2b10]
[CALL STACK END]

Runtime error: Failed to initialize platform (azure-c-shared). Error: 2153

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=speech_region)
audio_output_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
audio_input_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

speech_config.speech_recognition_language=language
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=None)

speech_config.speech_synthesis_voice_name=voice
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=None)

yulin-li commented 1 year ago

OK, I see your Armbian is Jammy-based.

Could you try installing OpenSSL 1.1? The documentation says: "The Speech SDK does not support OpenSSL 3.0, which is the default in Ubuntu 22.04."
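A quick way to see which OpenSSL generation a system defaults to is to ask Python's own ssl module. This is only an indirect hint: it reports the OpenSSL that the Python interpreter is linked against, which is usually the distribution's default but does not have to be, and it says nothing about what the Speech SDK loads. `python_openssl_major` is a hypothetical helper name.

```python
import ssl


def python_openssl_major() -> int:
    """Major version of the OpenSSL library this Python interpreter is linked against."""
    return ssl.OPENSSL_VERSION_INFO[0]


print("Python is linked against:", ssl.OPENSSL_VERSION)
if python_openssl_major() >= 3:
    # Assumption: the interpreter uses the distribution's default OpenSSL.
    print("OpenSSL 3.x appears to be the default here; OpenSSL 1.1 may need to be installed separately.")
```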

dzianisv commented 1 year ago

Interesting. In that case, why are all the shared library dependencies satisfied?

~/.local/share/virtualenvs# find . -name 'libMicrosoft*.so' | xargs -n1 ldd 
    linux-vdso.so.1 (0x0000ffff9853d000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff97fe0000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff97fc0000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffff97fa0000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff97d70000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff97cd0000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff97ca0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff97af0000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff98504000)
    linux-vdso.so.1 (0x0000ffff953b3000)
    libgstreamer-1.0.so.0 => /lib/aarch64-linux-gnu/libgstreamer-1.0.so.0 (0x0000ffff951d0000)
    libgobject-2.0.so.0 => /lib/aarch64-linux-gnu/libgobject-2.0.so.0 (0x0000ffff95150000)
    libglib-2.0.so.0 => /lib/aarch64-linux-gnu/libglib-2.0.so.0 (0x0000ffff95000000)
    libgstbase-1.0.so.0 => /lib/aarch64-linux-gnu/libgstbase-1.0.so.0 (0x0000ffff94f70000)
    libMicrosoft.CognitiveServices.Speech.core.so => /root/.local/share/virtualenvs/./AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so (0x0000ffff94a70000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff94a50000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff94820000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff947f0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff94640000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff9537a000)
    libgmodule-2.0.so.0 => /lib/aarch64-linux-gnu/libgmodule-2.0.so.0 (0x0000ffff94620000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff94580000)
    libunwind.so.8 => /lib/aarch64-linux-gnu/libunwind.so.8 (0x0000ffff94540000)
    libdw.so.1 => /lib/aarch64-linux-gnu/libdw.so.1 (0x0000ffff94480000)
    libffi.so.8 => /lib/aarch64-linux-gnu/libffi.so.8 (0x0000ffff94460000)
    libpcre.so.3 => /lib/aarch64-linux-gnu/libpcre.so.3 (0x0000ffff943e0000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff943c0000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffff943a0000)
    liblzma.so.5 => /lib/aarch64-linux-gnu/liblzma.so.5 (0x0000ffff94360000)
    libelf.so.1 => /lib/aarch64-linux-gnu/libelf.so.1 (0x0000ffff94330000)
    libz.so.1 => /lib/aarch64-linux-gnu/libz.so.1 (0x0000ffff94300000)
    libbz2.so.1.0 => /lib/aarch64-linux-gnu/libbz2.so.1.0 (0x0000ffff942d0000)
    linux-vdso.so.1 (0x0000ffffa2f47000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffffa2c70000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffffa2bd0000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffffa2ba0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffa29f0000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffffa2f0e000)
    libMicrosoft.CognitiveServices.Speech.core.so => /root/.local/share/virtualenvs/./AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so (0x0000ffffa24f0000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffffa24d0000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffffa24b0000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffffa2490000)
    linux-vdso.so.1 (0x0000ffff8141c000)
    libMicrosoft.CognitiveServices.Speech.core.so => /root/.local/share/virtualenvs/./AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so (0x0000ffff80e80000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffff80e50000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff80c20000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff80bf0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff80a40000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff813e3000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff80a20000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff80a00000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff80960000)
    linux-vdso.so.1 (0x0000ffff9231b000)
    libasound.so.2 => /lib/aarch64-linux-gnu/libasound.so.2 (0x0000ffff92190000)
    libMicrosoft.CognitiveServices.Speech.core.so => /root/.local/share/virtualenvs/./AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so (0x0000ffff91c90000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000ffff91c70000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff91c50000)
    libstdc++.so.6 => /lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000ffff91a20000)
    libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000ffff919f0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff91840000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff922e2000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000ffff917a0000)
    libuuid.so.1 => /lib/aarch64-linux-gnu/libuuid.so.1 (0x0000ffff91780000)

dzianisv commented 1 year ago

Looks like OpenSSL is dynamically loaded. This helped:

wget "http://ports.ubuntu.com/ubuntu-ports/pool/main/o/openssl/libssl1.1_1.1.1-1ubuntu2.1~18.04.23_arm64.deb" 
dpkg -i ./libssl1.1_1.1.1-1ubuntu2.1~18.04.23_arm64.deb

Thanks

yulin-li commented 1 year ago

Yes, it's dynamically loaded. Glad to see the issue resolved.
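This is why ldd looked clean: it only lists load-time dependencies, while a library opened at run time never appears there. A rough way to test for such a library is to ask the dynamic loader directly through ctypes. The sketch below assumes the SDK looks for the standard OpenSSL 1.1 sonames `libssl.so.1.1` and `libcrypto.so.1.1`; the thread does not confirm the exact names it opens, and `can_load` and `report` are hypothetical helpers.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


def report(sonames=("libssl.so.1.1", "libcrypto.so.1.1", "libssl.so.3")) -> dict:
    """Map each soname to whether it can be loaded on this machine."""
    return {name: can_load(name) for name in sonames}


for name, ok in report().items():
    print(f"{name}: {'loadable' if ok else 'NOT loadable'}")
```

On a stock Ubuntu 22.04 image one would expect only `libssl.so.3` to be loadable until the libssl1.1 package is installed.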

dzianisv commented 1 year ago

New issue

  File "/opt/AssistantPlato/src/main.py", line 211, in listen_for_activation_keyword
    keyword_recognizer = speechsdk.KeywordRecognizer()
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/speech.py", line 2737, in __init__
    _call_hr_fn(fn=_sdk_lib.recognizer_create_keyword_recognizer_from_audio_config, *[ctypes.byref(handle), audio_handle])
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code:
[CALL STACK BEGIN]

/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.extension.audio.sys.so(+0xea74) [0xffff870fea74]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff87ca9288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xf8f80) [0xffff87bc8f80]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1881bc) [0xffff87c581bc]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xeb7a8) [0xffff87bbb7a8]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1d9288) [0xffff87ca9288]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xf8f80) [0xffff87bc8f80]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x185b30) [0xffff87c55b30]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x189128) [0xffff87c59128]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x12a4e0) [0xffff87bfa4e0]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x12a4e0) [0xffff87bfa4e0]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x161c30) [0xffff87c31c30]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x14795c) [0xffff87c1795c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1e6da8) [0xffff87cb6da8]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x14262c) [0xffff87c1262c]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x204e58) [0xffff87cd4e58]
/root/.local/share/virtualenvs/AssistantPlato-qjxu2X5g/lib/python3.10/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(recognizer_create_keyword_recognizer_from_audio_config+0xe8) [0xffff87b8a9f4]
[CALL STACK END]

**** List of CAPTURE Hardware Devices ****
card 0: rockchipes8316c [rockchip-es8316c], device 0: ff880000.i2s-ES8316 HiFi ES8316 HiFi-0 [ff880000.i2s-ES8316 HiFi ES8316 HiFi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: hdmisound [hdmi-sound], device 0: ff8a0000.i2s-i2s-hifi i2s-hifi-0 [ff8a0000.i2s-i2s-hifi i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

yulin-li commented 1 year ago

@dargilco could you (or ask the DRI) help to check the KWS issue? Thanks!

dzianisv commented 1 year ago

Any updates here?

pankopon commented 1 year ago

@dzianisv Please try the following code; I verified that it works on a Raspberry Pi 4 with this setup:

$ lsb_release -d
Description:    Ubuntu 22.04.2 LTS

$ uname -a
Linux ubuntu 5.15.0-1032-raspi #35-Ubuntu SMP PREEMPT Wed Jun 7 16:00:54 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 2: USB [Jabra UC VOICE 750 MS USB], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

$ python3 --version
Python 3.10.6

$ pip3 list | grep azure
azure-cognitiveservices-speech 1.29.0

Code:

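import azure.cognitiveservices.speech as speechsdk

# Path to the keyword model file (the one used is linked below the code).
model_file = "kws.table"

recognizer = speechsdk.KeywordRecognizer()
model = speechsdk.KeywordRecognitionModel(model_file)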

result_future = recognizer.recognize_once_async(model)
print("Say something starting with the keyword...")
result = result_future.get()

if result.reason == speechsdk.ResultReason.RecognizedKeyword:
    print("Recognized keyword: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

stop_future = recognizer.stop_recognition_async()
print("Stopping...")
stopped = stop_future.get()

The model file used was https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/quickstart/csharp/uwp/keyword-recognizer/helloworld/Keyword/kws.table

pankopon commented 1 year ago

Closed as resolved. Please open a new issue if more support is needed.

dzianisv commented 1 year ago

My OS

Description:    Ubuntu 22.04.2 LTS
Linux orangepi4-lts 6.1.30-rockchip64 #3 SMP PREEMPT Wed May 24 16:32:53 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
**** List of CAPTURE Hardware Devices ****
card 0: rockchipes8316c [rockchip-es8316c], device 0: ff880000.i2s-ES8316 HiFi ES8316 HiFi-0 []
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: hdmisound [hdmi-sound], device 0: ff8a0000.i2s-i2s-hifi i2s-hifi-0 [ff8a0000.i2s-i2s-hifi i2s-hifi-0]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
Python 3.10.6

The demo.py I used:

#!/usr/bin/env python3
import azure.cognitiveservices.speech as speechsdk

recognizer = speechsdk.KeywordRecognizer()
model_file="kws.table"
model = speechsdk.KeywordRecognitionModel(model_file)

result_future = recognizer.recognize_once_async(model)
print("Say something starting with the keyword...")
result = result_future.get()

if result.reason == speechsdk.ResultReason.RecognizedKeyword:
    print("Recognized keyword: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

stop_future = recognizer.stop_recognition_async()
print("Stopping...")
stopped = stop_future.get()

It crashed:

$ python3 demo.py 

Say something starting with the keyword...
Segmentation fault
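A bare "Segmentation fault" gives no hint about which Python call was active. Python's standard faulthandler module can print the Python-level traceback when the process receives a fatal signal. The sketch below would go at the very top of demo.py; `enable_crash_tracebacks` is a hypothetical helper name, and the output only shows the Python side of the stack, not the native frames inside the SDK.

```python
import faulthandler


def enable_crash_tracebacks() -> bool:
    """Ask Python to dump tracebacks of all threads on fatal signals such as SIGSEGV."""
    try:
        faulthandler.enable(all_threads=True)
    except (AttributeError, ValueError, OSError, RuntimeError):
        # sys.stderr may be missing or have no usable file descriptor.
        return False
    return faulthandler.is_enabled()


if enable_crash_tracebacks():
    print("faulthandler enabled; a crash will print the active Python frames to stderr")

# ... the rest of demo.py would follow unchanged ...
```

The same effect is available without editing the script by running `python3 -X faulthandler demo.py`.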

bobvanderlinden commented 1 year ago

Looks like OpenSSL is dynamically loaded. This helped:

wget "http://ports.ubuntu.com/ubuntu-ports/pool/main/o/openssl/libssl1.1_1.1.1-1ubuntu2.1~18.04.23_arm64.deb" 
dpkg -i ./libssl1.1_1.1.1-1ubuntu2.1~18.04.23_arm64.deb

Thanks for the hint! It would be nice if this were documented somewhere, especially since this needs OpenSSL 1.1, whereas OpenSSL 3 doesn't seem to work. OpenSSL 1.1 is EOL:

OpenSSL 1.1.1 was released on 11th September 2018, and so it will be considered EOL on 11th September 2023. It will no longer be receiving publicly available security fixes after that date.

How/where are the libMicrosoft* libraries built? Can they be built against OpenSSL 3? Is the source available somewhere to make the necessary patches?

Raciel-c commented 8 months ago

Speech synthesis canceled: Connection failed (no connection to the remote host). Internal error: 1. Error details: Failed with error: WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED wss://eastasia.tts.speech.microsoft.com/cognitiveservices/websocket/v1

Our Alibaba Cloud servers can ping eastasia.tts.speech.microsoft.com, so why does the SDK report this error?
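A successful ping only shows that ICMP packets get through, while a wss:// endpoint needs an outbound TCP connection on port 443 followed by a TLS handshake. The sketch below probes exactly that and nothing more; it does not reproduce the SDK's own WebSocket or authentication logic, and `probe_tls` is a hypothetical helper name.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0):
    """Try a TCP connection plus TLS handshake; return (ok, detail)."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version()
    except OSError as exc:
        # DNS failures, timeouts, refused connections and TLS errors
        # are all subclasses of OSError.
        return False, f"{type(exc).__name__}: {exc}"


# Example (performs a real network connection when uncommented):
# print(probe_tls("eastasia.tts.speech.microsoft.com"))
```

If the probe fails from the server but succeeds from another network, a firewall or proxy between the server and the endpoint is a more likely cause than the SDK itself.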

Baklap4 commented 7 months ago

Linking issue: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2048 since all other issues are closed.