Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

azure-cognitiveservices-speech 1.37.0 won't work for PA task on Mac OSX #2347

Open dyustc opened 2 months ago

dyustc commented 2 months ago


Describe the bug

I am on macOS 14.4.1 (23E224) with an M2 chip, using Python 3.9.18. I installed the latest Azure package, azure-cognitiveservices-speech 1.37.0. I am using the sample code copied from Azure Speech Studio, changing only the subscription key and region to my own.

It does not work: it crashes at line 64. I have made sure that the WAV file and the text are correct.

Screenshot 2024-04-19 16 55 49

Screenshot 2024-04-19 16 48 02
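For reference, the environment details given above (Python version, CPU architecture, macOS version, SDK version) can be collected with a short script and pasted into a report. This is only a sketch and not part of the original report; the function name `environment_report` is made up for illustration, and it uses only the Python standard library.

```python
import platform
from importlib import metadata


def environment_report(package="azure-cognitiveservices-speech"):
    """Collect basic facts about the interpreter, the machine, and the installed SDK."""
    try:
        sdk_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        sdk_version = "not installed"
    return {
        "python": platform.python_version(),
        # Architecture the interpreter runs as, e.g. arm64 or x86_64
        "machine": platform.machine(),
        # mac_ver() returns an empty version string on non-macOS systems
        "macos": platform.mac_ver()[0] or "n/a",
        "sdk": sdk_version,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Note that `machine` reflects how the Python interpreter itself is running, which may differ from the hardware if Python runs under Rosetta.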


Additional context

Here is the error message:

~/work/ramp/CTC-Attention-Mispronunciation/egs/qa (41) » python microsoft_pronunciation_assessment.py    daiyi@bogon
Traceback (most recent call last):
  File "/Users/daiyi/work/ramp/CTC-Attention-Mispronunciation/egs/qa/microsoft_pronunciation_assessment.py", line 178, in <module>
    pronunciation_assessment_continuous_from_file(audio, txt)
  File "/Users/daiyi/work/ramp/CTC-Attention-Mispronunciation/egs/qa/microsoft_pronunciation_assessment.py", line 62, in pronunciation_assessment_continuous_from_file
    pronunciation_config.enable_prosody_assessment()
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/speech.py", line 3077, in enable_prosody_assessment
    self.__properties.set_property(PropertyId.PronunciationAssessment_EnableProsodyAssessment, "true")
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/properties.py", line 29, in set_property
    _call_hr_fn(fn=_sdk_lib.property_bag_set_string, *[self._handle, ctypes.c_int(property_id.value), None, c_value])
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code:
[CALL STACK BEGIN]

3 libMicrosoft.CognitiveServices.Spee 0x0000000101c9cb34 property_bag_set_string + 408
4 libffi.8.dylib 0x000000010159804c ffi_call_SYSV + 76
5 libffi.8.dylib 0x000000010159574c ffi_call_int + 1208
6 _ctypes.cpython-39-darwin.so 0x00000001015785a0 _ctypes_callproc + 1260
7 _ctypes.cpython-39-darwin.so 0x00000001015729c8 PyCFuncPtr_call + 1148
8 python3.9 0x0000000100ecc1d0 _PyObject_Call + 164
9 python3.9 0x0000000100fb9ea4 _PyEval_EvalFrameDefault + 27244
10 python3.9 0x0000000100fb2de0 _PyEval_EvalCode + 2908
11 python3.9 0x0000000100ecc3e4 _PyFunction_Vectorcall + 220
12 python3.9 0x0000000100ecbff0 PyVectorcall_Call + 156
13 python3.9 0x0000000100fb9ea4 _PyEval_EvalFrameDefault + 27244
14 python3.9 0x0000000100ecc4a0 function_code_fastcall + 116
15 python3.9 0x0000000100fbd480 call_function + 516
16 python3.9 0x0000000100fb9b5c _PyEval_EvalFrameDefault + 26404
17 python3.9 0x0000000100ecc4a0 function_code_fastcall + 116
18 python3.9 0x0000000100fbd480 call_function + 516
19 python3.9 0x0000000100fb9b5c _PyEval_EvalFrameDefault + 26404
[CALL STACK END]

Exception with an error code: 0x5 (SPXERR_INVALID_ARG)
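Since the traceback ends inside `enable_prosody_assessment()`, the failure can probably be isolated without a subscription key or an audio file. The following is a minimal sketch under that assumption, not the sample from Speech Studio. The helper name `reproduce` and the reference text are made up for illustration; the SDK module is passed in as a parameter so the helper itself has no import-time dependency.

```python
def reproduce(sdk, reference_text="hello world"):
    """Build a pronunciation assessment config and try to enable prosody on it.

    `sdk` is expected to be the azure.cognitiveservices.speech module.
    Returns "ok" on success, or "failed: <last line of the error>" otherwise.
    """
    config = sdk.PronunciationAssessmentConfig(reference_text=reference_text)
    try:
        config.enable_prosody_assessment()
    except RuntimeError as exc:
        lines = str(exc).strip().splitlines()
        return "failed: " + (lines[-1] if lines else repr(exc))
    return "ok"


# Usage on the affected machine (requires the SDK to be installed):
#   import azure.cognitiveservices.speech as speechsdk
#   print(reproduce(speechsdk))
```

If this short script fails with the same `SPXERR_INVALID_ARG`, the rest of the calling code is unlikely to be involved.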

ralph-msft commented 2 months ago

Please provide the SDK logs: https://docs.microsoft.com/azure/cognitive-services/speech-service/how-to-use-logging

dyustc commented 2 months ago

> Please provide the SDK logs: https://docs.microsoft.com/azure/cognitive-services/speech-service/how-to-use-logging

I added this line, as the logging guide suggests:

speech_config.set_property(speechsdk.PropertyId.Speech_LogFilename, "/Users/daiyi/work/ramp/CTC-Attention-Mispronunciation/egs/qa/azure.log")

But the log is empty. I only get the same error message as in my original post:

Traceback (most recent call last):
  File "/Users/daiyi/work/ramp/CTC-Attention-Mispronunciation/egs/qa/microsoft_pronunciation_assessment.py", line 180, in <module>
    pronunciation_assessment_continuous_from_file(audio, txt)
  File "/Users/daiyi/work/ramp/CTC-Attention-Mispronunciation/egs/qa/microsoft_pronunciation_assessment.py", line 64, in pronunciation_assessment_continuous_from_file
    pronunciation_config.enable_prosody_assessment()
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/speech.py", line 3077, in enable_prosody_assessment
    self.__properties.set_property(PropertyId.PronunciationAssessment_EnableProsodyAssessment, "true")
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/properties.py", line 29, in set_property
    _call_hr_fn(fn=_sdk_lib.property_bag_set_string, *[self._handle, ctypes.c_int(property_id.value), None, c_value])
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/Users/daiyi/miniconda3/envs/py39/lib/python3.9/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code:
[CALL STACK BEGIN]

3 libMicrosoft.CognitiveServices.Spee 0x0000000105d0cb34 property_bag_set_string + 408
4 libffi.8.dylib 0x0000000104c7c04c ffi_call_SYSV + 76
5 libffi.8.dylib 0x0000000104c7974c ffi_call_int + 1208
6 _ctypes.cpython-39-darwin.so 0x0000000104c5c5a0 _ctypes_callproc + 1260
7 _ctypes.cpython-39-darwin.so 0x0000000104c569c8 PyCFuncPtr_call + 1148
8 python3.9 0x00000001045b01d0 _PyObject_Call + 164
9 python3.9 0x000000010469dea4 _PyEval_EvalFrameDefault + 27244
10 python3.9 0x0000000104696de0 _PyEval_EvalCode + 2908
11 python3.9 0x00000001045b03e4 _PyFunction_Vectorcall + 220
12 python3.9 0x00000001045afff0 PyVectorcall_Call + 156
13 python3.9 0x000000010469dea4 _PyEval_EvalFrameDefault + 27244
14 python3.9 0x00000001045b04a0 function_code_fastcall + 116
15 python3.9 0x00000001046a1480 call_function + 516
16 python3.9 0x000000010469db5c _PyEval_EvalFrameDefault + 26404
17 python3.9 0x00000001045b04a0 function_code_fastcall + 116
18 python3.9 0x00000001046a1480 call_function + 516
19 python3.9 0x000000010469db5c _PyEval_EvalFrameDefault + 26404
[CALL STACK END]

Exception with an error code: 0x5 (SPXERR_INVALID_ARG)

dyustc commented 2 months ago

> Please provide the SDK logs: https://docs.microsoft.com/azure/cognitive-services/speech-service/how-to-use-logging

This looks like a problem in the library itself, not in how I call it. It also crashes during Speech SDK setup, before any call to the speech service and possibly before verification as well, which may be why no log is generated.

I am running on an M1 Pro, macOS Sonoma 14.4.1, Python 3.9; the azure-cognitiveservices-speech version is 1.37.0.
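The guess that the crash happens before logging starts is plausible: as far as the logging guide describes it, file logging takes effect once a recognizer is created from the configured `SpeechConfig`, and the traceback shows the failure earlier, while the pronunciation assessment config is still being set up. A small check like the one below can at least distinguish a log file that was never created from one that was created but left empty. It is a sketch using only the standard library, and the helper name `log_status` is made up for illustration.

```python
from pathlib import Path


def log_status(path):
    """Report whether a log file is missing, empty, or has content."""
    log_file = Path(path)
    if not log_file.exists():
        return "missing"
    if log_file.stat().st_size == 0:
        return "empty"
    return "has content"


# Usage after a failed run, with the path passed to Speech_LogFilename:
#   print(log_status("/path/to/azure.log"))
```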

ralph-msft commented 2 months ago

Have you tried using the Python sample code on your machine to rule out any potential issues in your code?

Could you also please check what the architecture of the speech shared library is? You can use e.g.

file libMicrosoft.CognitiveServices.Speech.core.dylib

or

lipo -info libMicrosoft.CognitiveServices.Speech.core.dylib

dyustc commented 2 months ago

> Have you tried using the Python sample code on your machine to rule out any potential issues in your code?
>
> Could you also please check what the architecture of the speech shared library is? You can use e.g.
>
> file libMicrosoft.CognitiveServices.Speech.core.dylib
>
> or
>
> lipo -info libMicrosoft.CognitiveServices.Speech.core.dylib

I ran the sample code you provided, and it hits the same crash I reported. The output of the two commands is:

Mach-O 64-bit dynamically linked shared library arm64

and

Non-fat file: libMicrosoft.CognitiveServices.Speech.core.dylib is architecture: arm64
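The same architecture check can be scripted so that the library is compared directly against the running interpreter. The sketch below parses the two output shapes `lipo -info` prints (single-architecture, as above, and fat/universal). The function names are made up for illustration, the path to the dylib has to be supplied by the caller, and `lipo` is only available on macOS.

```python
import platform
import subprocess


def parse_lipo_info(output):
    """Extract the architecture list from a line of `lipo -info` output."""
    # Both output shapes end with ": <arch> [<arch> ...]"
    _, _, archs = output.strip().rpartition(": ")
    return archs.split()


def library_matches_interpreter(dylib_path):
    """Return True if the dylib contains a slice for the running interpreter."""
    result = subprocess.run(
        ["lipo", "-info", str(dylib_path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return platform.machine() in parse_lipo_info(result.stdout)
```

With the output reported above, `parse_lipo_info` yields `["arm64"]`, so a mismatch would only be expected if Python itself were running as x86_64 under Rosetta.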

dyustc commented 2 months ago

@ralph-msft Hi, is there anything else I could provide? I still have this issue on my Mac.

github-actions[bot] commented 1 month ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.