Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

SPXERR_FILE_OPEN_FAILED on Linux Azure Web App #2391

Open fschopper opened 1 month ago

fschopper commented 1 month ago

Describe the bug

We are deploying a Python-based Azure Web App running on Linux. One of our modules, which performs speech-to-text using the Azure Speech SDK, fails after deployment. Running the code locally works perfectly fine, but the deployed app raises the following error:

Traceback (most recent call last):
  File "/home/site/tmp/debugging/speech.py", line 124, in <module>
    t.run(file_name="0C_whatstheweatherlike.wav", container_name=None, blob_name=None, format='WAV')
  File "/home/site/tmp/debugging/speech.py", line 117, in run
    self.offline(file_name=file_name, container_name=container_name, blob_name=blob_name, format=format)
  File "/home/site/tmp/debugging/speech.py", line 101, in offline
    self.prepare_offline_transcription(file_name, container_name, blob_name, format)
  File "/home/site/tmp/debugging/speech.py", line 90, in prepare_offline_transcription
    self.speech_recognizer = SpeechRecognizer(speech_config=self.speech_config, audio_config=self.audio_config)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/speech.py", line 1007, in __init__
    _call_hr_fn(
  File "/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 62, in _call_hr_fn
    _raise_if_failed(hr)
  File "/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 55, in _raise_if_failed
    __try_get_error(_spx_handle(hr))
  File "/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/interop.py", line 50, in __try_get_error
    raise RuntimeError(message)
RuntimeError: Exception with error code: 
[CALL STACK BEGIN]

/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0xdb681) [0x7d3eb0f79681]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(diagnostics_log_apply_properties+0xc8) [0x7d3eb0f45238]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1023e3) [0x7d3eb0fa03e3]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x142b1e) [0x7d3eb0fe0b1e]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x143154) [0x7d3eb0fe1154]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x127934) [0x7d3eb0fc5934]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1eaca5) [0x7d3eb1088ca5]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x140fc1) [0x7d3eb0fdefc1]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x1fe98d) [0x7d3eb109c98d]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x14146f) [0x7d3eb0fdf46f]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(+0x21c1ac) [0x7d3eb10ba1ac]
/opt/python/3/lib/python3.12/site-packages/azure/cognitiveservices/speech/libMicrosoft.CognitiveServices.Speech.core.so(recognizer_create_speech_recognizer_from_config+0x10b) [0x7d3eb0f5e775]
/usr/lib/x86_64-linux-gnu/libffi.so.7(+0x6d1d) [0x7d3eb16eed1d]
/usr/lib/x86_64-linux-gnu/libffi.so.7(+0x6289) [0x7d3eb16ee289]
/opt/python/3/lib/python3.12/lib-dynload/_ctypes.cpython-312-x86_64-linux-gnu.so(+0x1315e) [0x7d3eb170715e]
/opt/python/3/lib/python3.12/lib-dynload/_ctypes.cpython-312-x86_64-linux-gnu.so(+0x1298c) [0x7d3eb170698c]
/opt/python/3.12.2/lib/libpython3.12.so.1.0(_PyObject_Call+0x8f) [0x7d3eb23b5e7f]
[CALL STACK END]

We deploy the app using ZIP deploy via an Azure Pipeline. As far as I understand, this makes the files within wwwroot read-only, so our data sits in /home/site/tmp for temporary storage rather than in /home/site/wwwroot. We have debugged the code as far as we can to confirm that the file exists and has the necessary permissions. We have also rearranged the folder structure and tried various speech_config combinations of key/host/endpoint/region to rule out a connection problem, so far without success.

At this point, we believe the deployment on Linux may be the issue. Any ideas on what could be going wrong here? I would appreciate any help or pointers!
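The file checks described above can be made explicit with a small diagnostic helper. This is a minimal sketch using only the standard library, not part of the Speech SDK; `diagnose_audio_path` is a hypothetical name. It is based on the observation that the traceback passes a bare file name (`0C_whatstheweatherlike.wav`), so the path is resolved against the process's current working directory, which may differ between a local run and the deployed app.

```python
import os
from pathlib import Path


def diagnose_audio_path(file_name: str) -> dict:
    """Collect facts about a path as seen by the current process."""
    path = Path(file_name)
    # A relative name is resolved against the current working directory.
    resolved = path if path.is_absolute() else Path.cwd() / path
    info = {
        "given": file_name,
        "cwd": str(Path.cwd()),
        "resolved": str(resolved),
        "exists": resolved.is_file(),
        "readable": os.access(resolved, os.R_OK),
        "size_bytes": None,
        "open_error": None,
    }
    try:
        # Actually opening the file is the most direct check.
        with open(resolved, "rb") as handle:
            handle.read(1)
        info["size_bytes"] = resolved.stat().st_size
    except OSError as exc:
        info["open_error"] = f"{type(exc).__name__}: {exc}"
    return info
```

Logging the returned dictionary right before `AudioConfig(filename=...)` is created, and passing `info["resolved"]` instead of the bare name, would show whether the deployed process sees the same file as the local one. The native call stack also lists a frame named `diagnostics_log_apply_properties`; the report does not say whether a log file is configured, so this sketch only inspects the audio path.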

To Reproduce

Steps to reproduce the behavior:

Code

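    import time

    from azure.cognitiveservices.speech import SpeechConfig, SpeechRecognizer
    from azure.cognitiveservices.speech.audio import AudioConfig

    # Methods of the transcription class (class definition not shown in the report).
    def prepare_offline_transcription(self, file_name: str = None, container_name: str = None,
                                      blob_name: str = None, format: str = None):
        # Only file_name is used here; the other parameters match the call in offline().
        self.speech_config = SpeechConfig(subscription=self.speech_key, endpoint=self.speech_endpoint,
                                          speech_recognition_language=self.language)
        self.audio_config = AudioConfig(filename=file_name)
        self.speech_recognizer = SpeechRecognizer(speech_config=self.speech_config, audio_config=self.audio_config)

    def offline(self, file_name: str = None, container_name: str = None, blob_name: str = None, format: str = None):
        self.prepare_offline_transcription(file_name, container_name, blob_name, format)
        self.speech_recognizer.start_continuous_recognition()
        # Poll until transcription_done is set elsewhere (not shown in this snippet).
        while not self.transcription_done:
            time.sleep(.5)
        self.speech_recognizer.stop_continuous_recognition()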
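The loop above polls `transcription_done`, but the snippet does not show where that flag is set. A minimal sketch of one way to wait for completion, assuming the recognizer exposes `session_stopped` and `canceled` signals with a `connect` method, as the Speech SDK recognizers do; `run_until_stopped` and `timeout_s` are hypothetical names, and this is not the code from the report.

```python
import threading


def run_until_stopped(recognizer, timeout_s: float = 300.0) -> bool:
    """Run continuous recognition until the session ends or the timeout expires.

    Returns True if a stop or cancel event arrived, False on timeout.
    """
    done = threading.Event()

    def on_finished(evt):
        # Invoked by the recognizer when the session stops or is canceled;
        # it only signals the waiting thread.
        done.set()

    recognizer.session_stopped.connect(on_finished)
    recognizer.canceled.connect(on_finished)

    recognizer.start_continuous_recognition()
    try:
        # Block without busy-waiting; the timeout avoids hanging forever
        # if neither event is ever raised.
        return done.wait(timeout=timeout_s)
    finally:
        recognizer.stop_continuous_recognition()
```

Calling `run_until_stopped(self.speech_recognizer)` would replace the `while` loop and the explicit stop call. It does not address the file-open error itself, which is raised earlier, when the recognizer is constructed.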

Expected behavior

The code should run as it does locally, since the code, environment variables and input files are identical.

Version of the Cognitive Services Speech SDK

Platform, Operating System, and Programming Language

github-actions[bot] commented 2 weeks ago

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.