Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

Trying to use Python SDK azure-cognitiveservices-speech 1.34.0 with webm container #2551

Closed: neeagl closed this issue 1 month ago

neeagl commented 1 month ago

I installed the latest version of GStreamer on Ubuntu 20.04, but the SDK still reports SPXERR_GSTREAMER_NOT_FOUND_ERROR.

We're using GitHub Actions for deployment.

run: sudo apt-get install -y libgstreamer1.0-0 gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-doc gstreamer1.0-tools

I also tried installing the older GStreamer version 1.14.4, but it is not available on Ubuntu 20.04.

Please suggest a fix or a possible workaround.

pankopon commented 1 month ago

Hi, try the following with the official Ubuntu 20.04 Docker image. First pull the image and start a container, for example:

docker pull ubuntu:20.04
docker run -it --workdir /csspeech --volume "/mnt/e/tmp:/csspeech" ubuntu:20.04 bash

Then run the following commands in the container instance, where '>' denotes a prompt for your input. Replace YOUR_KEY and YOUR_REGION with correct values.

> apt update
> apt install -y ca-certificates git libasound2 libssl1.1 lsb-release python3-pip
> lsb_release -d
Description:    Ubuntu 20.04.6 LTS
> python3 --version
Python 3.8.10
> python3 -m pip install azure-cognitiveservices-speech scipy
...
Successfully installed azure-cognitiveservices-speech-1.40.0 numpy-1.24.4 scipy-1.10.1
> DEBIAN_FRONTEND=noninteractive apt install -y libgstreamer1.0-0 gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly
> dpkg -l | grep gstreamer
ii  gstreamer1.0-plugins-bad:amd64       1.16.3-0ubuntu1.1
ii  gstreamer1.0-plugins-base:amd64      1.16.3-0ubuntu1.3
ii  gstreamer1.0-plugins-good:amd64      1.16.3-0ubuntu1.2
ii  gstreamer1.0-plugins-ugly:amd64      1.16.2-2build1
ii  gstreamer1.0-x:amd64                 1.16.3-0ubuntu1.3
ii  libgstreamer-plugins-bad1.0-0:amd64  1.16.3-0ubuntu1.1
ii  libgstreamer-plugins-base1.0-0:amd64 1.16.3-0ubuntu1.3
ii  libgstreamer-plugins-good1.0-0:amd64 1.16.3-0ubuntu1.2
ii  libgstreamer1.0-0:amd64              1.16.3-0ubuntu1.1
> git clone https://github.com/Azure-Samples/cognitive-services-speech-sdk
> cd cognitive-services-speech-sdk/samples/python/console
> sed -i -e 's/YourSubscriptionKey/YOUR_KEY/g' -e 's/YourServiceRegion/YOUR_REGION/g' speech_sample.py
> ln -s ../../csharp/sharedcontent/console/whatstheweatherlike.mp3 .
> python3 main.py
select sample module, Ctrl-D to abort
0: speech_sample
        Speech recognition samples for the Microsoft Cognitive Services Speech SDK
...
> 0
select sample function, Ctrl-D to abort
...
3: speech_recognize_once_compressed_input
        performs one-shot speech recognition with compressed input from an audio file
...
> 3
You selected: <function speech_recognize_once_compressed_input at 0x7f6a8e3169d0>
Recognized: What's the weather like?

Do you get the same output?
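If the output differs in your deployment environment, a quick check of whether the GStreamer libraries are visible to the Python process may help narrow it down. The following is only a rough sketch: the helper name `check_gstreamer` and the choice of libraries to probe are assumptions for illustration, and the Speech SDK's own lookup logic may differ from what this script tests.

```python
import ctypes.util
import shutil


def check_gstreamer():
    """Report whether common GStreamer pieces are visible to this process.

    Assumption: this only approximates what the Speech SDK needs; the SDK's
    actual lookup logic is not described in this thread.
    """
    return {
        # Core shared library (typically provided by libgstreamer1.0-0)
        "libgstreamer": ctypes.util.find_library("gstreamer-1.0"),
        # App library (typically provided by libgstreamer-plugins-base1.0-0)
        "libgstapp": ctypes.util.find_library("gstapp-1.0"),
        # Command-line inspector from gstreamer1.0-tools (optional)
        "gst-inspect": shutil.which("gst-inspect-1.0"),
    }


if __name__ == "__main__":
    # Print one line per probe; None means the item was not found.
    for name, location in check_gstreamer().items():
        print(f"{name}: {location or 'NOT FOUND'}")
```

Running this in the same job or container that runs the recognition code, rather than on a separate build machine, shows whether the packages installed by apt are actually present where the SDK is loaded.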

neeagl commented 1 month ago

It all worked fine! I realized that the issue was with my deployment. Thanks for the help!