Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.
https://pypi.python.org/pypi/SpeechRecognition/
BSD 3-Clause "New" or "Revised" License

microphone_recognition.py: unable to open slave, Unknown PCM cards.pcm.rear #191

Closed zhihongzeng closed 7 years ago

zhihongzeng commented 7 years ago

Working environment (1/1/2017):

ubuntu 16.04 LTS
python 2.7.12
speech_recognition 3.5.0
pyaudio 0.2.9

The built-in microphone in the laptop is responsive to sound.

Code:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

Outputs:

ALSA lib pcm_dsnoop.c:606:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1029:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_dmix.c:1029:(snd_pcm_dmix_open) unable to open slave
Say something!

The program does not respond to speech.

Any suggestions would be appreciated.

zhihongzeng commented 7 years ago

Here is an additional test. I ran microphone_recognition.py, and only Google Speech Recognition worked. The output:

$ python speechrec.py
ALSA lib pcm_dsnoop.c:606:(snd_pcm_dsnoop_open) unable to open slave
ALSA lib pcm_dmix.c:1029:(snd_pcm_dmix_open) unable to open slave
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2266:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
ALSA lib pcm_dmix.c:1029:(snd_pcm_dmix_open) unable to open slave
Say something!

Sphinx error; missing PocketSphinx module: ensure that PocketSphinx is set up correctly.
Google Speech Recognition thinks you said once upon a time there was a princess baby in the car so one day I think about come to the castle in the north the door knock knock knock
Could not request results from Wit.ai service; recognition request failed: Bad Request
Could not request results from Microsoft Bing Voice Recognition service; recognition request failed: Access Denied
Traceback (most recent call last):
  File "speechrec.py", line 54, in <module>
    print("Houndify thinks you said " + r.recognize_houndify(audio, client_id=HOUNDIFY_CLIENT_ID, client_key=HOUNDIFY_CLIENT_KEY))
  File "/usr/local/lib/python2.7/dist-packages/speech_recognition/__init__.py", line 871, in recognize_houndify
    base64.urlsafe_b64decode(client_key),
  File "/usr/lib/python2.7/base64.py", line 119, in urlsafe_b64decode
    return b64decode(s.translate(_urlsafe_decode_translation))
  File "/usr/lib/python2.7/base64.py", line 78, in b64decode
    raise TypeError(msg)
TypeError: Incorrect padding

Uberi commented 7 years ago

Hi @zhihongzeng,

For microphone_recognition.py, you need to specify API keys for all the APIs except for Google Speech.

See the library reference for more information.

I also recommend starting with the README - it contains information about how to make the others work properly.

microphone_recognition.py is only intended as a demonstration; for your own program, you would only be using one recognition backend. I recommend recognize_google if you're experimenting with this sort of thing, since it's fast and easy to get started with.
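
The Houndify traceback earlier in the thread fits this explanation: base64.urlsafe_b64decode rejects a client key that is not valid Base64, which is what an unfilled placeholder string would be. Below is a minimal sketch of a pre-flight check, assuming Python 3 (where the failure surfaces as binascii.Error rather than the Python 2 TypeError shown above). The helper name looks_like_base64_key and the placeholder text are hypothetical, not part of the library.

```python
import base64
import binascii


def looks_like_base64_key(key):
    """Return True if `key` decodes as URL-safe Base64, False otherwise.

    Hypothetical helper: it only catches malformed keys (for example a
    placeholder left in the script); it cannot tell whether a key is valid
    for the service.
    """
    try:
        base64.urlsafe_b64decode(key)
    except (binascii.Error, ValueError, TypeError):
        return False
    return True


# Assumed placeholder text, for illustration only.
PLACEHOLDER_KEY = "INSERT HOUNDIFY CLIENT KEY HERE"

if not looks_like_base64_key(PLACEHOLDER_KEY):
    print("Client key is not valid Base64; skipping the Houndify request.")
```

A check like this lets a demo script skip a backend with an unset key instead of ending in a traceback.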

dem23step commented 7 years ago

I know it's a bit late, but I solved that problem by calibrating the energy threshold:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
  r.adjust_for_ambient_noise(source)  # here
  print("Say something!")
  audio = r.listen(source)
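
To make "calibrating the energy threshold" more concrete, here is a rough, stdlib-only sketch of the idea: measure the loudness (RMS) of ambient audio chunks and move the threshold toward a multiple of it. It illustrates the concept and is not the library's implementation. The function names, the starting value, the ratio, the damping factor, and the chunk duration are all illustrative assumptions, and the input is assumed to be little-endian signed 16-bit mono PCM.

```python
import math
import struct


def rms_energy(pcm_bytes):
    """Root-mean-square amplitude of little-endian signed 16-bit mono PCM."""
    count = len(pcm_bytes) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, pcm_bytes[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def calibrate_threshold(chunks, start=300.0, ratio=1.5, damping=0.15,
                        seconds_per_chunk=1024 / 16000):
    """Move a threshold toward `ratio` times the ambient energy, chunk by chunk.

    All default values are assumptions chosen for illustration.
    """
    threshold = start
    # Weight kept on the old threshold for each chunk of audio.
    decay = damping ** seconds_per_chunk
    for chunk in chunks:
        target = rms_energy(chunk) * ratio
        threshold = threshold * decay + target * (1 - decay)
    return threshold


# Synthetic "quiet room" and "noisy room" chunks, for illustration only.
quiet = struct.pack("<4h", 10, -10, 10, -10)
noisy = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(calibrate_threshold([quiet] * 50) < calibrate_threshold([noisy] * 50))
```

With a scheme like this, a noisier room produces a higher threshold, so background noise is less likely to be mistaken for the start of speech.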

kunalchikte commented 5 years ago

I know it's a bit late, but I solved that problem by calibrating the energy threshold:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
  r.adjust_for_ambient_noise(source)  # here
  print("Say something!")
  audio = r.listen(source)

I made these changes, but the problem is still the same.

huogerac commented 3 years ago

Here is what worked for me:

0) I needed to install a few libraries in the OS (Linux Mint) before I could run pip install SpeechRecognition. See the end of this comment.

1) I ran python3 -m speech_recognition, which printed: Set minimum energy threshold to 532.763972486183

2) I set the threshold in my code before the r.listen() call, based on the output above:

    r.energy_threshold = 533
    r.dynamic_energy_threshold = True

3) I tried each device_index from 0 to 7 in the microphone test, and it worked with 7.

4) The ALSA lib pcm_... warnings did not go away, but they do not break the script. I still see the warnings, and it works anyway.

5) I needed to adjust energy_threshold depending on the background noise; otherwise it keeps capturing data from the microphone without printing any output. In other words, I used 533 late at night and needed 1932 during the day.

Here is my complete code:

#!/usr/bin/env python3

import speech_recognition as sr

# for index, name in enumerate(sr.Microphone.list_microphone_names()):
#     print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))

r = sr.Recognizer()
with sr.Microphone(device_index=7) as source:

    r.adjust_for_ambient_noise(source)
    r.energy_threshold = 1932
    r.dynamic_energy_threshold = True
    r.pause_threshold = 1.2

    print("Say something!")
    audio = r.listen(source)

    try:
        text = r.recognize_google(audio, language="pt-br")
        print(text)
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Could not request results {0}".format(e))
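
Instead of trying each device_index by hand, the list returned by sr.Microphone.list_microphone_names() (the call in the commented-out lines above) can be searched by name. This is a minimal sketch; the helper names find_device_index and open_microphone and the sample device names are hypothetical, and the right keyword depends on the machine.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains `keyword`.

    The match is case-insensitive. Returns None when nothing matches.
    """
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if name is not None and keyword in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Open the first microphone whose name contains `keyword`."""
    # Imported lazily so find_device_index stays usable without the library.
    import speech_recognition as sr

    index = find_device_index(sr.Microphone.list_microphone_names(), keyword)
    # device_index=None asks the library for the default input device.
    return sr.Microphone(device_index=index)


# Made-up device names, for illustration only.
sample_names = ["HDA Intel PCH: Analog (hw:0,0)", "sysdefault", "pulse", "default"]
print(find_device_index(sample_names, "pulse"))
```

Selecting by name avoids hard-coding an index such as 7, which may change when audio devices are added or removed.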

I'm not sure all of this is necessary, but this is what I ran on Linux Mint 20 to be able to install speech_recognition:

sudo apt-get install python3-pyaudio
sudo apt-get install libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0
sudo apt-get install ffmpeg libav-tools
pip install pyaudio

Links:
https://github.com/Uberi/speech_recognition/issues/100
https://github.com/Uberi/speech_recognition#pyaudio-for-microphone-users

bashirii commented 4 months ago

I know it's a bit late, but I solved that problem by calibrating the energy threshold:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
  r.adjust_for_ambient_noise(source)  # here
  print("Say something!")
  audio = r.listen(source)

Thanks. The program is able to recognize and print the exact user query, though the ALSA errors are still there.
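
Since several people report that recognition works while the ALSA messages keep appearing, one option is to hide them. The sketch below temporarily points file descriptor 2 (stderr) at os.devnull. It assumes the messages are written to stderr by the native audio libraries while the microphone is being opened. The name silence_stderr is hypothetical, and the approach hides every stderr message inside the block, including genuine errors.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def silence_stderr():
    """Temporarily redirect file descriptor 2 (stderr) to os.devnull."""
    sys.stderr.flush()
    saved_fd = os.dup(2)  # keep a copy of the real stderr
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 2)
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved_fd, 2)  # restore the real stderr
        os.close(devnull_fd)
        os.close(saved_fd)


# Intended use (not executed here, since it needs a microphone):
#
#     import speech_recognition as sr
#     r = sr.Recognizer()
#     with silence_stderr():
#         source = sr.Microphone()
#         source.__enter__()
#     try:
#         audio = r.listen(source)
#     finally:
#         source.__exit__(None, None, None)
```

Keeping the silenced region as small as possible limits the risk of hiding a real error message.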