Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.
https://pypi.python.org/pypi/SpeechRecognition/
BSD 3-Clause "New" or "Revised" License

Pocketsphinx. Low productivity compared to pocketsphinx_continuous. #208

Open jumper047 opened 7 years ago

jumper047 commented 7 years ago

Hello. I need quick and easy offline speech recognition on an Orange Pi PC (a single-board computer, comparable in power to a Raspberry Pi 2/3). With pocketsphinx_continuous, a phrase is recognized in 3-5 seconds; with .recognize_sphinx it takes 20-25 seconds. Did I do something wrong? Is it possible to get results comparable to pocketsphinx_continuous? Unfortunately, I'm not very good at programming and can't figure out what is going on under the hood :(.

Uberi commented 7 years ago

Hi @jumper047,

Can you post your code, as well as the system information asked for in the issue template?

jumper047 commented 7 years ago

OK, I'll do it this week. If I remember correctly, I just took the code from your example (background_listening.py) and changed Google to Sphinx. But I think I understand what's going on: PocketSphinx is initialized every time I call .recognize_sphinx. Maybe I'm doing it the wrong way, and there is a method to initialize Sphinx once?
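If the repeated setup really is the cost, the usual remedy is to build the decoder once and hand the same object to every recognition call. The following is a minimal sketch of that pattern, not something SpeechRecognition or PocketSphinx provides: `LazyDecoder` and `factory` are hypothetical names, and the factory stands in for whatever function constructs the real decoder.

```python
import threading


class LazyDecoder:
    """Builds an expensive decoder object once and reuses it afterwards."""

    def __init__(self, factory):
        # `factory` is any zero-argument callable that returns a decoder,
        # for example a function that builds a PocketSphinx decoder (assumption).
        self._factory = factory
        self._decoder = None
        self._lock = threading.Lock()

    def get(self):
        # Create the decoder on first use only; later calls return the same object.
        with self._lock:
            if self._decoder is None:
                self._decoder = self._factory()
            return self._decoder


if __name__ == "__main__":
    calls = []

    def fake_factory():
        # Stand-in for the slow model-loading step.
        calls.append(1)
        return object()

    lazy = LazyDecoder(fake_factory)
    first = lazy.get()
    second = lazy.get()
    print(first is second, len(calls))  # prints: True 1
```

The lock only matters if recognition runs from a background thread, as it would with a background listener.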

jumper047 commented 7 years ago

Steps to reproduce
------------------

1. Launch this code:

        import speech_recognition as sr

        # obtain audio from the microphone
        r = sr.Recognizer()
        while True:
            with sr.Microphone() as source:
                print("Say something!")
                audio = r.listen(source)

            # recognize speech using Sphinx
            try:
                print("Sphinx thinks you said " + r.recognize_sphinx(audio))
            except sr.UnknownValueError:
                print("Sphinx could not understand audio")
            except sr.RequestError as e:
                print("Sphinx error; {0}".format(e))

   Every time, the recognition result appears after 20-25 seconds.

2. Now try:

        pocketsphinx_continuous -inmic yes

   The result appears in 3-5 seconds.
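To see where the 20-25 seconds go, it can help to time the individual calls rather than the whole loop. A small sketch using only the standard library follows; `timed` is a hypothetical helper name, and the workload in the demo is a placeholder rather than a real recognition call.

```python
import time


def timed(label, func, *args, **kwargs):
    """Run func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print("{0}: {1:.2f} s".format(label, elapsed))
    return result, elapsed


if __name__ == "__main__":
    # Placeholder workload. In the reproduction script one could instead wrap
    # the recognition call: timed("recognize_sphinx", r.recognize_sphinx, audio)
    total, seconds = timed("sum", sum, range(1000000))
    print(total)
```

Timing `r.listen` and `r.recognize_sphinx` separately would show whether the delay comes from recording or from recognition.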

Expected behaviour
------------------
I want to initialize the PocketSphinx engine once, and after that just recognize phrases.

Actual behaviour
----------------
The PocketSphinx engine is initialized every time recognize_sphinx is called.

System information
------------------

My **system** is Armbian Jessie

My **Python version** is 3.4.2

My **Pip version** is 9.0.1

My **SpeechRecognition library version** is 3.6.0

My **PyAudio library version** is 0.2.10

I **installed PocketSphinx from** source code.

jumper047 commented 7 years ago

The code below works fine for me. I split initialization and recognition into separate functions and copied them into my code.

    # Both methods belong to one class; module-level imports needed: base64, os, pocketsphinx
    def pocketsphinxInit(self):
        """Initialize pocketsphinx stt engine"""
        self.log("Pocketsphinx init")
        language_directory = "/home/hass/sphinx_data/model"
        acoustic_parameters_directory = os.path.join(
            language_directory, "acoustic-model")
        language_model_file = os.path.join(
            language_directory, "language-model.lm.bin")
        phoneme_dictionary_file = os.path.join(
            language_directory, "pronounciation-dictionary.dict")
        config = pocketsphinx.Decoder.default_config()
        config.set_string("-hmm", acoustic_parameters_directory)
        config.set_string("-lm", language_model_file)
        config.set_string("-dict", phoneme_dictionary_file)
        config.set_string("-logfn", os.devnull)
        self.sphinx_decoder = pocketsphinx.Decoder(config)
        self.log("Pocketsphinx init done")

    def listener(self):
        """Speech listening and recognition loop"""
        self.listening = True
        self.log('Listener started')
        while self.listening:
            try:
                with self.mic as source:
                    print("Waiting name")
                    audio = self.rec.listen(source)
                    print("Something recorded")
            except OSError:
                print("Exception caught")
                self.listening = False
                break  # no audio was captured, so stop instead of decoding

            data = audio.get_raw_data(convert_rate=16000, convert_width=2)
            seconds = len(data) / 32000

            # Hotword detection
            self.snowboy_hotword_decoder.Reset()
            yuki = self.snowboy_hotword_decoder.RunDetection(data)
            if yuki > 0 and self.get_state("input_boolean.yuki_listener") == "on":
                print("Yuki recognized")
                self.turn_on("script.low_beep")
            else:
                continue

            # If the sample is shorter than two seconds, try to use snowboy
            if True:  # seconds < 1.9:
                print("Quick actions time!")
                # Quick actions
                self.snowboy_command_decoder.Reset()
                command = self.snowboy_command_decoder.RunDetection(data)
                if command == 2:
                    print("Switching lights")
                    self.turn_on("script.lustre_toggle")
                    continue
                elif command == 1:
                    print("Switching lamp")
                    self.toggle("light.table_lamp")
                    continue

            audio_bytes = base64.b64encode(data)
            audio_string = audio_bytes.decode('utf-8')
            self.fire_event("VOICE_RECORDED", data=audio_string)

            # Recognize extended command with pocketsphinx
            data = data[30000:]  # This works fine only for name "Yuki"
            print("end data conversion, start uttering")
            self.sphinx_decoder.start_utt()
            print("begin decoding")
            self.sphinx_decoder.process_raw(data, False, True)
            print("decoding ended, ending uttering")
            self.sphinx_decoder.end_utt()
            print("uttering ended")
            hypothesis = self.sphinx_decoder.hyp()
            print("Sended to brain")
            if hypothesis is not None:
                print(hypothesis.hypstr)
                self.get_app("brain").query(hypothesis.hypstr)
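The recognition step at the end of the listing can be pulled out into a small function that accepts any already-initialized decoder. This is a rough sketch under stated assumptions: `skip_bytes`, `decode_utterance` and `skip_seconds` are hypothetical names, and the audio is assumed to be mono 16-bit PCM at 16 kHz, as requested by `get_raw_data(convert_rate=16000, convert_width=2)`. Under that assumption, the hard-coded offset of 30000 bytes corresponds to 0.9375 seconds.

```python
def skip_bytes(seconds, sample_rate=16000, sample_width=2):
    """Convert a duration into a byte offset for mono PCM audio."""
    # Round down to a whole sample so the slice never splits a sample.
    return int(seconds * sample_rate) * sample_width


def decode_utterance(decoder, data, skip_seconds=0.0):
    """Decode one complete utterance with an already initialized decoder.

    `decoder` is expected to offer start_utt, process_raw, end_utt and hyp,
    as the decoder object in the listing above does.
    Returns the recognized text, or None if there is no hypothesis.
    """
    data = data[skip_bytes(skip_seconds):]
    decoder.start_utt()
    # Same arguments as in the listing: search enabled, whole utterance at once.
    decoder.process_raw(data, False, True)
    decoder.end_utt()
    hypothesis = decoder.hyp()
    return None if hypothesis is None else hypothesis.hypstr
```

Expressing the offset in seconds rather than bytes makes it easier to adjust for a hotword of a different length.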

embie27 commented 6 years ago

Is this issue going to be fixed at some point? It seems quite important.