Kitt-AI / snowboy

Future versions with model training module will be maintained through a forked version here: https://github.com/seasalt-ai/snowboy

IOError: [Errno Invalid sample rate] -9997 SOLVED #65

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hi, I have read all the issues on this, but it still does not work for me. I am using a Raspberry Pi 2 and have tested 4 microphones, but only one works with Snowboy: the PS3 EYE.

The other microphones I tested are:

  1. Samson UB1
  2. Logitech QuickCam Orbit
  3. Cheap USB microphone off Amazon

All of the microphones above record with no errors using REC temp.wav on the command line, and they also work within a Python script using PyAudio open at 16000. I have updated ~/.asoundrc with:

pcm.!default {
    type asym
    playback.pcm {
        type plug
        slave.pcm "hw:0,0"
    }
    capture.pcm {
        type plug
        slave.pcm "hw:1,0"
    }
}

Do you have a newer version that is not yet on GitHub? If so, please leave a link for testing ;-)

Thanks in advance

chenguoguo commented 7 years ago

Hmm, that's very weird; so you were able to use a Python script with PyAudio sampling at 16000, but when you tested Snowboy it gave IOError: [Errno Invalid sample rate] -9997? Snowboy uses PyAudio in the script, and if PyAudio works, it should work. Could you paste your Python script using PyAudio?

ghost commented 7 years ago

Hi, the following code is only part of my script, but it should show how I use PyAudio:

def Loop(self):
        self.detector.terminate()
        self._audio = pyaudio.PyAudio()
        RATE = 16000
        CHUNK = 1024
        LISTEN_TIME = 10
        didDetect = False
        frames = []
        THRESHOLD = self.AmbientLevel(1.8)
        print THRESHOLD
        lastN = [THRESHOLD * 1.2 for i in range(30)]
        print "Ready................."
        subprocess.call(['aplay', '-D', 'hw:0,0', 'media/start.wav'], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stream = self._audio.open(format=pyaudio.paInt16,channels=1,rate=RATE,input=True,frames_per_buffer=CHUNK)
        for i in range(0, RATE / CHUNK * LISTEN_TIME):
            data = stream.read(CHUNK)
            frames.append(data)
            average = sum(lastN) / float(len(lastN))
            volumelevel = self.getVolumeLevel(data)
            lastN.pop(0)
            lastN.append(volumelevel)
            if average < THRESHOLD:
                didDetect = True
                break
        if didDetect:
            with tempfile.SpooledTemporaryFile(mode='w+b') as f:
                wav_fp = wave.open(f, 'wb')
                wav_fp.setnchannels(1)
                wav_fp.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
                wav_fp.setframerate(RATE)
                wav_fp.writeframes(''.join(frames))
                wav_fp.close()
                f.seek(0)
                stream.stop_stream()
                stream.close()
                try:
                    print "heard"
                    subprocess.call(['aplay', '-D', 'hw:0,0', 'media/stop.wav'], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                except:
                    print "Not Heard"
        stream.stop_stream()
        stream.close()
        self.Loop()

chenguoguo commented 7 years ago

So it seems to me that the noticeable difference is the callback function. In Snowboy we use a callback function to record audio; see the following part of the Snowboy script:

        self.stream_in = self.audio.open(
            input=True, output=False,
            format=self.audio.get_format_from_width(
                self.detector.BitsPerSample() / 8),
            channels=self.detector.NumChannels(),
            rate=self.detector.SampleRate(),
            frames_per_buffer=2048,
            stream_callback=audio_callback)

Could you please play with the PyAudio initialization part in our script and see if you can figure out the issue? It has been working for us, so it's a bit difficult for us to debug on our side. The issue is PyAudio related. If you can figure it out, that's gonna be a HUGE help to us!
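One way to narrow this down is to try opening the input stream at several sample rates and record which ones fail. The following is a minimal sketch, not part of Snowboy: `probe_rates` and its `open_stream` argument are hypothetical names, and the stream opener is passed in so the logic can be exercised without audio hardware.

```python
def probe_rates(open_stream, rates=(16000, 44100, 48000)):
    """Try to open an input stream at each rate and record what happens.

    `open_stream` is any callable that takes a sample rate and returns an
    object with a close() method, raising IOError/OSError on failure.
    """
    results = {}
    for rate in rates:
        try:
            stream = open_stream(rate)
        except (IOError, OSError) as exc:
            # Keep the error text so it can be compared between runs.
            results[rate] = "failed: %s" % (exc,)
        else:
            stream.close()
            results[rate] = "ok"
    return results


# Hypothetical usage with PyAudio (needs a real input device):
#   import pyaudio
#   pa = pyaudio.PyAudio()
#   print(probe_rates(lambda r: pa.open(format=pyaudio.paInt16, channels=1,
#                                       rate=r, input=True,
#                                       frames_per_buffer=2048)))
#   pa.terminate()
```

Running the same probe with and without `stream_callback`, and under different users, should show which variable actually triggers the error.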

ghost commented 7 years ago

Hi, I have played with this setting and nothing changes. I added suggestions from other posts around the net and changed the existing settings, and I copied my own self.audio.open call in there, but I still get the same errors.

I am using the 2016-09-23-raspbian-jessie-lite image with everything updated. I will try another Raspbian image to see what happens.

chenguoguo commented 7 years ago

OK thanks a lot @ignitefoundation ! So what really bugs me is that PyAudio works with your script but does not work with the Snowboy script, and the error message clearly comes from PyAudio. So it's more likely to be a PyAudio setting issue. Perhaps by carefully adapting the Snowboy script to your working script, we will make it work eventually, and this may solve the issue that has been around for a while.

ghost commented 7 years ago

Just doing the testing with other images, I will report back soon ;-)

ghost commented 7 years ago

Got it, and you would not believe what it was. There is nothing wrong with your script; it is how we are running the Python script.

Just don't put sudo in front of the command -

Wrong - sudo python demo.py resources/alexa.umdl

Right - python demo.py resources/alexa.umdl

Hope this helps everyone ;-)
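The thread does not say why sudo breaks the stream. One plausible explanation (an assumption, not confirmed here) is that ALSA reads ~/.asoundrc from the home directory of the user the script runs as, so under sudo the plug setup from the regular user's file may not be applied. A minimal sketch, using a hypothetical helper named `asoundrc_status`, that reports which user and which .asoundrc a script would see:

```python
import os


def asoundrc_status(euid=None, home=None):
    """Report whether the script runs as root and whether ~/.asoundrc exists.

    `euid` and `home` default to the current process values; they are
    parameters only so the check can be tried with other values.
    """
    euid = os.geteuid() if euid is None else euid
    home = os.path.expanduser("~") if home is None else home
    path = os.path.join(home, ".asoundrc")
    return {
        "is_root": euid == 0,
        "asoundrc": path,
        "exists": os.path.isfile(path),
    }


if __name__ == "__main__":
    status = asoundrc_status()
    if status["is_root"]:
        print("Running as root; the audio config seen may differ from your user's.")
    print("%(asoundrc)s exists: %(exists)s" % status)
```

Comparing the output of `python` and `sudo python` runs should show whether the two see the same configuration file.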

MAJOR UPDATE------------------------------------------

Along with the above, there appear to be different versions of the decoder script. The working one is below:

#!/usr/bin/env python

import collections
import pyaudio
import snowboydetect
import time
import wave
import os
import logging

logging.basicConfig()
logger = logging.getLogger("snowboy")
logger.setLevel(logging.INFO)
TOP_DIR = os.path.dirname(os.path.abspath(__file__))

RESOURCE_FILE = os.path.join(TOP_DIR, "resources/common.res")
DETECT_DING = os.path.join(TOP_DIR, "resources/ding.wav")
DETECT_DONG = os.path.join(TOP_DIR, "resources/dong.wav")

class RingBuffer(object):
    """Ring buffer to hold audio from PortAudio"""
    def __init__(self, size = 4096):
        self._buf = collections.deque(maxlen=size)

    def extend(self, data):
        """Adds data to the end of buffer"""
        self._buf.extend(data)

    def get(self):
        """Retrieves data from the beginning of buffer and clears it"""
        tmp = ''.join(self._buf)
        self._buf.clear()
        return tmp

def play_audio_file(fname=DETECT_DING):
    """Simple callback function to play a wave file. By default it plays
    a Ding sound.

    :param str fname: wave file name
    :return: None
    """
    ding_wav = wave.open(fname, 'rb')
    ding_data = ding_wav.readframes(ding_wav.getnframes())
    audio = pyaudio.PyAudio()
    stream_out = audio.open(
        format=audio.get_format_from_width(ding_wav.getsampwidth()),
        channels=ding_wav.getnchannels(),
        rate=ding_wav.getframerate(), input=False, output=True)
    stream_out.start_stream()
    stream_out.write(ding_data)
    time.sleep(0.2)
    stream_out.stop_stream()
    stream_out.close()
    audio.terminate()

class HotwordDetector(object):
    """
    Snowboy decoder to detect whether a keyword specified by `decoder_model`
    exists in a microphone input stream.

    :param decoder_model: decoder model file path, a string or a list of strings
    :param resource: resource file path.
    :param sensitivity: decoder sensitivity, a float or a list of floats.
                              The bigger the value, the more sensitive the
                              decoder. If an empty list is provided, then the
                              default sensitivity in the model will be used.
    :param audio_gain: multiply input volume by this factor.
    """
    def __init__(self, decoder_model,
                 resource=RESOURCE_FILE,
                 sensitivity=[],
                 audio_gain=1):

        def audio_callback(in_data, frame_count, time_info, status):
            self.ring_buffer.extend(in_data)
            play_data = chr(0) * len(in_data)
            return play_data, pyaudio.paContinue

        tm = type(decoder_model)
        ts = type(sensitivity)
        if tm is not list:
            decoder_model = [decoder_model]
        if ts is not list:
            sensitivity = [sensitivity]
        model_str = ",".join(decoder_model)

        self.detector = snowboydetect.SnowboyDetect(
            resource_filename=resource, model_str=model_str)
        self.detector.SetAudioGain(audio_gain)
        self.num_hotwords = self.detector.NumHotwords()

        if len(decoder_model) > 1 and len(sensitivity) == 1:
            sensitivity = sensitivity*self.num_hotwords
        if len(sensitivity) != 0:
            assert self.num_hotwords == len(sensitivity), \
                "number of hotwords in decoder_model (%d) and sensitivity " \
                "(%d) does not match" % (self.num_hotwords, len(sensitivity))
        sensitivity_str = ",".join([str(t) for t in sensitivity])
        if len(sensitivity) != 0:
            self.detector.SetSensitivity(sensitivity_str)

        self.ring_buffer = RingBuffer(
            self.detector.NumChannels() * self.detector.SampleRate() * 5)
        self.audio = pyaudio.PyAudio()
        self.stream_in = self.audio.open(
            input=True, output=False,
            format=self.audio.get_format_from_width(
                self.detector.BitsPerSample() / 8),
            channels=self.detector.NumChannels(),
            rate=self.detector.SampleRate(),
            frames_per_buffer=2048,
            stream_callback=audio_callback)

    def start(self, detected_callback=play_audio_file,
              interrupt_check=lambda: False,
              sleep_time=0.03):
        """
        Start the voice detector. Every `sleep_time` seconds it checks the
        audio buffer for triggering keywords. If one is detected, it calls the
        corresponding function in `detected_callback`, which can be a single
        function (single model) or a list of callback functions (multiple
        models). Every loop it also calls `interrupt_check` -- if it returns
        True, the loop breaks and the method returns.

        :param detected_callback: a function or list of functions. The number of
                                  items must match the number of models in
                                  `decoder_model`.
        :param interrupt_check: a function that returns True if the main loop
                                needs to stop.
        :param float sleep_time: how long, in seconds, each loop waits.
        :return: None
        """
        if interrupt_check():
            logger.debug("detect voice return")
            return

        tc = type(detected_callback)
        if tc is not list:
            detected_callback = [detected_callback]
        if len(detected_callback) == 1 and self.num_hotwords > 1:
            detected_callback *= self.num_hotwords

        assert self.num_hotwords == len(detected_callback), \
            "Error: hotwords in your models (%d) do not match the number of " \
            "callbacks (%d)" % (self.num_hotwords, len(detected_callback))

        logger.debug("detecting...")

        while True:
            if interrupt_check():
                logger.debug("detect voice break")
                break
            data = self.ring_buffer.get()
            if len(data) == 0:
                time.sleep(sleep_time)
                continue

            ans = self.detector.RunDetection(data)
            if ans == -1:
                logger.warning("Error initializing streams or reading audio data")
            elif ans == -2:
                logger.info("Silence")
            elif ans > 0:
                message = "Keyword " + str(ans) + " detected at time: "
                message += time.strftime("%Y-%m-%d %H:%M:%S",
                                         time.localtime(time.time()))
                logger.info(message)
                callback = detected_callback[ans-1]
                if callback is not None:
                    callback()

        logger.debug("finished.")

    def terminate(self):
        """
        Terminate audio stream. Users cannot call start() again to detect.
        :return: None
        """
        self.stream_in.stop_stream()
        self.stream_in.close()
        self.audio.terminate()

chenguoguo commented 7 years ago

OMG I can't imagine that was the issue, thanks a million @ignitefoundation !!!

Regarding the script change, I was hoping the new script would work with Python 3 (if you compile the Python bindings using swig/Python/Makefile with Python 3), but I've only tested this on my side, and it seems like no one is really using it.
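For context on what changes under Python 3: PyAudio hands the callback `bytes`, and extending a deque with bytes stores integers, so the `''.join(self._buf)` call in `RingBuffer.get` above would raise a TypeError. A minimal sketch of one possible adaptation (an assumption for illustration, not necessarily the version shipped with Snowboy):

```python
import collections


class RingBuffer(object):
    """Ring buffer to hold audio bytes; keeps only the newest `size` bytes."""

    def __init__(self, size=4096):
        self._buf = collections.deque(maxlen=size)

    def extend(self, data):
        """Adds data to the end of the buffer."""
        self._buf.extend(data)

    def get(self):
        """Returns the buffered data as bytes and clears the buffer."""
        # Under Python 3 the deque holds ints, so rebuild bytes from them.
        tmp = bytes(bytearray(self._buf))
        self._buf.clear()
        return tmp
```

Only the `get` method differs from the script above; the rest of the class is unchanged.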

Closing the issue, thanks!