wit-ai / pywit

Python library for Wit.ai

How to stream audio to the speech API while recording? #98

Open rena1234 opened 7 years ago

rena1234 commented 7 years ago

Hello guys, I'm trying to reduce latency in my project, so I have been trying to send chunks of audio while I record it. What I have tried so far is to combine code that sends previously recorded audio in chunks:

from wit import Wit

def RecognizeSpeech(AUDIO_FILENAME, CHUNK_SIZE):
    client = Wit('MYTOKENHERE')

    def wavIterator():
        # yield the recorded file in pieces of CHUNK_SIZE bytes
        with open(AUDIO_FILENAME, 'rb') as wav:
            chunk = wav.read(CHUNK_SIZE)
            while chunk:
                yield chunk
                chunk = wav.read(CHUNK_SIZE)

    resp = client.speech(wavIterator(), None,
            {'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    return resp

with the code from this tutorial: https://indianpythonista.wordpress.com/2017/04/10/speech-recognition-using-wit-ai/ and ended up with this Frankenstein:

import pyaudio
from wit import Wit

def recReturnWavIterator(RECORD_SECONDS, CHUNK_SIZE, client):

    #--------- SETTING PARAMS FOR OUR AUDIO FILE ------------#
    FORMAT = pyaudio.paInt16    # format of wave
    CHANNELS = 2                # no. of audio channels
    RATE = 44100                # frame rate
    CHUNK = CHUNK_SIZE          # frames per audio sample
    #--------------------------------------------------------#

    # creating PyAudio object
    audio = pyaudio.PyAudio()

    # open a new stream for microphone
    # It creates a PortAudio Stream Wrapper class object
    stream = audio.open(format=FORMAT,channels=CHANNELS,
                        rate=RATE, input=True,
                        frames_per_buffer=CHUNK)
    print("Listening") 
    for i in range(int(RATE / CHUNK * RECORD_SECONDS)-1):
        # read audio stream from microphone
        data = stream.read(CHUNK)
        yield data
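    print("Finished recording")
    # release the microphone once recording is done
    stream.stop_stream()
    stream.close()
    audio.terminate()

def RecognizeSpeech(CHUNK_SIZE):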
    client = Wit('MYTOKENHERE')
    resp = client.speech(recReturnWavIterator(5,CHUNK_SIZE,client),None,{'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    print('Yay, got Wit.ai response: ' + str(resp))

which always returns empty text: Yay, got Wit.ai response: {'_text': None, 'entities': {}, 'msg_id': '7ed74ba3-698a-41a9-8158-8dd3857c3808'}

Is it possible to do something like that? If so, how? PS: Sorry, I am not very experienced with programming.
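
One difference between the two versions is worth noting, offered as an assumption rather than a confirmed diagnosis: the file-based iterator sends a real WAV file, header included, while `stream.read(CHUNK)` returns raw PCM frames with no RIFF header, even though the request still declares `audio/wav`. The sketch below shows one way to put a WAV header in front of a stream of raw chunks. The helper names `wav_header` and `with_wav_header` are hypothetical, it uses only the standard library, and whether Wit.ai accepts audio sent this way is not verified here.

```python
import struct

def wav_header(channels, sample_width, rate, data_size):
    """Build a 44-byte PCM WAV (RIFF) header for data_size bytes of audio."""
    byte_rate = rate * channels * sample_width    # bytes of audio per second
    block_align = channels * sample_width         # bytes per frame
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', 36 + data_size, b'WAVE',
        b'fmt ', 16, 1, channels, rate, byte_rate, block_align,
        sample_width * 8,                         # bits per sample
        b'data', data_size,
    )

def with_wav_header(pcm_chunks, channels, sample_width, rate, total_bytes):
    """Yield a WAV header first, then the raw PCM chunks unchanged."""
    yield wav_header(channels, sample_width, rate, total_bytes)
    for chunk in pcm_chunks:
        yield chunk
```

With the recording code above, each `stream.read(CHUNK)` call returns `CHUNK * CHANNELS * 2` bytes for `paInt16`, so `total_bytes` would be that figure multiplied by the number of reads, and the wrapped generator could be passed to `client.speech` in place of the raw one.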