mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

Create a utility that continually receives audio files and does STT on them #640

Closed reuben closed 5 years ago

kdavis-mozilla commented 7 years ago

@reuben Could you give details on what you want to do here? Then we can prioritize properly.

elpimous commented 6 years ago

Anything like this?

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import, division, print_function
import os
import scipy.io.wavfile as wav
from deepspeech.model import Model

model2 = '/home/nvidia/DeepSpeech/data/deepspeech_material/exported_model/output_graph.pb'
micro2 = '/home/nvidia/DeepSpeech/data/deepspeech_material/test/record.'

ds = Model(model2, 26, 9) #model link, cepstrum, context

REC = 'rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 alfred.wav silence 1 0.1 1% 1 1.5 1%'

p = ''

print('''

     °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°
     Testing Deepspeech model, with a simple rec voice detection
     press Enter to continue, or other input to leave !
     °°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°°

      ''')
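# raw_input only exists on Python 2; fall back to input on Python 3.
try:
    read_line = raw_input
except NameError:
    read_line = input

while p == '':
    # Record from the microphone until rec's silence filter stops the capture.
    os.system(REC)
    fs, audio = wav.read('alfred.wav')
    # Transcribe the recording and print the result.
    print(ds.stt(audio, fs))
    # An empty line (Enter) repeats the loop; any other input exits.
    p = read_line('\n>>')

print('\nBye')

kdavis-mozilla commented 6 years ago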

@reuben ping

reuben commented 6 years ago

Something like that, yes. We ended up deciding to build a GUI for the demonstration, and I plan to merge it when we release our models so people have a tool to test them more easily.
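
For the file-based utility the issue title describes, as opposed to the microphone loop in the script above, a minimal sketch might poll a directory and transcribe each new WAV file once. This is not the GUI demo and not code from the project; `find_new_wavs`, `watch_and_transcribe`, `poll_seconds` and `max_polls` are hypothetical names made up for illustration, and the sketch assumes writers move complete files into the directory, since a file still being written could otherwise be picked up too early. The speech-to-text step is passed in as a plain callable so the loop itself does not depend on any particular deepspeech release.

```python
import os
import time


def find_new_wavs(directory, seen):
    """Return sorted paths of .wav files in `directory` that are not in `seen`."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower().endswith('.wav') and path not in seen:
            paths.append(path)
    return paths


def watch_and_transcribe(directory, transcribe, poll_seconds=1.0, max_polls=None):
    """Poll `directory` and yield (path, text) once for every new .wav file.

    `transcribe` is any callable that takes a file path and returns text
    (hypothetical hook; plug the real STT call in here).
    `max_polls` bounds the number of polls; None means poll until interrupted.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_wavs(directory, seen):
            seen.add(path)
            yield path, transcribe(path)
        polls += 1
        time.sleep(poll_seconds)
```

With the `ds` model object and the `wav` import from the script above, the callable could be a small function that runs `fs, audio = wav.read(path)` and returns `ds.stt(audio, fs)`; iterating over `watch_and_transcribe('incoming', that_function)` would then print or store each transcript as files arrive. The `incoming` directory name is only a placeholder.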

jehoshua7 commented 6 years ago

"We ended up deciding to build a GUI for the demonstration, and I plan to merge it when we release our models so people have a tool to test them more easily."

@reuben, can you please advise what the status of this is?

Later: I just saw this, so I assume it is the GUI demo: https://github.com/reuben/ds-gui-demo

reuben commented 6 years ago

Yes, I ported the demo to work with our Python package and put it in that repo. Right now it depends on #1164, so you should install the deepspeech package from TaskCluster, not PyPI. I'm gonna update the README there to reflect that.

jehoshua7 commented 6 years ago

@reuben - thanks; I was testing it last night. :)

lissyx commented 5 years ago

Closing because that was done.
