Open lsahlstr opened 6 years ago
Check this to see if it helps.
I built a CNN for keyword spotting. Inference time for a WAV file is ~70 ms, and the model file is 5 MB (it could be made smaller, but accuracy may drop).
Snowboy (https://github.com/kitt-ai/snowboy) could be used to trigger recording of the actual WAV audio, which is then transcribed by DeepSpeech.
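A rough sketch of that trigger-then-transcribe flow, in case it helps the discussion. It does not call Snowboy or DeepSpeech directly: `is_hotword` and `transcribe` are hypothetical callables that you would wrap around whichever hotword detector and speech-to-text model you use, and `max_frames` is an assumed fixed recording length.

```python
from typing import Callable, Iterable, List, Optional


def gated_transcribe(
    frames: Iterable[bytes],
    is_hotword: Callable[[bytes], bool],  # hypothetical wrapper around a hotword detector
    transcribe: Callable[[bytes], str],   # hypothetical wrapper around a speech-to-text model
    max_frames: int = 50,                 # assumed fixed recording length, in frames
) -> Optional[str]:
    """Record up to max_frames frames after the hotword fires, then transcribe them.

    Returns None if the hotword is never detected.
    """
    recording: List[bytes] = []
    triggered = False
    for frame in frames:
        if not triggered:
            # Keep listening; the frame that triggers detection is not recorded.
            triggered = is_hotword(frame)
            continue
        recording.append(frame)
        if len(recording) >= max_frames:
            break
    if not triggered:
        return None
    return transcribe(b"".join(recording))
```

A real setup would probably stop recording on silence rather than after a fixed number of frames; the fixed length just keeps the sketch short.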
A possible implementation:
"Unrestricted Vocabulary Keyword Spotting using LSTM-CTC"
https://www.isca-speech.org/archive/Interspeech_2016/pdfs/0753.PDF
We would like a keyword detection enhancement to DeepSpeech, i.e., the ability to detect a keyword or phrase directly from a WAV audio file. We saw "keyword spotting" in the Meeting Notes as a potential future ask, so perhaps it is already planned for the near term?
We are looking for keyword search similar to Kaldi (http://kaldi-asr.org/doc/kws.html) or CMU Sphinx (https://sourceforge.net/p/cmusphinx/discussion/help/thread/9234e9d4/).
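Until something like that exists, one baseline is to transcribe the WAV file first and then search the resulting text. The sketch below covers only the search step and assumes you already have a transcript string from the recognizer; `find_keyword` is a hypothetical helper, not part of DeepSpeech. It can only find words that actually appear in the transcript, so it is not equivalent to a dedicated keyword search.

```python
import re
from typing import List


def find_keyword(transcript: str, phrase: str) -> List[int]:
    """Return the word indices at which `phrase` starts in `transcript`.

    Matching is case-insensitive and ignores punctuation other than apostrophes.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    target = re.findall(r"[a-z']+", phrase.lower())
    if not target:
        return []
    n = len(target)
    # Slide a window of len(target) words across the transcript.
    return [i for i in range(len(words) - n + 1) if words[i:i + n] == target]
```

For example, `find_keyword("turn on the light and turn on the fan", "turn on")` returns `[0, 5]`. Mapping those word indices back to timestamps would need per-word timing from the recognizer, which this sketch does not assume.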