Open iplayfast opened 11 years ago
I would think the way to do it is to use local speech recognition to scan for an activator word, which then starts sending audio to Google. You only have to get the local recognizer to understand a few words, and/or program activator words. iplayfast's idea could still be used for a while after a recognized command, for context commands.
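A minimal sketch of that gating logic, assuming a local recognizer that returns a list of words per audio chunk. The class name `ActivationGate`, the activator word "computer", and the 10-second context window are all hypothetical choices for illustration and are not part of Palaver.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ActivationGate:
    """Decides whether audio should be forwarded to the remote recognizer."""

    activator: str = "computer"      # hypothetical activator word
    context_seconds: float = 10.0    # hypothetical follow-up window
    _open_until: float = float("-inf")

    def should_forward(self, local_words: Iterable[str], now: float) -> bool:
        """Return True if the current audio chunk should go to the remote service.

        local_words: words the local recognizer heard in this chunk.
        now: current time in seconds (any monotonic clock).
        """
        heard = {w.lower() for w in local_words}
        if self.activator.lower() in heard:
            # Activator word heard: open (or re-open) the window.
            self._open_until = now + self.context_seconds
            return True
        # No activator: forward only while the window is still open.
        return now < self._open_until

    def command_recognized(self, now: float) -> None:
        """Keep listening for a while after a command, for context commands."""
        self._open_until = now + self.context_seconds
```

The caller would pass each locally recognized chunk to `should_forward` and call `command_recognized` whenever the remote service returns a valid command, so follow-up commands do not need the activator word again.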
Yes I think this is the best solution, but we have to get the local recognition working first.
(Replying by email to jpaytoncfd's comment above, sent Mon, Apr 15, 2013 at 10:52 AM.)
Is there open-source, local speech recognition software available?
CMUSphinx
http://cmusphinx.sourceforge.net
You can actually successfully use it for all the recognition, not just activation. It should work with the same accuracy as Google for simple command and control tasks.
I have tried to set that up before and I have not succeeded thus far. I would love to hook up to sphinx, but I first must learn how to set it up properly.
You can get real-time help on the #cmusphinx channel on freenode, or you can ask for help here. You probably just want to revisit it.
To try pocketsphinx, you can just install the latest 0.8 package from the repository and run the pocketsphinx_continuous command.
You must have thought of this, but just in case you haven't: record sound for up to 15 seconds, and stop early if it is quiet for more than 2 seconds. As you are recording, send the first 1 second, then the first 2 seconds, then the first 3 seconds, and so on, up to the full 15 seconds or the early stop. While all this is going on, receive results from the recognizer (Google or whatever), tally the words that come back the same most often, and use those as the final output. Then repeat.
It's brute force and uses a lot of bandwidth, but for those times when you have to have it, it might work.
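A rough sketch of the idea above, assuming the audio has already been cut into 1-second chunks. The `recognize` callable stands in for whatever recognizer is used (Google or otherwise) and is hypothetical, as are the function names. Voting word by word at each position is just one possible reading of "tally the most common results".

```python
from collections import Counter
from typing import Callable, List, Sequence


def trim_at_silence(chunks: Sequence, is_silent: Callable[[object], bool],
                    max_chunks: int = 15, quiet_chunks: int = 2) -> List:
    """Keep at most max_chunks chunks, stopping early after a run of silence."""
    kept, quiet = [], 0
    for chunk in chunks[:max_chunks]:
        kept.append(chunk)
        quiet = quiet + 1 if is_silent(chunk) else 0
        if quiet >= quiet_chunks:
            break  # early stop: quiet for quiet_chunks seconds in a row
    return kept


def vote_transcript(chunks: Sequence,
                    recognize: Callable[[Sequence], str]) -> str:
    """Send growing prefixes (1 s, 1-2 s, 1-3 s, ...) and vote on each word."""
    # One hypothesis per prefix; each hypothesis is a list of words.
    hypotheses = [recognize(chunks[:end]).split()
                  for end in range(1, len(chunks) + 1)]
    if not hypotheses:
        return ""
    longest = max(len(h) for h in hypotheses)
    words = []
    for pos in range(longest):
        # Count what each hypothesis says at this word position.
        votes = Counter(h[pos] for h in hypotheses if len(h) > pos)
        words.append(votes.most_common(1)[0][0])
    return " ".join(words)
```

This sends one request per second of recorded audio, which is where the bandwidth cost comes from. A real version would issue the requests while recording is still in progress rather than afterwards.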