Closed jonathanglasmeyer closed 9 years ago
Keyword spotting is not available in sphinx4 yet. The implementation from long-audio-aligner is not true spotting and has pretty bad operating properties. We hope to implement it in a coming release, but it is not there yet.
Ok, thanks for the information.
For my bachelor thesis I want to build a system that automatically recognizes speech from university lectures. The goal is to improve recognition accuracy by using context information from lecture slides. The visible result should be a UI which presents lecture videos and lets you search for technical terms and jump to the positions in the videos where they occur.
A working example can be seen at this lecture video from superlectures.com. However, the word error rate for technical terms is very high; this was the initial motivation for this project.
I am looking for the best way to do this with Sphinx4. Do you think extending the dictionary would be a feasible approach here, given keyword spotting isn't available yet?
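For the search-and-jump part of the UI described above, one possible building block is an inverted index from recognized words to their start times. The sketch below is plain Java and does not call Sphinx4 at all; `TranscriptIndex`, `add`, and `lookup` are hypothetical names, and it assumes the recognizer output has already been converted into (word, start time in milliseconds) pairs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Sphinx4: maps each recognized word to the
// times (in milliseconds) at which it starts in the lecture video.
final class TranscriptIndex {
    private final Map<String, List<Long>> positions = new HashMap<>();

    // Record one recognized word together with its start time.
    void add(String word, long startMillis) {
        String key = word.toLowerCase(Locale.ROOT);
        positions.computeIfAbsent(key, k -> new ArrayList<>()).add(startMillis);
    }

    // Return all start times for a term in insertion order; empty if never seen.
    List<Long> lookup(String term) {
        List<Long> hits = positions.get(term.toLowerCase(Locale.ROOT));
        return hits == null
                ? Collections.<Long>emptyList()
                : Collections.unmodifiableList(hits);
    }

    public static void main(String[] args) {
        TranscriptIndex index = new TranscriptIndex();
        index.add("Fourier", 12_000L);
        index.add("transform", 12_400L);
        index.add("fourier", 95_500L);
        // Lookup is case-insensitive, so this finds both occurrences.
        System.out.println(index.lookup("FOURIER"));
    }
}
```

An index like this only finds terms the recognizer actually produced, so it does not remove the need for better recognition of technical terms.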
Building language models for domain specific transcription is described here:
http://cmusphinx.sourceforge.net/wiki/tutoriallmadvanced
There is a good dissertation by Stephen Marquard dedicated to lecture transcription with sphinx4 that you might be interested in reading:
http://pubs.cs.uct.ac.za/archive/00000846/01/MPhil-Dissertation-StephenMarquard.pdf
http://trulymadlywordly.blogspot.de/2011/12/sphinx4-speech-recognition-results-for.html
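Before extending the dictionary or building a domain-specific language model, it may help to see which slide terms are missing from the dictionary in the first place. This is a minimal sketch under stated assumptions: the slide text has already been extracted to a plain string, and the dictionary's word list has been loaded into a set of lowercase words. `SlideVocabulary` and `missingTerms` are hypothetical names, and nothing here is Sphinx4 API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Sphinx4.
final class SlideVocabulary {
    // Return slide words that are absent from the dictionary, in first-seen order.
    // Assumes dictionaryWords contains lowercase entries only.
    static Set<String> missingTerms(String slideText, Set<String> dictionaryWords) {
        Set<String> missing = new LinkedHashSet<>();
        // Split on any run of characters that are not letters, digits, or apostrophes.
        String[] tokens = slideText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+");
        for (String token : tokens) {
            if (!token.isEmpty() && !dictionaryWords.contains(token)) {
                missing.add(token);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("the", "of", "decomposition"));
        String slide = "The Eigenvalue decomposition; eigenvalue of the Hessian";
        System.out.println(missingTerms(slide, dictionary));
    }
}
```

Terms reported as missing would still need pronunciations added to the dictionary and example text in the language model training data before the recognizer can output them.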
I know this is a very old issue (and that this is probably deprecated in favor of kaldi?), but I stumbled upon it, and after using pocketsphinx I wanted to try sphinx4 as well. Could you give some directions on how to implement a keyword spotting mode in the sphinx4 decoder? In other words, what challenges did you face in implementing it that kept it from being supported? Thanks.
You'd better implement a keyword spotter for kaldi
There's some similar functionality in DialogOS, which uses a phone-loop grammar to weed out garbage words. Take a look at https://github.com/dialogos-project/dialogos/blob/master/plugins/DialogOS_SphinxPlugin/src/main/java/edu/cmu/lti/dialogos/sphinx/client/SphinxLanguageSettings.java if you want to see roughly how it's implemented. I don't know, though, how well it works for actual keyword spotting (rather than the reverse of weeding out a bit of garbage).
The issue is writing an appropriate search manager (the word-pruning one is too complex and slow) and linguist; for best performance it has to be a low-level thing.
Hi, I have been googling for the past few hours to find out whether Sphinx4 supports keyword spotting right now. I found: 1) http://sourceforge.net/p/cmusphinx/code/HEAD/tree/branches/long-audio-aligner/KeyWordSpotting/ which is from 2011; 2) an implementation in pocketsphinx for Android; 3) an announcement in the S4 release schedule (http://cmusphinx.sourceforge.net/wiki/releaseschedule) that keyword spotting is to come for Sphinx4.
Can you tell me if there is any information I am missing right now?