cmusphinx / sphinx4

Pure Java speech recognition library
cmusphinx.sourceforge.net

JSGF grammar probabilities #79

Closed rmalav15 closed 6 years ago

rmalav15 commented 6 years ago

Hey,

I am trying to understand the implementation of Sphinx, and I am having a hard time figuring out how grammars are implemented.

I understand that a grammar graph is created from the JSGF (or GrXML) grammars. I was hoping to find a function that, given a string, returns the probability that it matches the grammar (a probability rather than a boolean, since we can also provide weights in the grammar; correct me if I am wrong). I expect this because the same thing is generally done for language models, and as far as I understand, a grammar can be interpreted as a language model based on a grammar (a more context-driven LM). Can you please point me to such an implementation in the code base, if there is one?

In JSGFGrammar.java, the implementation notes in the opening comments say:

All internal probabilities are maintained in LogMath log base.

I don't understand what "internal probabilities" and "log base" refer to here.

Please let me know the correct implementation details if my assumptions above are wrong.

Many thanks, Ram (newbie in the domain)
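For context, "log base" in a comment like this most plausibly means that probabilities are stored as logarithms in some fixed base, so that multiplying probabilities becomes adding their logs. The sketch below illustrates only that idea; it does not call Sphinx. The class name, the helper names, and the base value 1.0001 are assumptions chosen for illustration, not values read from the Sphinx configuration.

```java
public class LogDomainSketch {
    // Hypothetical base picked for illustration only.
    static final double LOG_BASE = 1.0001;

    // Convert a linear probability to its logarithm in LOG_BASE.
    static double linearToLog(double probability) {
        return Math.log(probability) / Math.log(LOG_BASE);
    }

    // Convert a LOG_BASE logarithm back to a linear probability.
    static double logToLinear(double logValue) {
        return Math.pow(LOG_BASE, logValue);
    }

    public static void main(String[] args) {
        double logA = linearToLog(0.5);
        double logB = linearToLog(0.25);
        // Adding in the log domain corresponds to multiplying 0.5 * 0.25.
        double product = logToLinear(logA + logB);
        System.out.println("log(0.5)  = " + logA);
        System.out.println("log(0.25) = " + logB);
        System.out.println("product   = " + product); // approximately 0.125
    }
}
```

A base very close to 1 spreads the log values over a wide numeric range, which is one reason such a base might be chosen, but that motivation is a guess here.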

rmalav15 commented 6 years ago

Ok,

So this is my current understanding.

After getting the javax.speech.recognition.RuleGrammar instance using the edu.cmu.sphinx.jsapi.JSGFGrammar class, we can call:

public RuleParse parse(String text, String ruleName)

which gives us a javax.speech.recognition.RuleParse object.

The RuleParse object holds all the details of how the string matched the grammar, in this form:

RuleParse(<command> =                    // Match <command>
   RuleSequence(                          //  by a sequence of 3 entities
     RuleParse(<action> =                 // First match <action>
       RuleAlternatives(                  // One of a set of alternatives
         RuleTag(                         // matching the tagged
           RuleToken("close"), "CL")))    //   token "close"
     RuleParse(<object> =                 // Now match <object>
       RuleSequence(                      //   by a sequence of 2 entities
         RuleSequence(                    // RuleCount becomes RuleSequence
           RuleParse(<this_that_etc> =    // Match <this_that_etc>
             RuleAlternatives(            // One of a set of alternatives
               RuleToken("that"))))       //   is the token "that"
         RuleAlternatives(                // Match "window | door"
           RuleToken("door"))))           //   as the token "door"
     RuleSequence(                        // RuleCount becomes RuleSequence
       RuleParse(<polite> =               // Now match <polite>
         RuleAlternatives(                //   by 1 of 2 alternatives
           RuleToken("please"))))         // The token "please"
   )
 ).

So my question is: what is the behaviour when weights are used in JSGF? This result feels like a boolean (whether the string matched the grammar or not). How are the weights provided in JSGF used by the final decoder? Please let me know, it would be a huge help.
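To make the question concrete, here is a minimal sketch of one way weights could become a score instead of a boolean: normalize the weights of each set of alternatives so they sum to 1, take logs, and add the logs of the alternatives chosen along the matched path. This is an assumption about the general technique, not a description of what the Sphinx decoder actually does. The class name, the method name, and the two example rules are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightedAlternativesSketch {
    // Normalize raw weights so they sum to 1, then store natural logs.
    static Map<String, Double> logProbabilities(Map<String, Double> weights) {
        double total = 0.0;
        for (double weight : weights.values()) {
            total += weight;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            result.put(entry.getKey(), Math.log(entry.getValue() / total));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical rule: <action> = /3/ close | /1/ open;
        Map<String, Double> action = new LinkedHashMap<>();
        action.put("close", 3.0);
        action.put("open", 1.0);

        // Hypothetical rule: <object> = /1/ window | /1/ door;
        Map<String, Double> object = new LinkedHashMap<>();
        object.put("window", 1.0);
        object.put("door", 1.0);

        // Score of the path "close door": sum of the chosen log probabilities.
        double pathLogScore = logProbabilities(action).get("close")
                + logProbabilities(object).get("door");

        // Back in the linear domain this is 0.75 * 0.5, approximately 0.375.
        System.out.println("log score   = " + pathLogScore);
        System.out.println("probability = " + Math.exp(pathLogScore));
    }
}
```

Under this reading, a parse like the RuleParse tree above only says which alternatives were taken, and the weights would enter separately as costs on the corresponding arcs of the grammar graph.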

nshmyrev commented 6 years ago

This is a question for a forum, not really a software issue.

rmalav15 commented 6 years ago

I have posted it on the forum here: https://sourceforge.net/p/cmusphinx/discussion/sphinx4/thread/209345e3/