dialogos-project / dialogos

The DialogOS dialog system.
https://www.dialogos.app
GNU General Public License v3.0

Crucial for our demo: grammar too big? #173

Closed TheresaSchmidt closed 5 years ago

TheresaSchmidt commented 5 years ago

Describe the bug: We have created a very large grammar. In Silent Mode the Sphinx node works fine, but in speech recognition mode our dialogue breaks down while the speech recognizer is loading, and DialogOS crashes completely. I am attaching a minimal example with exactly the same grammar that shows similar behaviour. Instead of crashing, however, it shows an error message: huge_grammar_error

The error.log file seems (to me) to be the same for our dialogue and the attached example. For the example it starts like this:

WARNING: Unexpected XML protocol element "messages"
WARNING: Unexpected XML protocol element "version"
ExtensibleDictionary
Fehler im Knoten
   anleitung_zutaten
in huge_grammar.dos
java.lang.RuntimeException: Allocation of search manager resources failed
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.allocate(SimpleBreadthFirstSearchManager.java:651)
    at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:103)
    at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:164)
    at edu.cmu.lti.dialogos.sphinx.client.ConfigurableSpeechRecognizer.<init>(ConfigurableSpeechRecognizer.java:25)
    at edu.cmu.lti.dialogos.sphinx.client.SphinxContext.getRecognizer(SphinxContext.java:84)
    at edu.cmu.lti.dialogos.sphinx.client.Sphinx.startImpl(Sphinx.java:43)
    at com.clt.speech.recognition.AbstractRecognizer.startLiveRecognition(AbstractRecognizer.java:136)
    at com.clt.speech.recognition.AbstractRecognizer.startLiveRecognition(AbstractRecognizer.java:111)
    at edu.cmu.lti.dialogos.sphinx.plugin.SphinxRecognitionExecutor.lambda$start$0(SphinxRecognitionExecutor.java:43)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: bad base grammar URL data:#JSGF V1.0 UTF-8;

grammar null;

<next> = 
   was ist der nächste Schritt
 | nächster Schritt
 | weiter
;

Then there's the rest of the grammar. It ends like this:

/.gram
    at edu.cmu.sphinx.jsgf.JSGFGrammar.commitChanges(JSGFGrammar.java:274)
    at edu.cmu.sphinx.jsgf.JSGFBaseGrammar.createGrammar(JSGFBaseGrammar.java:293)
    at edu.cmu.sphinx.linguist.language.grammar.Grammar.allocate(Grammar.java:112)
    at edu.cmu.sphinx.linguist.dflat.DynamicFlatLinguist.allocate(DynamicFlatLinguist.java:189)
    at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.allocate(SimpleBreadthFirstSearchManager.java:647)
    ... 12 more

To Reproduce: Steps to reproduce the behavior:

  1. Run this dialogue: huge_grammar.zip
  2. The error appears.
  3. (Run the dialogue in Silent Mode. There will be no error. For example, the grammar recognizes "Einkaufszettel" but reports "No match for recognition result" for "Hallo".)

Expected behavior: The behaviour described in step 3 should occur not only in Silent Mode but also with speech recognition.

Installation information

alexanderkoller commented 5 years ago

@timobaumann As I understand the code in the Sphinx plugin, you are sending the grammar to Sphinx in a data: URL. Is it possible that there is a length limit on data: URLs? Would it be feasible to write the grammar to a temporary file and pass a file URL instead?

alexanderkoller commented 5 years ago

Also: I have observed this bug today. When @TheresaSchmidt says "DialogOS crashing completely", she means that DialogOS becomes unresponsive and needs to be terminated via task manager.

TheresaSchmidt commented 5 years ago

> @timobaumann As I understand the code in the Sphinx plugin, you are sending the grammar to Sphinx in a data: URL. Is it possible that there is a length limit on data: URLs? Would it be feasible to write the grammar to a temporary file and pass a file URL instead?

I have no insight into this, but I've been wondering: does the URL hypothesis seem plausible, given that everything works fine in Silent Mode?

alexanderkoller commented 5 years ago

Yes, in "silent mode", the Sphinx speech recognizer is not used at all.

TheresaSchmidt commented 5 years ago

Oh, now I get it :)

timobaumann commented 5 years ago

@alexanderkoller yes, that's a plausible hypothesis, although the limit should really be very high. We should have a unit test that runs recognition with successively larger grammars to find it.
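A rough sketch of such a size sweep, restricted to the URL encoding and decoding step because the Sphinx side cannot be exercised standalone. It assumes the data: URL path behaves like `java.net.URLEncoder`/`URLDecoder`; the class and method names (`DataUrlSizeSweep`, `payload`, `roundTrips`) are hypothetical and not part of DialogOS or Sphinx.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, not DialogOS code: checks that text of growing size
// survives a percent-encoding round trip and reports how long it takes.
final class DataUrlSizeSweep {

    /** Builds a string of at least minChars characters from a grammar-like line. */
    static String payload(int minChars) {
        String line = " | n\u00e4chster Schritt\n";
        StringBuilder sb = new StringBuilder(minChars + line.length());
        while (sb.length() < minChars) {
            sb.append(line);
        }
        return sb.toString();
    }

    /** Encodes and decodes the text; true if it comes back unchanged. */
    static boolean roundTrips(String text) throws UnsupportedEncodingException {
        String encoded = URLEncoder.encode(text, "UTF-8");
        return URLDecoder.decode(encoded, "UTF-8").equals(text);
    }

    public static void main(String[] args) throws Exception {
        // Sizes from 1 KiB to 4 MiB, multiplying by four each step.
        for (int size = 1 << 10; size <= 1 << 22; size <<= 2) {
            String text = payload(size);
            long start = System.nanoTime();
            boolean ok = roundTrips(text);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(text.length() + " chars: ok=" + ok + ", " + ms + " ms");
        }
    }
}
```

A real test would additionally hand each payload to the recognizer, which is the part where a size limit would show up if one exists.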

akoehn commented 5 years ago

newGrammarFromJSGF is currently used with a URL as input, but it could also be called with an InputStream or a Reader. In fact, the version taking a URL first converts it into one of those. It would be great to convert the data to one of those classes directly and use that rather than the current String -> data-URL -> BufferedInputStream -> InputStreamReader conversion.
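As a minimal sketch of the first half of that idea, the grammar text can be wrapped in a `Reader` or `InputStream` using only the Java standard library, with no URL encoding involved. The class `GrammarStreams` and its methods are hypothetical names for illustration; how the result would be passed through the Sphinx layers is exactly the open question and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not DialogOS or Sphinx code.
final class GrammarStreams {

    /** Wraps grammar text in a Reader without any URL encoding. */
    static Reader asReader(String jsgf) {
        return new StringReader(jsgf);
    }

    /** Wraps grammar text in a UTF-8 InputStream without any URL encoding. */
    static InputStream asStream(String jsgf) {
        return new ByteArrayInputStream(jsgf.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads a Reader to the end; stands in for whatever consumes the grammar. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String jsgf = "#JSGF V1.0 UTF-8;\ngrammar demo;\n"
                + "public <next> = weiter | n\u00e4chster Schritt;\n";
        // The text arrives unchanged, including the non-ASCII character.
        System.out.println(readAll(asReader(jsgf)).equals(jsgf));
    }
}
```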

If only I knew how to actually pipe that through all the layers of sphinx :-/

For the record, I also tried to rewrite DataURLHelper.encodeData to write to a temp file and return a URL to that file, but some part of Sphinx tries to append ".gram" to the URL and then (obviously) fails.

timobaumann commented 5 years ago

Yes, that's the grammar reader in Sphinx. How about giving your files names ending in .gram and leaving the suffix out of the URL? :-)
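A minimal sketch of that suggestion, using only the Java standard library: write the grammar to a temporary file whose name ends in ".gram" and produce the file URL with the suffix cut off. The class `TempGrammarFile`, its methods, and the file name prefix are hypothetical, and whether Sphinx accepts a URL built this way is an untested assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not DialogOS or Sphinx code.
final class TempGrammarFile {
    static final String SUFFIX = ".gram";

    /** Writes the grammar to a temp file ending in ".gram" and returns its path. */
    static Path write(String jsgf) throws IOException {
        Path file = Files.createTempFile("dialogos-grammar-", SUFFIX);
        file.toFile().deleteOnExit(); // avoid piling up temp files
        Files.write(file, jsgf.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    /** Returns the file URL as a string with the trailing ".gram" removed. */
    static String urlWithoutSuffix(Path file) throws IOException {
        String url = file.toUri().toURL().toString();
        return url.substring(0, url.length() - SUFFIX.length());
    }

    public static void main(String[] args) throws IOException {
        Path file = write("#JSGF V1.0 UTF-8;\ngrammar demo;\npublic <next> = weiter;\n");
        System.out.println("file: " + file);
        System.out.println("url without suffix: " + urlWithoutSuffix(file));
    }
}
```

The trade-off raised in the thread still applies: every grammar creates a file on disk, which a stream-based approach would avoid.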

akoehn commented 5 years ago

I could try that, but using temporary files really seems like a hack to me, and I can't find where the ".gram" is added to the URL, so I can't tell whether I would actually be doing the right thing.

I would prefer to use a stream so we don't generate lots of temporary files and have another surface for potential problems.

akoehn commented 5 years ago

It is in JSGFGrammar.grammarNameToURL and I don't really know how to handle that. It assumes weird things (going to the basedir of the grammar, then searching for name + ".gram"), and the only way I can imagine the data approach actually working is that it fails spectacularly and the catch block then does its thing.

All in all: I don't see an easy fix without knowledge about sphinx.

timobaumann commented 5 years ago

The test in 8428981e518ebad0ccaec0ad8a1bb1dec67a6c40 works very well with data URLs that hold 1MB of text. (They do get slow at 4MB, at about 1/3 of a second per URL encoding/decoding.)

timobaumann commented 5 years ago

Don't put % into your grammar, as in:

 \"Kokosmilch (9 % Fett)\" 
 \"Mozzarella (9 % Fett)\" 
 \"Quark (20 % Fett)\" 
 \"Sauerrahm (15 % Fett)\" 
 \"Schichtkäse (10 % Fett)\" 
 \"Schinkenwürfel (2 % Fett)\" 
 \"Schokolade (70 % Kakaogehalt)\" 

I didn't bother to check whether other odd characters also make DialogOS tell you that your grammar is bad.
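The thread does not state why % is a problem. One possible explanation, offered purely as an assumption, is that a raw % reaching a percent-decoder is read as the start of an escape sequence. The sketch below shows that effect with `java.net.URLDecoder` only; it does not use DialogOS or Sphinx code, and the class `PercentInPayload` is a hypothetical name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration, not DialogOS or Sphinx code.
final class PercentInPayload {

    /** True if decoding the raw text as percent-encoded data is rejected. */
    static boolean rawDecodeFails(String raw) throws UnsupportedEncodingException {
        try {
            URLDecoder.decode(raw, "UTF-8");
            return false;
        } catch (IllegalArgumentException e) {
            // "% F" is not a valid escape: the two characters after % must be hex digits.
            return true;
        }
    }

    /** Encodes first, then decodes; the % is escaped as %25 and survives. */
    static String roundTrip(String raw) throws UnsupportedEncodingException {
        return URLDecoder.decode(URLEncoder.encode(raw, "UTF-8"), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String entry = "Quark (20 % Fett)";
        System.out.println("raw decode fails: " + rawDecodeFails(entry));
        System.out.println("round trip intact: " + roundTrip(entry).equals(entry));
    }
}
```

If this is the mechanism, any character that the encoding step leaves unescaped but the decoding step treats specially could cause similar trouble.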

alexanderkoller commented 5 years ago

Thanks!

So to summarize, do I understand the situation correctly as follows:

If this is right, then we should document this in the manual. Do you know why percent signs are a problem? Do they have special meaning in Sphinx? Are there other tokens that might plausibly cause similar problems?

timobaumann commented 5 years ago