Closed TheresaSchmidt closed 5 years ago
@timobaumann As I understand the code in the Sphinx plugin, you are sending the grammar to Sphinx in a `data:` URL. Is it possible that there is a length limit on `data:` URLs? Would it be feasible to write the grammar to a temporary file and pass a file URL instead?
Also: I have observed this bug today. When @TheresaSchmidt says "DialogOS crashing completely", she means that DialogOS becomes unresponsive and needs to be terminated via task manager.
> @timobaumann As I understand the code in the Sphinx plugin, you are sending the grammar to Sphinx in a `data:` URL. Is it possible that there is a length limit on `data:` URLs? Would it be feasible to write the grammar to a temporary file and pass a file URL instead?
I have no insight into this but I've been thinking. Does the URL hypothesis seem plausible considering the fact that everything works just fine in Silent Mode?
Yes, in "silent mode", the Sphinx speech recognizer is not used at all.
Oh, now I get it :)
@alexanderkoller yes, that's a plausible hypothesis. Although the limit should really be very high. We should have a unit test that tests recognition with successively larger grammars to find it.
`newGrammarFromJSGF` is currently used with a `URL` as input, but it could also be called with an `InputStream` or a `Reader`. In fact, the version taking a `URL` first converts that.
It would be great to convert the data to one of those classes and use that rather than the current String -> data-URL -> BufferedInputStream -> InputStreamReader conversion.
If only I knew how to actually pipe that through all the layers of sphinx :-/
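For illustration, here is a minimal sketch of the first half of that idea: turning the grammar string into a `Reader` or an `InputStream` without going through a data URL. `GrammarSources` and its methods are hypothetical names, UTF-8 is an assumption, and the sketch deliberately stops before the Sphinx call, since how to pipe the result through the Sphinx layers is exactly the open question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; neither DialogOS nor Sphinx ships a class with this name.
final class GrammarSources {

    /** Wraps the grammar text in a Reader; no URL encoding or decoding is involved. */
    static Reader asReader(String jsgf) {
        return new StringReader(jsgf);
    }

    /** Same idea for consumers that want bytes. UTF-8 is an assumption of this sketch. */
    static InputStream asStream(String jsgf) {
        return new ByteArrayInputStream(jsgf.getBytes(StandardCharsets.UTF_8));
    }

    /** Reads a Reader to the end, only used here to show the text arrives unchanged. */
    static String readAll(Reader reader) {
        try (Reader r = reader) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            for (int n; (n = r.read(buf)) != -1; ) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String jsgf = "#JSGF V1.0;\ngrammar demo;\npublic <item> = Quark | Mozzarella;\n";
        // The Reader would be handed to the Reader-based overload mentioned above;
        // that call is omitted because its exact signature is not shown in this thread.
        System.out.println(readAll(asReader(jsgf)).equals(jsgf));
    }
}
```

Characters such as `%` and `+` pass through untouched on this path because nothing is encoded or decoded.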
FTR, I also tried to rewrite `DataURLHelper.encodeData` to write to a temp file and return a URL to that one, but some part of Sphinx tries to add a ".gram" to the URL and then (obviously) fails.
yes, that's the grammar reader in sphinx. How about naming your files ending in .gram and not saying that in the URL? :-)
I could try that, but using temporary files really seems like a hack to me, and I can't find where the ".gram" is added to the URL, so I can't tell whether I am actually doing the right thing.
I would prefer to use a stream so we don't generate lots of temporary files and have another surface for potential problems.
It is in `JSGFGrammar.grammarNameToURL` and I don't really know how to handle that. It assumes weird things (going to the basedir of the grammar, then searching for name + gram) and the only way I can imagine that the data-approach actually works is that it fails spectacularly and the catch block then does its thing.
All in all: I don't see an easy fix without knowledge about sphinx.
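For reference, the temp-file variant suggested above (name the file `*.gram`, but leave the suffix out of what is passed on) could look roughly like this. It is a sketch under assumptions: `TempGrammarFile` is a hypothetical name, and it relies on the behaviour described in this thread that the final URL is built as base + name + ".gram". Whether Sphinx also requires the grammar name declared inside the file to match the file name is not checked here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of DialogOS or Sphinx.
final class TempGrammarFile {
    private static final String SUFFIX = ".gram";

    final Path file;          // the temporary file on disk
    final String baseUrl;     // URL of the containing directory, ending in "/"
    final String grammarName; // file name without the ".gram" suffix

    private TempGrammarFile(Path file, String baseUrl, String grammarName) {
        this.file = file;
        this.baseUrl = baseUrl;
        this.grammarName = grammarName;
    }

    /** Writes the grammar to a temp file ending in ".gram" and splits its location. */
    static TempGrammarFile write(String jsgf) {
        try {
            Path file = Files.createTempFile("dialogos-grammar-", SUFFIX).toAbsolutePath();
            file.toFile().deleteOnExit(); // avoid piling up temp files
            Files.write(file, jsgf.getBytes(StandardCharsets.UTF_8));

            String fileName = file.getFileName().toString();
            String name = fileName.substring(0, fileName.length() - SUFFIX.length());
            String base = file.getParent().toUri().toString();
            if (!base.endsWith("/")) {
                base = base + "/";
            }
            return new TempGrammarFile(file, base, name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TempGrammarFile t = write("#JSGF V1.0;\ngrammar demo;\npublic <x> = hallo;\n");
        // base + name + ".gram" points back at the file that was written.
        System.out.println(t.baseUrl + t.grammarName + SUFFIX);
    }
}
```

This still leaves the objection about temporary files standing; it only shows how the base and the name would have to be split.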
The test in 8428981e518ebad0ccaec0ad8a1bb1dec67a6c40 works very nicely with data URLs that hold 1MB of text. (They do get slow at 4MB, at about 1/3 of a second per URL encoding/decoding.)
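A round-trip check of that kind can be sketched as follows. This is not the test from the commit; it is a stand-alone illustration that assumes the encoding step behaves like `java.net.URLEncoder`/`URLDecoder` (the `Charset` overloads need Java 10 or later). `DataUrlRoundTrip` and `makeGrammarText` are hypothetical names, and timings will differ per machine.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone check, not the DialogOS unit test.
final class DataUrlRoundTrip {

    /** Builds a JSGF-like text with at least minChars characters. */
    static String makeGrammarText(int minChars) {
        StringBuilder sb = new StringBuilder("#JSGF V1.0;\ngrammar big;\npublic <item> = ");
        for (int i = 0; sb.length() < minChars; i++) {
            sb.append("word").append(i).append(" | ");
        }
        return sb.append("end;\n").toString();
    }

    /** True if the text is unchanged after URL-encoding and decoding it again. */
    static boolean survives(String text) {
        String encoded = URLEncoder.encode(text, StandardCharsets.UTF_8);
        String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
        return decoded.equals(text);
    }

    public static void main(String[] args) {
        // Successively larger payloads, from 64 KB up to 4 MB.
        for (int size = 1 << 16; size <= 1 << 22; size <<= 1) {
            String text = makeGrammarText(size);
            long start = System.nanoTime();
            boolean ok = survives(text);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(size + " chars: ok=" + ok + ", " + millis + " ms");
        }
    }
}
```

This only exercises the encoding layer; it says nothing about limits in Sphinx grammar parsing itself.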
Don't put `%` into your grammar, as in:
\"Kokosmilch (9 % Fett)\"
\"Mozzarella (9 % Fett)\"
\"Quark (20 % Fett)\"
\"Sauerrahm (15 % Fett)\"
\"Schichtkäse (10 % Fett)\"
\"Schinkenwürfel (2 % Fett)\"
\"Schokolade (70 % Kakaogehalt)\"
I didn't bother to check whether other odd characters also make DialogOS tell you that your grammar is bad.
Thanks!
So to summarize, do I understand the situation correctly as follows:
If this is right, then we should document this in the manual. Do you know why percent signs are a problem? Do they have special meaning in Sphinx? Are there other tokens that might plausibly cause similar problems?
I simply tested whether megabytes of content encoded in Data URLs survive. We'd have to check the size limit for Sphinx grammar parsing and construction but I'd expect it to be high. (After all, the search graph in a SLM easily has 10000s of entries at every branch and the people behind JavaCC probably know their business.)
I believe it's the percentage signs. However, @TheresaSchmidt could potentially also have a stray `"` or `'` somewhere in there, which will definitely break things. @TheresaSchmidt, you could debug this by gradually adding more and more of your intended grammar until it breaks. You'll probably see the stray signs along the way (the % immediately caught my eye).
The JSGF specification actually allows percentage signs even for rule names. This may thus be an implementation issue in Sphinx' JSGFParser. Maybe "quoting" helps to resolve it. However, I'd strongly advise against anything that isn't easily pronounceable. To get quotes as part of a token, you need to quote the token and escape the quote sign like `"\"so\""` -- but why would you ever want to do that??
what's allowed is determined by https://github.com/cmusphinx/sphinx4/blob/master/sphinx4-core/src/main/java/edu/cmu/sphinx/jsgf/parser/JSGFParser.java and https://github.com/cmusphinx/sphinx4/blob/master/sphinx4-core/src/main/resources/edu/cmu/sphinx/jsgf/parser/jsgf.jj . (Looking at that again, it's more likely that some quoting went wrong in the example above because it appears to deal with %.)
Essentially, Sphinx was trying to say "there's something wrong with the grammar at <URL>". However, in our case the URL is a very long string that ends in `/.gram` (see below).
For reference: the path we're taking is via Sphinx' configuration management. That only allows us to re-set the basepath of the URL to the grammar, as well as the actual basename of the grammar file. Sphinx automatically constructs the URL to the grammar by taking `basepath/basename.gram` (see JSGFGrammar, the individual parts are strings).
The overall flow in Theresa's example is: DialogOS String variable -> DialogOS SRGS grammar representation -> string -> data-url -> string -> data-url/.gram -> InputStream (string-based) -> Sphinx JSGF grammar. No, I don't like it, it's just the only way I could make it work without changing Sphinx. The serialization from DialogOS seems to ignore the original ordering of rules in the grammar (but that shouldn't matter much).
We simply leave the grammar name unspecified, replace the URL with a data URL, and call its `openStream`. The data URL is robust against having `/.gram` added to the end and otherwise returns the content (either plain as we use it, or using base64-encoding). During decoding, weird stuff happens with `+` (url-encoding encodes both space and plus as plus) so we escape pluses to `%2b`.
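The effect on `+` can be reproduced in isolation. The sketch below assumes the decoding side behaves like `java.net.URLDecoder`, which this thread does not confirm; `PlusEscaping` and its methods are hypothetical names. It also shows what the same decoder does with a raw `%` that is not followed by two hex digits, as in "9 % Fett". Whether that is what happens inside DialogOS or Sphinx is an open question here, not a diagnosis.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes URLDecoder-like decoding (Java 10+ overload).
final class PlusEscaping {

    /** Decodes the text as if it were the payload of a URL. */
    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    /** Protects literal pluses so that decoding does not turn them into spaces. */
    static String escapePlus(String s) {
        return s.replace("+", "%2b");
    }

    /** True if the decoder rejects the text outright. */
    static boolean decodeFails(String s) {
        try {
            decode(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String rule = "<digit>+"; // JSGF uses '+' for "one or more"

        // Without escaping, the decoder reads '+' as an encoded space.
        System.out.println("[" + decode(rule) + "]");
        // With the plus escaped to %2b, the original text comes back.
        System.out.println("[" + decode(escapePlus(rule)) + "]");
        // A raw '%' followed by non-hex characters is rejected by this decoder.
        System.out.println(decodeFails("Quark (20 % Fett)"));
    }
}
```

If the percent signs were encoded as `%25` before decoding, this particular decoder would accept them; that would be one thing to try when narrowing down the `%` problem.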
@akoehn, if you want to shortcut this, you could come up with a new URL scheme (say, `cache:`) that creates IDs (hashes?) of grammars (or their InputStreams) that you store in a cache. Then, upon seeing the URL with your hash (plus potentially `/.gram` at the end), you return the InputStream.
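A rough sketch of the cache part of that idea, assuming SHA-256 hashes as IDs. `GrammarCache` and its methods are hypothetical names; registering the `cache:` scheme with `java.net.URL` (via a custom `URLStreamHandler`) and wiring it into Sphinx are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache, not part of DialogOS or Sphinx.
final class GrammarCache {
    private static final String SCHEME = "cache:";
    private static final String SPHINX_SUFFIX = "/.gram";

    private final Map<String, String> grammars = new ConcurrentHashMap<>();

    /** Stores the grammar and returns a short "cache:<hash>" URL string for it. */
    String put(String jsgf) {
        String id = sha256Hex(jsgf);
        grammars.put(id, jsgf);
        return SCHEME + id;
    }

    /** Returns the grammar text for a cache URL, tolerating a trailing "/.gram". */
    String lookup(String url) {
        if (!url.startsWith(SCHEME)) {
            throw new IllegalArgumentException("not a cache: URL: " + url);
        }
        String id = url.substring(SCHEME.length());
        if (id.endsWith(SPHINX_SUFFIX)) {
            id = id.substring(0, id.length() - SPHINX_SUFFIX.length());
        }
        String jsgf = grammars.get(id);
        if (jsgf == null) {
            throw new IllegalArgumentException("unknown grammar id: " + id);
        }
        return jsgf;
    }

    /** Same lookup, but as the InputStream a URL handler would hand out. */
    InputStream open(String url) {
        return new ByteArrayInputStream(lookup(url).getBytes(StandardCharsets.UTF_8));
    }

    private static String sha256Hex(String s) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        GrammarCache cache = new GrammarCache();
        String url = cache.put("#JSGF V1.0;\ngrammar demo;\npublic <x> = Quark (20 % Fett);\n");
        // The URL stays short regardless of grammar size, and the suffix is harmless.
        System.out.println(url.length());
        System.out.println(cache.lookup(url + "/.gram").contains("20 % Fett"));
    }
}
```

Since the grammar text never goes through URL encoding on this path, the `+` and `%` handling discussed above would not come into play.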
Describe the bug
We have created a very big grammar. In Silent Mode the Sphinx node works just fine but in speech recognition mode, our dialogue breaks down when loading the speech recognition. This leads to DialogOS crashing completely. I am attaching a minimal example with the exact same grammar which shows similar behaviour. Instead of breaking down, however, it actually shows an error message:
The `error.log` file seems (to me) to be the same for our dialogue and the attached example. For the example it starts like this:
Then there's the rest of the grammar. It ends like this:
To Reproduce
Please attach a minimal example dialog exposing the bug if applicable. Steps to reproduce the behavior:
`No match for recognition result` for Hallo.)
Expected behavior
The behaviour described in step 3. should not only happen in Silent Mode but also with speech recognition.
Installation information