dialogos-project / dialogos

The DialogOS dialog system.
https://www.dialogos.app
GNU General Public License v3.0

Patterns in input and ASR nodes #110

Closed by alexanderkoller 6 years ago

alexanderkoller commented 6 years ago

The documentation seems to promise that any Input or Speech Recognizer node can have arbitrary patterns in its "input words" or "input patterns" list.

This seems to work okay for simple regular expressions, such as a disjunction "zwei | zwo", but if the pattern contains slashes or brackets (e.g. for a variable-assignment pattern), the recognizer complains about unexpected symbols (see below).

I'm not sure what the intended behavior is here. Does anyone know? Either way, we should make the code conform to the documentation or update the documentation to match the code. We should also keep in mind that there may be legacy dialog files out there that rely on a certain behavior.

To reproduce, use TryRecognition in the ASR node of the attached dialog. This yields the following exception:

Error:
class com.clt.script.parser.ParseException
Unexpected symbol '/' at line 3, position 1:
/

Details:
com.clt.srgf.Grammar.create(Grammar.java:1770)
com.clt.srgf.Grammar.create(Grammar.java:1758)
com.clt.diamant.graph.nodes.AbstractInputNode.compileGrammar(AbstractInputNode.java:1018)
com.clt.diamant.graph.nodes.AbstractInputNode.recognizeExec(AbstractInputNode.java:619)
com.clt.diamant.graph.nodes.AbstractInputNode$12$1.run(AbstractInputNode.java:555)
java.lang.Thread.run(Thread.java:748)

Geburtstagskalender.dos.zip

alexanderkoller commented 6 years ago

PS: Such complex patterns seem to work in ASR nodes that use an explicit grammar. The attached dialog demonstrates the behavior only with a DirectGrammar.

alexanderkoller commented 6 years ago

I seem to remember that the old DialogOS handled pattern matching differently for DirectGrammars and explicit grammars. Notice that the pattern list is called "input words" in one case and "input patterns" in the other. Maybe there was a difference between the two situations all along. In any case, this would need to be documented.

timobaumann commented 6 years ago

What's the intended behaviour for the dos-file you attached? Should the recognizer recognize anything (i.e., any name) within parentheses in /Wann hat (.*) Geburtstag/=(name) and then assign the matched value to a variable called name? (Probably not, as you did not define a variable.) How would the recognizer know what to accept in parentheses?

I really think matching makes sense only for externally provided grammars (i.e., from expression or global). DirectGrammars need to specify the rules themselves, not just the matching.

I wonder if something like "Wann hat (Alexander|Arne|Timo) Geburtstag?" works in DirectGrammar?
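To make the point about DirectGrammars concrete, here is a minimal sketch of why a finite alternation is something a recognizer can work with: it expands into a fixed list of sentences. This is not DialogOS code. The class `AlternationSketch` and its `expand` method are hypothetical, and the sketch only handles a single, non-nested `(a|b|c)` group.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of DialogOS.
class AlternationSketch {

    /**
     * Expands the first "(a|b|c)" group of a template into one sentence
     * per alternative. Templates without a complete group are returned as-is.
     */
    public static List<String> expand(String template) {
        List<String> sentences = new ArrayList<>();
        int open = template.indexOf('(');
        int close = template.indexOf(')', open + 1);
        if (open < 0 || close < 0) {
            sentences.add(template);
            return sentences;
        }
        String prefix = template.substring(0, open);
        String suffix = template.substring(close + 1);
        // Each alternative yields exactly one concrete sentence.
        for (String alternative : template.substring(open + 1, close).split("\\|")) {
            sentences.add(prefix + alternative.trim() + suffix);
        }
        return sentences;
    }

    public static void main(String[] args) {
        for (String sentence : expand("Wann hat (Alexander|Arne|Timo) Geburtstag")) {
            System.out.println(sentence);
        }
    }
}
```

A wildcard such as ".*" has no such finite expansion, which is the gap the question above points at.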

The "bug" tag refers to the fact that the exception is obscure, right? I've changed it to enhancement.

alexanderkoller commented 6 years ago

Yes, "Wann hat (Alexander|Arne|Timo) Geburtstag?" works in DirectGrammar. But as far as I can tell, more complex regexes do not work.

Part of the problem is probably what you say: How would the recognizer know which words can occur at ".*"? But in addition, you also can't say /Wann hat (Alexander|Arne|Timo) Geburtstag/ = name because slash patterns are not allowed in this context.
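For readers unfamiliar with slash patterns, the following sketch shows roughly what such a pattern is meant to express: match the utterance against a regular expression and bind the captured group to a variable. It uses plain `java.util.regex` as a stand-in and says nothing about how DialogOS implements patterns internally. The class `SlashPatternSketch` and the method `extractName` are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration using java.util.regex, not the DialogOS pattern engine.
class SlashPatternSketch {

    // Stand-in for the slash pattern from the thread.
    private static final Pattern BIRTHDAY =
            Pattern.compile("Wann hat (Alexander|Arne|Timo) Geburtstag");

    /** Returns the captured name if the whole utterance matches, otherwise empty. */
    public static Optional<String> extractName(String utterance) {
        Matcher matcher = BIRTHDAY.matcher(utterance);
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // A name from the alternation is captured; any other name is rejected.
        System.out.println(extractName("Wann hat Arne Geburtstag"));
        System.out.println(extractName("Wann hat Maria Geburtstag"));
    }
}
```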

I have clarified the documentation on the grammars page, and will now close this issue.

timobaumann commented 6 years ago

@alexanderkoller: you have to put in grammar rules (which, to make things worse, have to be self-sufficient, not relying on other rules), not regexes. I actually wasn't aware of the /regexp/ = variable functionality. Is that documented anywhere?

alexanderkoller commented 6 years ago

@timobaumann What? Where would you put the grammar rules? This is surprising - can you show me a dialog?

Yes, the regex pattern is documented in the manual: https://github.com/dialogos-project/dialogos/wiki/Patterns

As far as I can tell, the "input patterns" field in both fixed and dynamic grammars supports arbitrary patterns from that page. The "input words" or "keywords" or whatever field for DirectGrammar does not.

timobaumann commented 6 years ago

Great, I should eventually start reading this documentation :-)

alexanderkoller commented 6 years ago

Lots of good stuff in there. :)