Dragonfly Failed to decode recognition

LexiconCode commented 4 years ago

This happens when the engine successfully recognizes a grammar rule, but Dragonfly can't work out which rule was recognized. I have personally experienced this issue only once. However, at least two or three people experience it 100% of the time on fresh installs, with commands that use a dictation element.

Users experience this with other commands too, but here is an example: gum bow <some words> brunt

One user reports that the command's debug output suggests "some words" is being treated by Dragonfly as command words rather than dictation.

[('gum', 0), ('bow', 0), ('some', 0), ('words', 0), ('brunt', 0)]

The expected debug output should look like the following, where "some words" is recognized as free dictation. That is, ("some", 0), ("words", 0) would instead be ("some", 1000000), ("words", 1000000):

[('gum', 0), ('bow', 0), ('some', 1000000), ('words', 1000000), ('brunt', 0)]

I verified that "some words" is not a command elsewhere.
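
To make the difference concrete, here is a minimal sketch in plain Python (no Dragonfly or Natlink required) that splits a recognition into command words and dictation words. It assumes only what the logs above show, namely that dictated words carry the rule ID 1000000; the helper name split_words is made up for illustration.

```python
# Rule ID attached to free-dictation words, as seen in the expected output above.
DICTATION_RULE_ID = 1000000


def split_words(recognition):
    """Split (word, rule_id) pairs into (command_words, dictation_words)."""
    commands = [word for word, rule_id in recognition
                if rule_id != DICTATION_RULE_ID]
    dictation = [word for word, rule_id in recognition
                 if rule_id == DICTATION_RULE_ID]
    return commands, dictation


# What the affected users see: every word is tagged as a command word.
reported = [('gum', 0), ('bow', 0), ('some', 0), ('words', 0), ('brunt', 0)]

# What is expected: "some words" is tagged as dictation.
expected = [('gum', 0), ('bow', 0), ('some', 1000000),
            ('words', 1000000), ('brunt', 0)]

print(split_words(reported))
print(split_words(expected))
```

With the reported values the dictation list comes back empty, so the decoder has nothing to match the dictation element against.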

Suspect debug sample

DEBUG:engine:Grammar ccr-1: received recognition [('gum', 0), ('bow', 0), ('some', 0), ('words', 0), ('brunt', 0)].
DEBUG:grammar.decode:   attempt: RepeatRule(Repeater1)
DEBUG:grammar.decode:    -- Decoding State: ' >> gum bow some words brunt'
DEBUG:grammar.decode:      attempt: Compound(u'[<original> original] [<caster_base_sequence>] [terminal <terminal>]')
DEBUG:grammar.decode:         attempt: Sequence(...)
.
.
.
DEBUG:grammar.decode:                                    attempt: Alternative(...)
DEBUG:grammar.decode:                                       attempt: Sequence(...)
DEBUG:grammar.decode:                                          attempt: Choice(..., name='capitalization')
DEBUG:grammar.decode:                                             attempt: Compound(u'cop')
DEBUG:grammar.decode:                                                attempt: Literal([u'cop'])
DEBUG:grammar.decode:                                                failure: Literal([u'cop'])
DEBUG:grammar.decode:                                             rollback: Compound(u'cop')
.
.
.

For comparison, here are known-good debug logs without the reported issue. The three files cover the following configurations: Vanilla is the unaltered formatting command, Modified is the modified version, and the third is the modified version run with the test engine.

All the logs begin with an utterance and end with execution of the command in DNS.
Text-formatting-commands---modified---test-engine.txt
Text-formatting-commands---modified.txt
Text-formatting-commands---vanilla.txt

The issue is also reproduced with other commands. For example, "say hello how are you" should produce "hello how are you", but instead fails with: failed to decode recognition (u'say', u'hello', u'how', u'are', u'you')

Ultimately, those values are passed by Natlink to the results callbacks of grammar objects. Could there be a bug in Natlink returning the wrong values, or something else along those lines?

LexiconCode commented 4 years ago

All right thanks to @tlappas we have a few more details.

A complete, untruncated debug log of the failing command: debug-decode-error.txt
The issue occurs regardless of whether the command is in a MappingRule or in a CCR MergeRule, which uses Dragonfly's repetition. This should rule out Dragonfly's repetition optimization as the cause.
Reports differ on whether the decoding error depends on Natlink running in or out of process.
- @tlappas reports that it works as expected in process.
- @OwenMyers reports that he still experiences the error both in and out of process.

drmfinlay commented 4 years ago

Thanks for putting this together @LexiconCode! :+1:

I have been trying to figure out if this is a bug with Natlink or Dragonfly. So far, I can't find anything obvious that could be causing this. Natlink's code for results just forwards the rule integer value (dwCFGParse) given by Dragon to the results callbacks of grammars. It doesn't change the value, and neither does Dragonfly's code.

The only possibilities I see here are: 1) this is a Dragon bug that only occurs for some people; or 2) these words are somehow being passed along with the grammar.

For case two, I put together some code that constructs the set of all command words in loaded grammars. It might be of some use.

from dragonfly import get_engine
from dragonfly.grammar.elements_basic import Literal, ListBase

# Collect every command phrase used by the loaded grammars.
all_commands = set()
for grammar in get_engine().grammars:
    commands = set()
    for rule in grammar.rules:
        for element in grammar._get_element_list(rule):
            if isinstance(element, Literal):
                # Literal elements hold their words as a list.
                commands.add(" ".join(element.words))
            elif isinstance(element, ListBase):
                # Sets have no extend() method; update() adds each list item.
                commands.update(element.list.get_list_items())
        all_commands.update(commands)

    print("Grammar {!r} has {} unique command phrases.".format(grammar.name,
                                                               len(commands)))

print("Total unique command phrases: {}.".format(len(all_commands)))

# Check whether the dictated phrase is also a command phrase somewhere.
print("Command phrases include {!r}: {}".format("some words",
                                                "some words" in all_commands))

drmfinlay commented 4 years ago

I'm going to partially fix this by checking recognised words against a set of the grammar and list words. That should fix the problem for most cases.
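
A rough sketch of the idea in plain Python, not the actual patch: any recognized word that is not in the set of grammar and list words gets retagged as dictation before decoding. The function name retag_recognition and the example word set are hypothetical, and the dictation rule ID of 1000000 is taken from the logs earlier in this thread.

```python
# Rule ID used for free-dictation words in the logs earlier in this thread.
DICTATION_RULE_ID = 1000000


def retag_recognition(recognition, grammar_words):
    """Return a copy of the recognition in which every word that is not a
    known grammar or list word is tagged with the dictation rule ID.

    recognition   -- list of (word, rule_id) pairs, as passed to grammars
    grammar_words -- set of all words used by the grammar's rules and lists
    """
    return [
        (word, rule_id if word in grammar_words else DICTATION_RULE_ID)
        for word, rule_id in recognition
    ]


# Hypothetical word set for a grammar containing "gum bow <dictation> brunt".
grammar_words = {"gum", "bow", "brunt"}

bad = [('gum', 0), ('bow', 0), ('some', 0), ('words', 0), ('brunt', 0)]
fixed = retag_recognition(bad, grammar_words)
print(fixed)
```

This cannot help when a dictated word also happens to be a command word in the same grammar, which is why the fix is only partial.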

Ideally, the parser should be rewritten, but doing so would be quite difficult and time-consuming. Repetition elements and CCR make this a difficult problem to solve efficiently.

drmfinlay commented 4 years ago

This will be mostly fixed in release version 0.25.0.

I have removed the NatLink label because this is a quirk or bug with Dragon. Ryan Hileman has kindly confirmed this for us on Gitter.

dictation-toolbox / dragonfly

Dragonfly Failed to decode recognition #242