dictation-toolbox / dragonfly

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
GNU Lesser General Public License v3.0

get_current_engine() says Sapi5InProcEngine() even tho I have loaded Kaldi #330

Closed: Treshank closed this issue 3 years ago

Treshank commented 3 years ago

I'm not sure whether this is a bug or a mistake on my part, but whenever I import Grammar in a separate file and define functions there that main.py calls to load the grammars, the engine changes from Kaldi to SAPI 5, and I also get an error saying that Natlink requires 32-bit Python. After that, no speech is detected at all. However, if I create, add and load the grammars directly in main.py, everything works.

main.py is the same as kaldi_module_loader_plus.py, except that it imports loadL1 and unloadL1 and calls them when the system wakes up and goes to sleep.

I also see this issue when I use Integer or IntegerRef.


# main.py
"""
Command-module loader for Kaldi.

This script is based on 'dfly-loader-wsr.py' written by Christo Butcher and
has been adapted to work with the Kaldi engine instead.

This script can be used to look for Dragonfly command-modules for use with
the Kaldi engine. It scans the directory it's in and loads any ``_*.py`` it
finds.
"""

# TODO Have a simple GUI for pausing, resuming, cancelling and stopping
# recognition, etc

from __future__ import print_function

import logging
import os.path
import sys

import six

from dragonfly import get_engine, get_current_engine
from dragonfly import Grammar, MappingRule, Function, Dictation, FuncContext
from dragonfly.loader import CommandModuleDirectory
from dragonfly.log import setup_log
# from CustomGrammars import l1Grammars
from grammarContoller import loadL1, unloadL1

# --------------------------------------------------------------------------
# Set up basic logging.

if False:
    # Debugging logging for reporting trouble
    logging.basicConfig(level=10)
    logging.getLogger('grammar.decode').setLevel(20)
    logging.getLogger('grammar.begin').setLevel(20)
    logging.getLogger('compound').setLevel(20)
    logging.getLogger('kaldi.compiler').setLevel(10)
else:
    setup_log()

# --------------------------------------------------------------------------
# User notification / rudimentary UI. MODIFY AS DESIRED

# For message in ('sleep', 'wake')
def notify(message):
    if message == 'sleep':
        print("Sleeping...")
        # get_engine().speak("Sleeping")
    elif message == 'wake':
        print("Awake...")
        # get_engine().speak("Awake")

# --------------------------------------------------------------------------
# Sleep/wake grammar. (This can be unused or removed if you don't want it.)

sleeping = False

def load_sleep_wake_grammar(initial_awake):
    sleep_grammar = Grammar("sleep")
    def sleep(force=False):
        global sleeping
        if not sleeping or force:
            sleeping = True
            unloadL1()
            sleep_grammar.set_exclusiveness(True)
        notify('sleep')

    def wake(force=False):
        global sleeping
        if sleeping or force:
            sleeping = False
            loadL1()
            sleep_grammar.set_exclusiveness(False)
        notify('wake')

    class SleepRule(MappingRule):
        mapping = {
            "start listening":  Function(wake) + Function(lambda: get_engine().start_saving_adaptation_state()),
            "stop listening":   Function(lambda: get_engine().stop_saving_adaptation_state()) + Function(sleep),
            "halt listening":   Function(lambda: get_engine().stop_saving_adaptation_state()) + Function(sleep),
        }
    sleep_grammar.add_rule(SleepRule())

    sleep_noise_rule = MappingRule(
        name = "sleep_noise_rule",
        mapping = { "<text>": Function(lambda text: False and print(text)) },
        extras = [ Dictation("text") ],
        context = FuncContext(lambda: sleeping),
    )
    sleep_grammar.add_rule(sleep_noise_rule)

    sleep_grammar.load()

    if initial_awake:
        wake(force=True)
    else:
        sleep(force=True)

# --------------------------------------------------------------------------
# Main event driving loop.

def main():
    logging.basicConfig(level=logging.INFO)

    try:
        path = os.path.dirname(__file__)
    except NameError:
        # The "__file__" name is not always available, for example
        # when this module is run from PythonWin.  In this case we
        # simply use the current working directory.
        path = os.getcwd()
        __file__ = os.path.join(path, "kaldi_module_loader_plus.py")

    # Set any configuration options here as keyword arguments.
    # See Kaldi engine documentation for all available options and more info.
    engine = get_engine('kaldi',
        model_dir='../kaldi_model',  # default model directory
        # vad_aggressiveness=3,  # default aggressiveness of VAD
        # vad_padding_start_ms=150,  # default ms of required silence before VAD
        # vad_padding_end_ms=150,  # default ms of required silence after VAD
        # vad_complex_padding_end_ms=500,  # default ms of required silence after VAD for complex utterances
        # input_device_index=None,  # set to an int to choose a non-default microphone
        lazy_compilation=True,  # set to True to parallelize & speed up loading
        # retain_dir=None,  # set to a writable directory path to retain recognition metadata and/or audio data
        # retain_audio=None,  # set to True to retain speech data wave files in the retain_dir (if set)
    )

    # Call connect() now that the engine configuration is set.
    engine.connect()

    # Load grammars.
    load_sleep_wake_grammar(True)

    directory = CommandModuleDirectory(path, excludes=[__file__])
    directory.load()

    # Define recognition callback functions.
    def on_begin():
        print("Speech start detected.")

    def on_recognition(words):
        message = u"Recognized: %s" % u" ".join(words)

        # This only seems to be an issue with Python 2.7 on Windows.
        if six.PY2:
            encoding = sys.stdout.encoding or "ascii"
            message = message.encode(encoding, errors='replace')
        print(message)

    def on_failure():
        print("Sorry, what was that?")

    # Start the engine's main recognition loop
    engine.prepare_for_recognition()
    try:
        print("Listening...")
        print(get_current_engine())
        engine.do_recognition(on_begin, on_recognition, on_failure)
    except KeyboardInterrupt:
        pass

if __name__ == "__main__":
    main()

# grammarContoller.py

from dragonfly import Grammar
from CustomGrammars import l1Grammars

l1_grammar = Grammar('Level 1 Grammar')
l1_grammar.add_rule(l1Grammars.MainGrammarRules())
l1_grammar.add_rule(l1Grammars.FunctionGrammars())

def unloadL1():
    l1_grammar.unload()

def loadL1():
    l1_grammar.load()

daanzu commented 3 years ago

I haven't tested this myself to verify it, but I think your problem may be that you are importing and building the grammars before your first call to get_engine. What if you move the import line to after that call? By the way, you don't actually need to load and unload your grammars for wake/sleep: the sleep grammar is marked as exclusive, so while it is asleep it effectively disables all other grammars that are not also marked as exclusive.
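
The ordering problem described above can be illustrated with a small stand-in model. This is a sketch under stated assumptions, not dragonfly's actual implementation: the `Engine`, `Grammar`, `get_engine` and `reset` definitions below are hypothetical simplifications, and the model assumes only that constructing a grammar without an explicit engine triggers a default engine lookup whose result is then cached.

```python
# Simplified stand-in for the behaviour discussed in this thread.
# This is NOT dragonfly's real code; every name here is hypothetical.

_default_engine = None  # engine cached by the first get_engine() call


class Engine(object):
    def __init__(self, name):
        self.name = name


def get_engine(name=None):
    """Return the cached engine, creating one on first use."""
    global _default_engine
    if _default_engine is None:
        # Assumption: with no explicit name, a default back-end is picked.
        _default_engine = Engine(name or "sapi5inproc")
    return _default_engine


class Grammar(object):
    def __init__(self, name):
        self.name = name
        # Assumption: building a grammar implicitly initialises an engine.
        self.engine = get_engine()


def reset():
    """Forget the cached engine so both orderings can be compared."""
    global _default_engine
    _default_engine = None


# Order 1: the grammar is built (for example at import time) before the
# engine has been chosen explicitly.
reset()
early = Grammar("Level 1 Grammar")
engine = get_engine("kaldi")
order_1 = (early.engine.name, engine.name)

# Order 2: the engine is chosen first and the grammar is built afterwards.
reset()
engine = get_engine("kaldi")
late = Grammar("Level 1 Grammar")
order_2 = (late.engine.name, engine.name)

print(order_1)
print(order_2)
```

In terms of the files posted above, the second ordering corresponds to moving `from grammarContoller import loadL1, unloadL1` so that it runs only after the `get_engine('kaldi', ...)` call in main.py.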

Treshank commented 3 years ago

That seems to have solved the problem. I'll test it a little more and then close the issue.

Treshank commented 3 years ago

Thanks @daanzu, that solved the problem.

drmfinlay commented 3 years ago

Hmm. I realise this has been solved, but perhaps the get_engine() function should be adjusted. I think the Kaldi engine should be preferred over Sphinx and SAPI 5, if it is available.

@daanzu What do you think?

daanzu commented 3 years ago

@Danesprite I think that makes sense and would be a good change. Also, I wonder whether there should be a conspicuous warning message when get_engine is called implicitly, for example because a grammar is loaded? This has definitely happened at other times and been a bit confusing.

drmfinlay commented 3 years ago

Okay then, I'll open a PR for it. An info message also sounds like a good idea to me. I'll change get_engine() to log a message when an SR engine is decided on and returned by the function. The message should only appear once when the engine is initialised.
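
The two ideas proposed in this thread, preferring Kaldi when it is available and logging a message once when an engine is decided on, could look roughly like the sketch below. It is only an illustration, not the change that was made to dragonfly: `choose_engine`, `PREFERENCE`, the logger name and the `ListHandler` helper are all hypothetical, and the relative order of SAPI 5 and Sphinx is an arbitrary assumption.

```python
import logging

log = logging.getLogger("engine.sketch")  # hypothetical logger name

# Hypothetical preference order: Kaldi ahead of SAPI 5 and Sphinx.
PREFERENCE = ("kaldi", "sapi5", "sphinx")

_chosen = None  # name of the engine decided on, once a decision is made


def choose_engine(available, name=None):
    """Pick an engine name and log the decision the first time only."""
    global _chosen
    if _chosen is not None:
        # Already decided: return the cached choice without logging again.
        return _chosen
    if name is not None:
        if name not in available:
            raise ValueError("engine %r is not available" % name)
        _chosen = name
    else:
        for candidate in PREFERENCE:
            if candidate in available:
                _chosen = candidate
                break
        else:
            raise RuntimeError("no usable engine found")
    log.info("Initialized SR engine: %s", _chosen)
    return _chosen


class ListHandler(logging.Handler):
    """Collects log messages in a list so they can be inspected."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
log.addHandler(handler)
log.setLevel(logging.INFO)

first = choose_engine({"sapi5", "sphinx", "kaldi"})
second = choose_engine({"sapi5", "sphinx", "kaldi"})
print(first, second, handler.messages)
```

In this sketch a later call with a different name still returns the cached choice; whether that is the desired behaviour is a separate design question.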