dictation-toolbox / dragonfly

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
GNU Lesser General Public License v3.0

Receiving and storing text with Python #379

Closed MathewYaldo closed 2 months ago

MathewYaldo commented 9 months ago

I understand that this tool is primarily for creating custom commands, but is it also possible to just receive the speech output and store it with Python?

I want to be able to talk into a microphone and process or store all of the speech with Python, but I couldn’t find anything about this on the Dragon website, so I was wondering whether this tool could help with something like that. The primary intention would be not to use the text for commands but just to transcribe all the speech and store it.

drmfinlay commented 9 months ago

Hello Mathew,

Thank you for opening this issue.

Dragonfly can be used to transcribe speech into text for you. Try placing the code below into a new file _transcriber.py in the Natlink MacroSystem folder. You'll have to toggle the microphone to load the module. Your speech should be transcribed into the dfly transcript.txt file in your home folder.

import os
from dragonfly import CompoundRule, Dictation, Grammar

class TranscriberRule(CompoundRule):
    spec = "<text>"
    extras = [Dictation("text")]
    out_dir = os.path.expanduser('~')
    out_file = os.path.join(out_dir, "dfly transcript.txt")

    def _process_recognition(self, node, extras):
        # Retrieve the recognized text and format it.
        text = extras.get("text").format()

        # Append the text to *out_file*.
        with open(self.out_file, "a") as f:
            f.write(text + "\n")

# Create a new Grammar object and add a TranscriberRule instance to it.
grammar = Grammar("Transcriber grammar")
grammar.add_rule(TranscriberRule())

# Load the grammar and set it as exclusive, meaning that the engine will
#  only recognize from this grammar (and any other exclusive grammar).
grammar.load()
grammar.set_exclusiveness(True)

# Unload function which will be called by natlink at unload time.
def unload():
    global grammar
    if grammar: grammar.unload()
    grammar = None
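
If you would rather process the text before storing it, the file-writing step can be moved into a small helper. The following is only a sketch using the standard library: append_transcript and its timestamp format are illustrative names and choices of mine, not part of dragonfly. It skips empty utterances and prefixes each line with the time it was written.

```python
import io
import os
import time

# Same location as in the rule above: a file in the user's home folder.
DEFAULT_OUT_FILE = os.path.join(os.path.expanduser("~"), "dfly transcript.txt")


def append_transcript(text, out_file=DEFAULT_OUT_FILE, timestamp=None):
    """Append one utterance to out_file, prefixed with a timestamp.

    Hypothetical helper, not a dragonfly API. Returns True if a line was
    written and False if the text was empty after stripping whitespace.
    """
    text = text.strip()
    if not text:
        return False
    if timestamp is None:
        timestamp = time.strftime("%Y-%m-%d %H:%M:%S")
    # io.open accepts an encoding argument on both Python 2.7 and Python 3.
    with io.open(out_file, "a", encoding="utf-8") as f:
        f.write(u"[%s] %s\n" % (timestamp, text))
    return True
```

Inside _process_recognition you could then call append_transcript(extras["text"].format(), self.out_file) instead of opening the file directly.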

Dragon does ship with a transcription program, by the way. It is listed in the start menu as AutoTranscribe Folder Agent. Perhaps you missed it? It does require separate recording software, so it might not be what you want.

MathewYaldo commented 9 months ago

@drmfinlay Thank you, this looks like what I was looking for. I just have a couple more questions about setting everything up. I have looked at #337, but I am confused about what the right environment setup should be.

I have also noticed that the docs at https://dragonfly2.readthedocs.io/en/latest/installation.html mention Python 2.7 32-bit and downloading some Natlink exe files through SourceForge. Should I try to follow this? I'm not sure whether this would all work with Dragon 15, which is what I am on.

LexiconCode commented 9 months ago

@MathewYaldo

I've written about how to use the new setup for Natlink. Most of the documentation there applies to dragonfly, except for the Caster package requirements. You still have to use a 32-bit Python environment. The new Natlink installer handles it for you. https://github.com/dictation-toolbox/Caster/issues/911
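
If you are unsure which interpreter you ended up with, a quick check along these lines may help. It is a minimal sketch using only the standard library, and python_bits is just an illustrative name, not something provided by Natlink or dragonfly.

```python
import struct
import sys


def python_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python %s, %d-bit" % (sys.version.split()[0], python_bits()))
```

Run it with the Python you intend to use for Natlink. If it reports 64-bit, that interpreter is not the 32-bit one required here.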