jwebmeister / tacspeak

Tacspeak - Fast, lightweight, modular speech recognition for gaming
GNU Affero General Public License v3.0

Tacspeak

Fast, lightweight, modular - speech recognition for gaming

GithubDownloads NexusmodsModPage Discord

Donate Donate

Introduction

Tacspeak is designed specifically for recognising speech commands while playing games, particularly games that are hungry for system resources and FPS!

Fast - typically 10-50 ms from the detected end of speech (voice activity detection, VAD) to the action.

Lightweight - it runs on CPU, with ~2GB RAM.

Modular - you can build your own set of voice commands for additional games, or modify existing ones.

Open source - you can modify any part of Tacspeak for yourself, and/or contribute back to the project and help build it as part of the community.

Watch the video demo of me using Tacspeak while playing Ready or Not


Tacspeak is built atop the excellent Dragonfly speech recognition framework for Python.

It is also built atop the excellent Kaldi Active Grammar, which provides the (also excellent) Kaldi engine backend and model for Dragonfly.

Requirements

Basic install - packaged executable

Watch the video demo of me downloading and installing Tacspeak

  1. Download and install Microsoft Visual C++ Redistributable
  2. Download the latest release, including both of the following (they are separate downloads and/or releases):
    • the Tacspeak application .zip (includes .exe)
    • a pre-trained Kaldi model .zip (includes kaldi_model folder).
  3. Extract the Tacspeak application .zip into a folder, and extract the Kaldi model .zip into the same folder. Check that the following files exist:
    • ./tacspeak.exe
    • ./tacspeak/user_settings.py
    • ./tacspeak/grammar/_readyornot.py
    • ./kaldi_model/Dictation.fst - if not, you need to download and extract the pre-trained model
  4. Run the executable Tacspeak/tacspeak.exe :)
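
The file check in step 3 can be automated. The following is a minimal sketch, not part of Tacspeak itself: the helper name missing_files is made up for illustration, and the file list is copied from the steps above.

```python
from pathlib import Path

# Files the install steps say should exist after extracting both .zip files.
EXPECTED_FILES = [
    "tacspeak.exe",
    "tacspeak/user_settings.py",
    "tacspeak/grammar/_readyornot.py",
    "kaldi_model/Dictation.fst",
]


def missing_files(install_dir):
    """Return the expected files that are not present under install_dir.

    Hypothetical helper for illustration; Tacspeak does not ship it.
    """
    root = Path(install_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from inside the folder you extracted both .zip files into.
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files found.")
```

If kaldi_model/Dictation.fst is reported as missing, the pre-trained model .zip has most likely not been extracted into the same folder yet.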

Usage

Basic

Watch the video: Tacspeak getting started guide - how to use & change settings (basic)

Run tacspeak.exe (or python ./cli.py) and it will...

Also:

Important advisory

Please use caution and your own discretion when installing or using any third-party files, specifically *.py files. Don't install or use files from untrustworthy sources.

Tacspeak automatically loads (and executes) ./tacspeak/user_settings.py and all modules matching ./tacspeak/grammar/_*.py, regardless of what code they contain.
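
Before running Tacspeak with third-party files, it can help to list exactly which files match the paths named above so you can review them. This is a rough sketch under the assumption that only those two path patterns matter; the function name autoloaded_files is invented here and is not part of Tacspeak.

```python
from pathlib import Path


def autoloaded_files(install_dir):
    """List the .py files matching the paths Tacspeak is described as loading.

    Hypothetical helper: it only mirrors the two path patterns named in the
    advisory and does not inspect Tacspeak's actual loading code.
    """
    root = Path(install_dir) / "tacspeak"
    found = []
    settings = root / "user_settings.py"
    if settings.is_file():
        found.append(settings)
    # Grammar modules: every file matching ./tacspeak/grammar/_*.py
    found.extend(sorted((root / "grammar").glob("_*.py")))
    return found


if __name__ == "__main__":
    for path in autoloaded_files("."):
        print(path)
```

Any file in that list that you did not write or do not recognise is worth opening in a text editor before you start Tacspeak.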

User settings

It is highly recommended to review and adjust ./tacspeak/user_settings.py to your liking.

Open ./tacspeak/user_settings.py in a text editor, change the settings, then save and overwrite the file. There are comments in the file explaining most of the important settings.

For example, you might want to change these:

Grammar modules

It is likely you will want to modify or customise some of the existing Tacspeak grammar modules (if not also add your own!), which you can do by editing the ./tacspeak/grammar/_*.py file corresponding to the application you're interested in.

As an example, in the Ready or Not module you can change ingame_key_bindings to align the Tacspeak module with your in-game keybindings.
You could also change the words and/or sentences used for recognising speech commands, for example, adding "smoke it out" as an alternative to "breach and clear".
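
To illustrate the idea only, here is a plain-Python sketch of the two kinds of change described above. It does not use Dragonfly's API and is not copied from the Ready or Not module: the name ingame_key_bindings comes from the text, but its entries, the spoken_commands dict, and the command_for helper are all hypothetical.

```python
# Hypothetical sketch: the real names and structure in _readyornot.py may differ.

# Map an in-game action to the key you have bound to it in the game.
# The action names and keys below are made-up examples.
ingame_key_bindings = {
    "open_command_menu": "mouse_middle",
    "hold_fire": "f",
}

# Map each spoken phrase to the command it should trigger.
# Several phrases can point at the same command.
spoken_commands = {
    "breach and clear": "breach_and_clear",
    "smoke it out": "breach_and_clear",  # added alternative phrase
}


def command_for(phrase):
    """Look up the command for a recognised phrase, or None if unknown."""
    return spoken_commands.get(phrase.lower().strip())


if __name__ == "__main__":
    print(command_for("Smoke it out"))
    print(ingame_key_bindings["hold_fire"])
```

The point of the sketch is that an alternative phrase is just one more entry pointing at an existing command, while a keybinding change only touches the value on the right-hand side.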

Additional notes:

Models

See kaldi_model/README.md for more information.

Troubleshooting

Things to check or try first:

Advanced install - Python

Prerequisites:

  1. Microsoft Visual C++ Redistributable installed
  2. Python 3.11 installed

Steps:

  1. Clone this repo into a folder, e.g. Tacspeak/.
  2. Download a pre-trained Kaldi model .zip from the latest release and extract into the cloned project folder, e.g. Tacspeak/kaldi_model/ after extraction.
  3. Open the Tacspeak/ folder in PowerShell (or equivalent).
  4. Using a virtual environment is strongly recommended, e.g.
    • create within Tacspeak folder: python -m venv "./.venv"
    • activate within Tacspeak folder: ./.venv/Scripts/Activate.ps1
  5. Install required packages via pip
    • pip install -r requirements.txt
  6. Done! You should now be able to run Tacspeak via python ./cli.py
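
If python ./cli.py fails straight away, a quick sanity check of the interpreter can rule out the most common setup slips. The sketch below assumes only what the prerequisites state (Python 3.11, ideally inside a virtual environment); the function names are illustrative, and other Python versions may or may not work.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """True when the interpreter is the Python 3.11 listed in the prerequisites."""
    return tuple(version_info[:2]) == (3, 11)


def in_virtual_env():
    """True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python 3.11:", is_supported_python())
    print("Virtual environment active:", in_virtual_env())
```

Run it with the same python command you use for cli.py, after activating the virtual environment.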

Build instructions

Prerequisites:

  1. Microsoft Visual C++ Redistributable installed
  2. Python 3.11 installed
  3. A compatible compiler for cx_freeze installed,
    • Only tested with Visual Studio 2022 (MSVC)
  4. (Optional, but necessary for releases) PortAudio v19.7.0, portaudio_x64.dll, build from source here using docs here, or download here

Steps - Option 1

  1. Clone this repo into a folder, e.g. Tacspeak/.
  2. Open the Tacspeak/ folder in PowerShell, keep it as your current working directory.
  3. Create and activate a python virtual environment in directory ./.venv, e.g.
    • create within Tacspeak folder: python -m venv "./.venv"
    • activate within Tacspeak folder: ./.venv/Scripts/Activate.ps1
  4. Run scripts\setup_for_build.ps1 in PowerShell. This downloads and installs the dependencies by running the following scripts:
    • scripts\pip_reinstall_all.ps1
    • scripts\download_replace_portaudio_x64_dll.ps1
    • scripts\download_extract_model.ps1
    • scripts\move_extracted_model.ps1
    • scripts\generate_all_licenses.ps1
  5. Run scripts\build_app.ps1 in PowerShell

Steps - Option 2

  1. Clone this repo into a folder, e.g. Tacspeak/.
  2. Download a pre-trained Kaldi model .zip from the latest release and extract into the cloned project folder, e.g. Tacspeak/kaldi_model/.
  3. Open the Tacspeak/ folder in PowerShell (or equivalent).
  4. Using a virtual environment is strongly recommended, e.g.
    • create within Tacspeak folder: python -m venv "./.venv"
    • activate within Tacspeak folder: ./.venv/Scripts/Activate.ps1
  5. Install required packages via pip
    • pip install -r requirements.txt
  6. (Optional, but necessary for releases) Rename portaudio_x64.dll to libportaudio64bit.dll, then copy it over the existing file located at ./.venv/Lib/site-packages/_sounddevice_data/portaudio-binaries/libportaudio64bit.dll.
  7. Build via setup.py
    • python setup.py build
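
The optional rename-and-copy in step 6 can be scripted. This is a minimal sketch rather than a project script: the helper name replace_portaudio_dll is invented, the target path inside the virtual environment is taken from step 6, and the virtual environment folder is passed in explicitly instead of being assumed.

```python
import shutil
from pathlib import Path

# Location of the bundled DLL relative to the virtual environment root (from step 6).
TARGET_RELATIVE = Path(
    "Lib/site-packages/_sounddevice_data/portaudio-binaries/libportaudio64bit.dll"
)


def replace_portaudio_dll(source_dll, venv_dir):
    """Copy source_dll over the bundled PortAudio DLL, renaming it on the way.

    Hypothetical helper for illustration; returns the path that was written.
    """
    target = Path(venv_dir) / TARGET_RELATIVE
    if not target.parent.is_dir():
        # Packages are probably not installed in this virtual environment yet.
        raise FileNotFoundError(f"Expected folder not found: {target.parent}")
    shutil.copyfile(source_dll, target)
    return target
```

For example, replace_portaudio_dll("portaudio_x64.dll", ".venv") would apply it to a virtual environment created in the Tacspeak folder as described in step 4; do this after the pip install step and before python setup.py build.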

Motivation

I built Tacspeak because I was fed up with how inaccurate the Windows Speech Recognition engine was with my voice, even after training. None of the alternatives I tested (there were many) fit exactly what I wanted from speech recognition while gaming.

What I learned from my research and testing:

Tacspeak isn't perfect, but it is a very strong option, precisely because it can be so highly customised to your specific commands, for your specific application.

Contributing

Issues, suggestions, and feature requests are welcome.

Pull requests are considered, but be warned the project structure is in flux and there may be breaking changes to come.
We'd also like some (TBD) quality testing to be done on grammar modules before they're brought into the project. If you can help define what we mean by "some (TBD) quality testing"... well, trailblazers are welcome!

Tacspeak uses a modified version of Dragonfly located at jwebmeister/dragonfly. This is where the heart of the beast (bugs) lives... please help slay it!
Also, be warned the project structure is in flux and there may be breaking changes there too.

You can also consider supporting the projects Tacspeak is built upon, dictation-toolbox/dragonfly and daanzu/kaldi-active-grammar.

Any and all donations are very much appreciated and help encourage development.

Author

License

This project is licensed under the GNU Affero General Public License v3 (AGPL-3.0-or-later). See the LICENSE.txt file for details.

Acknowledgments