Fast, lightweight, modular - speech recognition for gaming
Tacspeak has been designed specifically for recognising speech commands while playing games, particularly system resource and FPS hungry games!
- Fast - typically on the order of 10-50 ms, from detected speech end (VAD) to action.
- Lightweight - it runs on CPU, with ~2 GB RAM.
- Modular - you can build your own set of voice commands for additional games, or modify existing ones.
- Open source - you can modify any part of Tacspeak for yourself, and/or contribute back to the project and help build it as part of the community.
Tacspeak is built atop the excellent Dragonfly speech recognition framework for Python. It is also built atop the excellent Kaldi Active Grammar, which provides the (also excellent) Kaldi engine backend and model for Dragonfly.
Check that the following files exist:
- `./tacspeak.exe`
- `./tacspeak/user_settings.py`
- `./tacspeak/grammar/_readyornot.py`
- `./kaldi_model/Dictation.fst` - if not, you need to download and extract the pre-trained model

Run `Tacspeak/tacspeak.exe` :)

Run `tacspeak.exe` (or `python ./cli.py`) and it will:
- load `./tacspeak/user_settings.py`
- load all grammar modules `./tacspeak/grammar/_*.py`, then activate the relevant modules
- wait for `listen_key` to be activated if it's specified, depending on `listen_key_toggle` (toggle mode)

Also:
- it's recommended to review the `tacspeak.exe` settings in `./tacspeak/user_settings.py` and adjust them to your liking
- note the default settings in the grammar modules `./tacspeak/grammar/_*.py`, e.g. keybindings.
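As an illustrative sketch (the setting names below are the ones discussed in this README; the shipped `./tacspeak/user_settings.py` is the authoritative reference, and its exact structure may differ), a customised settings file might look like:

```python
# Illustrative sketch of values in ./tacspeak/user_settings.py.
# Check the shipped file for the authoritative names and structure.

listen_key = 0x05            # 0x05 = mouse thumb button 1, 0x10 = Shift key,
                             # None = always-listening mode (VAD detects speech end)
listen_key_toggle = -1       # -1 = listen while key held, plus always-on priority grammar
listen_key_padding_end_ms_min = 1    # recommended 1 when listen_key_toggle is 0 or -1
listen_key_padding_end_ms_max = 170  # recommended 170 when listen_key_toggle is 0 or -1
vad_padding_end_ms = 150     # recommended 150 when listen_key_toggle is 0 or -1
USE_NOISE_SINK = True        # try False if recognition accuracy suffers
retain_dir = "./retain/"     # where retained audio and metadata are written
retain_audio = True          # keep audio clips of recognised commands
retain_metadata = True       # keep a .tsv of recognition metadata

def my_retain_func(audio_store_entry):
    """Hypothetical approval callback: return True to retain an entry.
    The real AudioStoreEntry attributes may differ."""
    return True

retain_approval_func = my_retain_func
```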
Please use caution and your own discretion when installing or using any third-party files, especially `*.py` files. Don't install or use files from untrustworthy sources.
Tacspeak automatically loads (and executes) `./tacspeak/user_settings.py` and all modules `./tacspeak/grammar/_*.py`, regardless of what code they contain.
It is highly recommended to review and adjust ./tacspeak/user_settings.py to your liking.
Open ./tacspeak/user_settings.py
in a text editor, change the settings, then save and overwrite the file. There are comments in the file explaining most of the important settings.
For example, you might want to change these:
- `listen_key=0x05` - `0x05` = mouse thumb button 1, `0x10` = Shift key. `None` overrides `listen_key_toggle` and sets it into always-listening mode; uses the Voice Activity Detector (VAD) to detect the end of speech and recognise commands.
- `listen_key_toggle=-1` - one of `0`, `1`, `2`, or `-1`:
  - `0` for toggle mode off: listen only while the key is pressed; must release the key for the command to be recognised.
  - `1` for toggle mode on: key press toggles listening on/off; must toggle off for the command to be recognised.
  - `2` for global toggle mode on: key press toggles listening on/off, but it uses the VAD to detect the end of speech and recognise commands, so you don't have to toggle off to recognise commands.
  - `-1` for toggle mode off + priority: listen only while the key is pressed, except always listen for the priority grammar ("freeze!") even when the key is not pressed.
- `listen_key_padding_end_ms_min=1` - recommended `1` if using `listen_key_toggle` `0` or `-1`; set to `0` for anything else. Minimum time to keep capturing audio after `listen_key` is released (or toggled off), after which, if the VAD detects silence, it will stop capturing.
- `listen_key_padding_end_ms_max=170` - recommended `170` if using `listen_key_toggle` `0` or `-1`; set to `0` for anything else. Maximum time to keep capturing audio after `listen_key` is released (or toggled off), but will stop short if the VAD detects silence.
- `listen_key_padding_end_always_max=False` - if `True`, always capture `listen_key_padding_end_ms_max` of audio after `listen_key` is released (or toggled off).
- `vad_padding_end_ms=250` - recommended `150` if using `listen_key_toggle` `0` or `-1`; set to `250` for anything else.
- `audio_input_device=None` - audio input device to use.
- `USE_NOISE_SINK=True` - try `False` if you're having issues with recognition accuracy.
- `retain_dir=./retain/` - directory where retained audio and metadata are saved.
- `retain_audio=True` - retains audio in `retain_dir`. Disabled by default.
- `retain_metadata=True` - retains metadata in a `.tsv` file in `retain_dir`. Disabled by default.
- `retain_approval_func=my_retain_func` - a function that returns `True` or `False` based on `AudioStoreEntry` contents. Disabled by default.

It is likely you will want to modify or customise some of the existing Tacspeak grammar modules (if not also add your own!), which you can do by editing the `./tacspeak/grammar/_*.py`
file corresponding to the application you're interested in.
As an example, in the Ready or Not module you can change ingame_key_bindings
to align the Tacspeak module with your in-game keybindings.
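For instance (a hypothetical fragment; the action names, keys, and dictionary layout in the real `_readyornot.py` may differ), the idea is a plain mapping from in-game actions to keys:

```python
# Hypothetical fragment illustrating the idea behind ingame_key_bindings
# in ./tacspeak/grammar/_readyornot.py; the real module's action names,
# keys, and structure may differ.
ingame_key_bindings = {
    "gold": "f5",        # select gold team
    "blue": "f6",        # select blue team
    "cmd_default": "z",  # default command menu option
}

# If you rebind a key in-game, mirror the change here,
# e.g. gold team moved to F7:
ingame_key_bindings["gold"] = "f7"
```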
You could also change the words and/or sentences used for recognising speech commands, for example, adding "smoke it out" as an alternative to "breach and clear".
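In Dragonfly grammars, alternative phrasings are typically written with alternation in a rule's spec, e.g. `(breach and clear | smoke it out)`. A plain-Python sketch of the effect (names are illustrative, not from the real module):

```python
# Illustrative only: two spoken phrases lead to the same command action,
# mirroring a Dragonfly spec alternation like
# "(breach and clear | smoke it out)".
def order_breach_and_clear():
    return "breach_and_clear"

command_phrases = {
    "breach and clear": order_breach_and_clear,
    "smoke it out": order_breach_and_clear,  # newly added alternative
}
```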
Additional notes:
See kaldi_model/README.md for more information.
Things to check or try first:
- Check that the following files exist:
  - `./tacspeak.exe`
  - `./tacspeak/user_settings.py`
  - `./tacspeak/grammar/_readyornot.py`
  - `./kaldi_model/Dictation.fst` - if not, you need to download and extract the pre-trained model
- While `tacspeak.exe` is running, are you pressing `listen_key` (by default it is mouse thumb button), and does it show "Hot mic" in the console?
- Are you holding `listen_key` (default is mouse thumb button), speaking, then releasing after you finish speaking?
- Check the keybindings in `./tacspeak/grammar/_readyornot.py`. It's set for default game keybindings.
- If you've changed `./tacspeak/user_settings.py` or `./tacspeak/grammar/_readyornot.py`, keep it all default, try running `tacspeak.exe`.
- Run `./tacspeak.exe --print_mic_list` in PowerShell or command prompt, then set the `audio_input_device` setting in `./tacspeak/user_settings.py`.
- Review `./tacspeak/grammar/_readyornot.py`.
- Try setting `USE_NOISE_SINK` (in `./tacspeak/user_settings.py`) to `True` or `False` if you're getting too many false positive or false negative speech recognitions, respectively.
- Set `retain_dir`, `retain_audio` and `retain_metadata` (in `./tacspeak/user_settings.py`) appropriately.

To install and run from source:
1. Download and extract Tacspeak into `Tacspeak/`.
2. Download and extract the pre-trained model, so it is located at `Tacspeak/kaldi_model/` after extraction.
3. Open the `Tacspeak/` folder in PowerShell (or equivalent).
4. Create a virtual environment in the `Tacspeak` folder: `python -m venv "./.venv"`
5. Activate the virtual environment in the `Tacspeak` folder: `./.venv/Scripts/Activate.ps1`
6. `pip install -r requirements.txt`
7. `python ./cli.py`
To build (using the build scripts):
1. You'll need `portaudio_x64.dll`: build from source here using docs here, or download here.
2. Clone (or download and extract) Tacspeak into `Tacspeak/`.
3. Open the `Tacspeak/` folder in PowerShell; keep it as your current working directory.
4. Create and activate a virtual environment `./.venv`, e.g.
   - in the `Tacspeak` folder: `python -m venv "./.venv"`
   - in the `Tacspeak` folder: `./.venv/Scripts/Activate.ps1`
5. Run `scripts\setup_for_build.ps1` in PowerShell. This will download and install dependencies by running further setup scripts.
6. Run `scripts\build_app.ps1` in PowerShell.

To build manually:
1. Download and extract Tacspeak into `Tacspeak/`.
2. Download and extract the pre-trained model into `Tacspeak/kaldi_model/`.
3. Open the `Tacspeak/` folder in PowerShell (or equivalent).
4. Create a virtual environment in the `Tacspeak` folder: `python -m venv "./.venv"`
5. Activate the virtual environment in the `Tacspeak` folder: `./.venv/Scripts/Activate.ps1`
6. `pip install -r requirements.txt`
7. Rename `portaudio_x64.dll` to `libportaudio64bit.dll`, then copy and paste it, overwriting the existing file located at `./venv/Lib/site-packages/_sounddevice_data/portaudio-binaries/libportaudio64bit.dll`.
8. `python setup.py build`
I built Tacspeak because I was fed up with how inaccurate the Windows Speech Recognition engine was with my voice, even after training. No other alternatives I tested (there were many) fit exactly what I wanted from speech recognition while gaming.
From my research and testing, I learned that Tacspeak isn't perfect, but it is a very strong option, precisely because it can be so highly customised to your specific commands, for your specific application.
Issues, suggestions, and feature requests are welcome.
Pull requests are considered, but be warned the project structure is in flux and there may be breaking changes to come.
We'd also like some (TBD) quality testing to be done on grammar modules before they're brought into the project. If you can help define what we mean by "some (TBD) quality testing"... well, trailblazers are welcome!
Tacspeak uses a modified version of Dragonfly located at jwebmeister/dragonfly. This is where the heart of the beast (bugs) lives... please help slay it!
Also, be warned the project structure is in flux and there may be breaking changes there too.
You can also consider supporting the projects Tacspeak is built upon, dictation-toolbox/dragonfly and daanzu/kaldi-active-grammar.
Any and all donations are very much appreciated and help encourage development.
This project is licensed under the GNU Affero General Public License v3 (AGPL-3.0-or-later). See the LICENSE.txt file for details.