jwebmeister / tacspeak

Tacspeak - Fast, lightweight, modular speech recognition for gaming
GNU Affero General Public License v3.0

Audio output stops after application runs #2

Closed: ItStewball closed this issue 9 months ago

ItStewball commented 9 months ago

Unsure what causes it, but when you run Tacspeak, all audio output going through your device is cut off. You then need to either restart the affected applications or cycle the output devices to re-enable their audio.

Is this a known issue or is this more of a software/hardware fault that can be easily resolved with some tinkering with software?

jwebmeister commented 9 months ago

It does seem to cycle the audio device, but I haven't had it fail to automatically restart / resume audio.

Not sure it'll affect anything, but when I'm using Tacspeak with Ready or Not, I typically run the game first in Fullscreen mode, then alt-tab, open a Terminal / PowerShell window, and run tacspeak.exe there. You could also experiment to see whether running in Fullscreen instead of Borderless Window (or vice versa) affects anything, particularly if you alt-tab from the game (though I haven't experienced any issues using alt-tab myself).

In early-access I've had Ready or Not audio fail (or partially fail) to start without Tacspeak running, i.e. a game bug. Do you think it's a game bug, or are you confident it's Tacspeak?

jwebmeister commented 9 months ago

I've released v0.1.1 of Tacspeak, which adds a new argument. Run ./tacspeak.exe --print_mic_list in PowerShell or Command Prompt to print a list of the audio devices on your system.

Can you have a look at which entry in the list is your input device and what its interface type is? The input device is marked with > at the start of its entry, and each entry has the form: device index, name, interface type (# in, # out).

e.g. "> 1 VoiceMeeter Output (VB-Audio Vo, MME (2 in, 0 out)" - MME is the interface type
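For reference, an entry in that listing can be taken apart mechanically. The sketch below is a minimal Python example, assuming every entry follows the shape of the example above (optional leading marker, index, device name, interface type, channel counts). `parse_device_line` and `DEVICE_LINE` are hypothetical names for illustration, not part of Tacspeak.

```python
import re
from typing import Optional

# Assumed line shape, taken from the example above:
#   "> 1 VoiceMeeter Output (VB-Audio Vo, MME (2 in, 0 out)"
DEVICE_LINE = re.compile(
    r"^\s*(?P<marker>[<>])?\s*"        # optional leading marker such as ">"
    r"(?P<index>\d+)\s+"               # device index
    r"(?P<name>.+),\s+"                # device name (may contain commas or parentheses)
    r"(?P<interface>[^,(]+?)\s+"       # interface type, e.g. MME
    r"\((?P<n_in>\d+) in, (?P<n_out>\d+) out\)\s*$"
)


def parse_device_line(line: str) -> Optional[dict]:
    """Split one device-list line into its fields, or return None if it does not match."""
    match = DEVICE_LINE.match(line)
    if match is None:
        return None
    return {
        "marker": match["marker"],
        "index": int(match["index"]),
        "name": match["name"],
        "interface": match["interface"],
        "inputs": int(match["n_in"]),
        "outputs": int(match["n_out"]),
    }


example = parse_device_line("> 1 VoiceMeeter Output (VB-Audio Vo, MME (2 in, 0 out)")
print(example["interface"])
```

The field worth reporting back here is `interface`, since the question is whether the microphone is being opened through MME or through another interface.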

ItStewball commented 9 months ago

How would I run the line of code with the application?

I don't have a lot of knowledge of debugging software, and I'm unsure where in PowerShell or Command Prompt I need to do it.

Pasting the code directly into a normal PowerShell/cmd window does nothing; it only returns an error stating that it does not recognise the command (which I think is expected).

jwebmeister commented 9 months ago

First, try going through all the steps in the troubleshooting section of the README.md here on GitHub.

Second:

  1. Download the 0.1.1 release and extract the Tacspeak application and files.
  2. Extract the pre-trained model, with its folder “kaldi_model”, into the same folder.
  3. Open PowerShell in the folder containing “tacspeak.exe” (or navigate there via ‘cd c:\tacspeak\’ or similar).
  4. Run the command ‘./tacspeak.exe --print_mic_list’ in PowerShell.

ItStewball commented 9 months ago

Thank you for updating and informing me how to use powershell to get the details

The inputs and outputs for my device are

> 1 Microphone (Razer Seiren Emote), MME (2 in, 0 out)
< 3 Headphones (7- Arctis 7 Game), MME (0 in, 2 out)

There are other devices connected to my system (38 of them), but they're all disabled.

The console with Debug Mode shows the following (This is using version 0.1.2, the latest release as of sending this comment)

engine (INFO): Initialized 'kaldi' SR engine: KaldiEngine().
engine (INFO): Loading Kaldi-Active-Grammar v3.1.0 in process 23464.
engine (INFO): Kaldi options: {'model_dir': None, 'tmp_dir': None, 'audio_input_device': None, 'audio_self_threaded': True, 'audio_auto_reconnect': True, 'audio_reconnect_callback': None, 'retain_dir': None, 'retain_audio': False, 'retain_metadata': False, 'retain_approval_func': None, 'vad_aggressiveness': 3, 'vad_padding_start_ms': 150, 'vad_padding_end_ms': 250, 'vad_complex_padding_end_ms': 600, 'auto_add_to_user_lexicon': False, 'allow_online_pronunciations': False, 'lazy_compilation': True, 'invalidate_cache': False, 'expected_error_rate_threshold': None, 'alternative_dictation': None, 'compiler_init_config': {}, 'decoder_init_config': {}, 'listen_key': 5, 'listen_key_toggle': -1}
engine (INFO): streaming audio from 'Microphone (Razer Seiren Emote)' using MME: 16000 sample_rate, 10 block_duration_ms, 30 latency_ms
directory (INFO): Looking for command modules here: C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar
directory (INFO): Valid paths: C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar\_readyornot.py
module (INFO): CommandModule('_readyornot.py'): Loading module: 'C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar\_readyornot.py'
-- Ready or Not keybindings --
gold:'debug_print_key'()
blue:'debug_print_key'()
red:'debug_print_key'()
alpha:'debug_print_key'()
bravo:'debug_print_key'()
charlie:'debug_print_key'()
delta:'debug_print_key'()
cmd_1:'debug_print_key'()
cmd_2:'debug_print_key'()
cmd_3:'debug_print_key'()
cmd_4:'debug_print_key'()
cmd_5:'debug_print_key'()
cmd_6:'debug_print_key'()
cmd_7:'debug_print_key'()
cmd_8:'debug_print_key'()
cmd_9:'debug_print_key'()
cmd_back:'debug_print_key'()
cmd_hold:'debug_print_key'()
cmd_default:'debug_print_key'()
cmd_menu:'debug_print_key'()
interact:'debug_print_key'()
yell:'debug_print_key'()
-- Ready or Not keybindings --
engine (INFO): Loading grammar ReadyOrNot
engine (INFO): Loading grammar ReadyOrNot_priority
engine (INFO): Loading grammar _recobs_grammar
Ready to listen...
engine (INFO): Listening...
engine (INFO): Cold mic

If you wish to see my user settings, this is what I have

(The only line changed in the settings was DEBUG_MODE)

DEBUG_MODE = True
DEBUG_HEAVY_DUMP_GRAMMAR = False # expensive on memory, don't set this to True unless you're sure
KALDI_ENGINE_SETTINGS = {
    "listen_key":0x05, # 0x10=SHIFT key, 0x05=X1 mouse button, 0x06=X2 mouse button, see https://learn.microsoft.com/en-us/windows/win32/inputdev/virtual-key-codes
    "listen_key_toggle":-1, # Recommended is 0 or -1. 0 for toggle mode off; 1 for toggle mode on; 2 for global toggle on (use VAD); -1 for toggle mode off but allow priority grammar even when key not pressed
    "vad_padding_end_ms":250, # ms of required silence after VAD
    "auto_add_to_user_lexicon":False, # this requires g2p_en (which isn't installed by default)
    "allow_online_pronunciations":False,

    "input_device_index":None, # set to an int to choose a non-default microphone

    # "vad_aggressiveness":3, # default aggressiveness of VAD
    # "vad_padding_start_ms":150, # default ms of required silence before VAD
    # "model_dir":'kaldi_model', # default model directory
    # "tmp_dir":None,
    # "audio_input_device":None,
    # "audio_self_threaded":True,
    # "audio_auto_reconnect":True,
    # "audio_reconnect_callback":None,
    # "retain_dir":None, # set to a writable directory path to retain recognition metadata and/or audio data
    # "retain_audio":None, # set to True to retain speech data wave files in the retain_dir (if set)
    # "retain_metadata":None,
    # "retain_approval_func":None,
    # "vad_complex_padding_end_ms":600, # default ms of required silence after VAD for complex utterances
    # "lazy_compilation":True, # set to True to parallelize & speed up loading
    # "invalidate_cache":False,
    # "expected_error_rate_threshold":None,
    # "alternative_dictation":None,
    # "compiler_init_config":None,
    # "decoder_init_config":None,
}

When I tried to run the application, it seemed to run as intended once, but then the same issue came back. I'll see if I can get the software to function as intended, but this will take a bit of time.

jwebmeister commented 9 months ago

Can you please test something (it’s stupid, but) - if you run tacspeak.exe from powershell or command prompt, do you still get the same issue? As in open powershell or command prompt, “cd tacspeak_app_dir”, “./tacspeak.exe”.

Another thing you can try is setting “input_device_index” in “user_settings.py” to the corresponding index number for the same input device name but a different interface (if one exists), e.g. <input_device_index> Microphone (Razer Seiren Emote), <not MME interface> (2 in, 0 out). Make sure to uncomment the line (remove the preceding # in front of “input_device_index”)
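The lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: `find_alternative_index` is a hypothetical helper (not part of Tacspeak), the device list is hand-written rather than read from the system, and names are compared by prefix on the assumption that some interfaces may shorten device names.

```python
from typing import Optional


def find_alternative_index(devices: list, name: str, avoid: str = "MME") -> Optional[int]:
    """Return the index of an input device matching `name` on an interface other than `avoid`."""
    for dev in devices:
        # Prefix comparison in both directions, in case one listing shortens the name (assumption).
        same_device = dev["name"].startswith(name) or name.startswith(dev["name"])
        if same_device and dev["inputs"] > 0 and dev["interface"] != avoid:
            return dev["index"]
    return None


# Entries 1 and 3 come from the listing earlier in this thread.
# Entry 12 is hypothetical and only illustrates a second interface for the same microphone.
devices = [
    {"index": 1, "name": "Microphone (Razer Seiren Emote)", "interface": "MME", "inputs": 2},
    {"index": 3, "name": "Headphones (7- Arctis 7 Game)", "interface": "MME", "inputs": 0},
    {"index": 12, "name": "Microphone (Razer Seiren Emote)", "interface": "Windows WASAPI", "inputs": 2},
]

index = find_alternative_index(devices, "Microphone (Razer Seiren Emote)")
print(index)  # the value to try for "input_device_index" in user_settings.py
```

If no other interface exposes the same microphone, the helper returns `None` and there is nothing to change in the settings.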

ItStewball commented 9 months ago

When attempting to run Tacspeak through CMD or PowerShell using the line below:

C:\Windows\System32>C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak.exe

It gives this output

Tacspeak version 0.1.2

Tacspeak - speech recognition for gaming
© Copyright 2023 by Joshua Webb

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

CommandModule('user_settings.py'): Error loading module: [Errno 2] No such file or directory: 'C:\Windows\System32\tacspeak\user_settings.py'
Traceback (most recent call last):
  File "C:\dev\tacspeak\.venv\Lib\site-packages\dragonfly\loader.py", line 77, in load
  File "<frozen importlib._bootstrap_external>", line 936, in exec_module
  File "<frozen importlib._bootstrap_external>", line 1073, in get_code
  File "<frozen importlib._bootstrap_external>", line 1130, in get_data
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Windows\System32\tacspeak\user_settings.py'
Failed to load tacspeak/user_settings.py DEBUG_MODE. Using default settings as fallback.
Failed to load tacspeak/user_settings.py DEBUG_HEAVY_DUMP_GRAMMAR. Using default settings as fallback.
Failed to load tacspeak/user_settings.py KALDI_ENGINE_SETTINGS. Using default settings as fallback.
engine (INFO): Initialized 'kaldi' SR engine: KaldiEngine().
engine (INFO): Loading Kaldi-Active-Grammar v3.1.0 in process 24952.
engine (INFO): Kaldi options: {'model_dir': None, 'tmp_dir': None, 'audio_input_device': None, 'audio_self_threaded': True, 'audio_auto_reconnect': True, 'audio_reconnect_callback': None, 'retain_dir': None, 'retain_audio': False, 'retain_metadata': False, 'retain_approval_func': None, 'vad_aggressiveness': 3, 'vad_padding_start_ms': 150, 'vad_padding_end_ms': 250, 'vad_complex_padding_end_ms': 600, 'auto_add_to_user_lexicon': False, 'allow_online_pronunciations': False, 'lazy_compilation': True, 'invalidate_cache': False, 'expected_error_rate_threshold': None, 'alternative_dictation': None, 'compiler_init_config': {}, 'decoder_init_config': {}, 'listen_key': 16, 'listen_key_toggle': 0}
Traceback (most recent call last):
  File "C:\dev\tacspeak\.venv\Lib\site-packages\cx_Freeze\initscripts\__startup__.py", line 124, in run
  File "C:\dev\tacspeak\.venv\Lib\site-packages\cx_Freeze\initscripts\console.py", line 16, in run
  File "cli.py", line 59, in <module>
  File "cli.py", line 36, in main
  File "C:\dev\tacspeak\tacspeak\__main__.py", line 115, in main
  File "C:\dev\tacspeak\.venv\Lib\site-packages\dragonfly\engines\backend_kaldi\engine.py", line 196, in connect
  File "C:\dev\tacspeak\.venv\Lib\site-packages\dragonfly\engines\backend_kaldi\compiler.py", line 71, in __init__
  File "C:\dev\tacspeak\.venv\Lib\site-packages\kaldi_active_grammar\compiler.py", line 266, in __init__
  File "C:\dev\tacspeak\.venv\Lib\site-packages\kaldi_active_grammar\model.py", line 192, in __init__
kaldi_active_grammar.KaldiError: cannot find model_dir: 'kaldi_model\'

The Kaldi model does exist and it functions as intended if I run the software through the executable directly
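The output above hints at the likely cause: both tacspeak\user_settings.py and kaldi_model\ were looked up under C:\Windows\System32, the directory the prompt was in. That is consistent with relative paths being resolved against the current working directory rather than the folder containing tacspeak.exe. A minimal Python sketch of that behaviour, using a temporary, made-up folder layout (`resolve_model_dir` is a hypothetical helper, not Tacspeak code):

```python
import os
import tempfile


def resolve_model_dir(model_dir: str, base_dir: str) -> str:
    """Resolve a relative model_dir against base_dir, as happens with the working directory."""
    return os.path.normpath(os.path.join(base_dir, model_dir))


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: the model folder sits next to the application only.
    app_dir = os.path.join(root, "tacspeak_app")
    other_dir = os.path.join(root, "elsewhere")
    os.makedirs(os.path.join(app_dir, "kaldi_model"))
    os.makedirs(other_dir)

    # The same relative name is found or missed depending on the starting directory.
    found_from_app = os.path.isdir(resolve_model_dir("kaldi_model", app_dir))
    found_from_other = os.path.isdir(resolve_model_dir("kaldi_model", other_dir))

print(found_from_app, found_from_other)
```

Changing into the application folder with cd before running ./tacspeak.exe, as suggested earlier in the thread, makes the working directory and the application folder the same.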

When setting "input_device_index", it gives me the output "engine (WARNING): KaldiEngine(): input_device_index is deprecated; please use audio_input_device". This doesn't cause the software to crash, but I thought it was worth mentioning.

the output the console gives when I set my input_device_index to my microphone is

engine (INFO): Initialized 'kaldi' SR engine: KaldiEngine().
engine (INFO): Loading Kaldi-Active-Grammar v3.1.0 in process 1288.
engine (INFO): Kaldi options: {'model_dir': None, 'tmp_dir': None, 'audio_input_device': 1, 'audio_self_threaded': True, 'audio_auto_reconnect': True, 'audio_reconnect_callback': None, 'retain_dir': None, 'retain_audio': False, 'retain_metadata': False, 'retain_approval_func': None, 'vad_aggressiveness': 3, 'vad_padding_start_ms': 150, 'vad_padding_end_ms': 250, 'vad_complex_padding_end_ms': 600, 'auto_add_to_user_lexicon': False, 'allow_online_pronunciations': False, 'lazy_compilation': True, 'invalidate_cache': False, 'expected_error_rate_threshold': None, 'alternative_dictation': None, 'compiler_init_config': {}, 'decoder_init_config': {}, 'listen_key': 5, 'listen_key_toggle': 0}
engine (INFO): streaming audio from 'Microphone (Razer Seiren Emote)' using MME: 16000 sample_rate, 10 block_duration_ms, 30 latency_ms
directory (INFO): Looking for command modules here: C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar
directory (INFO): Valid paths: C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar\_readyornot.py
module (INFO): CommandModule('_readyornot.py'): Loading module: 'C:\Users\matthew\Downloads\tacspeak_0.1.2\tacspeak\grammar\_readyornot.py'

The microphone works perfectly, it's just output audio that is continuing to cut off. The best way to resolve it for me was to disable the audio device and then re-enable it

jwebmeister commented 9 months ago

From a user on nexusmods:

I solved this problem by unchecking the "Give exclusive mode applications priority" checkbox in the "Advanced" panel of the properties window for the output device I'm currently using. Hope this helps others who have the same issue.

Follow these steps to disable Exclusive Mode.

  1. Right-click the Speaker icon on the Windows toolbar, and select Open Sound settings.
  2. Click Device properties located underneath Choose your output device, then click Additional device properties located underneath Related Settings.
  3. In the Line Properties window, click the Advanced tab, then uncheck Allow applications to take exclusive control of this device.
  4. Click Apply, then click OK.

ItStewball commented 9 months ago

Hey, that seemed to fix it!

Thank you for helping me fix the issue!