jwebmeister / tacspeak

Tacspeak - Fast, lightweight, modular speech recognition for gaming
GNU Affero General Public License v3.0
44 stars 2 forks source link

Dynamic Phrase Loading, User Settings Error Handling, and Documentation Enhancements #5

Open Kapparina opened 9 months ago

Kapparina commented 9 months ago

This pull request introduces several key improvements, feature enhancements, and documentation extension efforts:

  1. Dynamic Phrase Loading:

    • Implemented dynamic loading of phrase mappings from phrases.toml file across several commands in tacspeak/grammar/_readyornot.py, replacing the previous hardcoded mapping dictionaries. This effort was facilitated by the addition of two new TOML files - phrases.toml and phrases_wip.toml.
    • Added tomllib and pathlib.Path imports to support this change.
  2. Improved User Settings Error Handling:

    • Improved error handling while loading tacspeak/user_settings.py in tacspeak/main.py. Now, for each attribute in user settings, specific error messages are shown, facilitating quicker identification and rectification of errors.
  3. Enhancements in Door Handling Functionality:

    • Introduced a new attribute 'trapped' in the cmd_door_options() function in tacspeak/grammar/_readyornot.py to enable door trapping options during gameplay.
  4. Documentation and Other Updates:

    • The README.md document was extended to include a download section for GitHub, Nexus Mods, and a link to join the project's Discord server. Detailed extraction instructions for Tacspeak and Kaldi model were added, and Microsoft Visual C++ Redistributable was added as a prerequisite.
    • Tacspeak's version update from 0.1.1 to 0.1.2 was reflected in tacspeak/__init__.py.
    • An unnecessary blank line was removed from setup.py.
    • A print statement to display the Tacspeak version was included in cli.py.
jwebmeister commented 9 months ago

I'm unsure if we should incorporate "Dynamic Phrase Loading" for the grammar modules with just the current scope.

Pros:

Cons:

Currently I'm failing to be imaginative (or be actually helpful, sorry) and think it's better to keep it as a simple single file, for now. But, what are your thoughts on the points I've raised?
Is it actually simple / possible / worth it to expand the scope and extract everything (within reason) that the user would want to customise into a data only file?

P.S. user_settings.py should definitely go (behind the woodshed and be shot, at some point) and be replaced with a data file.

P.P.S. there are a few more changes to the mappings / phrases that I made on "main" branch on the basis of improving speech recognition. One important one is "hold" being removed from the "hold" command (ironic right?), so it's now only "on my command" (or variations).

Kapparina commented 9 months ago

I'm unsure if we should incorporate "Dynamic Phrase Loading" for the grammar modules with just the current scope.

Pros:

  • Separate data from logic (good in general!)

I'm glad we're aligned on this.

  • Makes it easier for user to customise parts of the module, with less chance of user error
  • Makes it possible to develop a more user friendly interface for users to customise the module (do I dare dream of a GUI!?)

That was my goal: to strike a balance between easy to use and easy to customise - on the topic of GUIs, I have ample experience with libraries like PyQt/PySide6, so that's not off of the table as far as I'm concerned.

Cons:

  • Only partial separation of data, in terms of things the user would likely want to customise. Examples of what I think also needs to be considered (not an exhaustive list):

    • the "spec" (i.e. full sentence structure, or the full recognised spoken phrase structure), of an existing command
    • within the "spec", which "extras" (i.e. choices, or variables) are optional and/or included
    • the default values of "extras"
    • adding in a new command or function (well obviously, but I'm listing it here anyway!)

You make some valid points here. Let's give it some thought and see what we can come up with. My first thought is perhaps having generic filler words ("the", "a", "that", "this", etc.) be accounted for by the grammar model, and having keywords be the only thing a user needs to specify. Allowing users to specify which keywords are optional and which aren't is obviously important too.

  • Added complexity for certain types of user customisation

    • User has to go to and modify multiple files for certain types of customisation
    • Divergence from Dragonfly docs

As it stands, having everything in one file is definitely simpler from one perspective, but it raises the barrier for entry somewhat. I think separating functions into different files (e.g. user_settings.toml and config.py for loading user preferences, grammar.py for loading the grammar model, etc.) could work; I'm thinking we could leverage Pydantic for this.

Regarding divergence from Dragonfly's documentation - this project already uses a custom variant of Dragonfly, no? Whilst I believe Dragonfly's documentation is a good starting point, divergence is nothing documentation can't solve.

Currently I'm failing to be imaginative (or be actually helpful, sorry) and think it's better to keep it as a simple single file, for now. But, what are your thoughts on the points I've raised? Is it actually simple / possible / worth it to expand the scope and extract everything (within reason) that the user would want to customise into a data only file?

Someone has to play devil's advocate, and unwillingness to challenge something is tantamount to negligence. Let's continue exploring our options, and see where we can take this.

P.S. user_settings.py should definitely go (behind the woodshed and be shot, at some point) and be replaced with a data file.

I'll show Lenny the rabbits.

P.P.S. there are a few more changes to the mappings / phrases that I made on "main" branch on the basis of improving speech recognition. One important one is "hold" being removed from the "hold" command (ironic right?), so it's now only "on my command" (or variations).

I'd noticed some changes had been made, I'll update my fork accordingly.

Dynamic phrase loading will be moved to another branch for now, as it will require a large refactor, especially if we're considering a GUI/abstracting customisation away from the source code. In the interim, I'll put some focus into readability and maintainability and create a separate PR for that.