erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI; however, it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and WAV file maintenance. It can also be used with third-party software via JSON calls.
GNU Affero General Public License v3.0

Adding config classes and tests #395

Closed Paladinium closed 1 week ago

Paladinium commented 3 weeks ago

Adding config abstraction for simpler and type-safe config handling.

Paladinium commented 3 weeks ago

Hi @erew123 : As you know, I would like to add some config options for RVC. I realized that this isn't as simple as I expected. It looks like most of the config is loaded in tts_server.py and then passed via arguments to further functions as needed.

Here is the beginning of a refactoring where I propose to use config classes with fields for each config option to provide some kind of type safety. Loading (and later saving) of settings is abstracted. The idea is that wherever you are in the code, you can simply use e.g. AlltalkConfig.get_instance().branding to access the branding config.

The first commit just consists of outlining the config classes and dealing with loading. I also didn't try to enhance the property types (e.g. changing tgwui_narrator_enabled from string to bool) or introduce enums. Before I refactor a larger part of the code, I would like to hear your thoughts on this approach in general.
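
As an aside on the typed-property idea above, here is a minimal sketch (not code from this PR; NarratorMode and TgwuiSettings are illustrative names only) of how an enum-backed dataclass field could replace a stringly-typed flag such as tgwui_narrator_enabled:

from dataclasses import dataclass
from enum import Enum

class NarratorMode(str, Enum):
    ENABLED = "true"
    DISABLED = "false"

@dataclass
class TgwuiSettings:
    # Typed field instead of the raw "true"/"false" string stored in confignew.json
    tgwui_narrator_enabled: NarratorMode = NarratorMode.DISABLED
    tgwui_autoplay_tts: bool = True

settings = TgwuiSettings()
print(settings.tgwui_narrator_enabled is NarratorMode.DISABLED)  # True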

erew123 commented 3 weeks ago

Hi @Paladinium

That certainly would be a hell of a task to get through. It is fair to say that the code has become somewhat unwieldy in places. My original goal (back on v1) was to try and make the code readable/accessible for any level of coder, in the hope that people would want to dive in and add/modify code and hopefully build on the project, but it has grown a little, shall we say, organically.

I can't specifically think of any downsides to refactoring and using central loading/management for the config, and it would be great if you want to take a shot at getting through this.

The things I can think of that are touching the config file are:

- script.py (most of the reading/updating of the config file)
- tts_server.py (mostly reading the config file, but can update it from the web page)
- \system\config\firstrun.py (flags firstrun_model to false)
- \system\config\firstrun_tgwui.py (flags firstrun_model to false)
- \system\gradio_pages\alltalk_welcome.py (flags firstrun_splash to false)

One additional change I have added into the config file is gradio_pages, which is basically just show/hide pages from the gradio interface https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L2058-L2067 https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L2250-L2263

    "gradio_pages": {
        "Generate_Help_page": false,
        "Voice2RVC_page": true,
        "TTS_Generator_page": false,
        "TTS_Engines_Settings_page": true,
        "alltalk_documentation_page": false,
        "api_documentation_page": true
    }

I popped that in a week or so ago.

I'm guessing you have figured out that script.py and tts_server.py run in separate Python instances, and that tts_server.py is reloaded when you swap the TTS engine or potentially its model file. So if you are up for the task of refactoring, the only things I can think of would be: it may be worth having a file lock/check to see if the file was updated elsewhere, and to ensure multiple processes don't hit the file at the same time (unlikely, but I guess there's no harm in doing something like that).

Check the config file every X seconds to see if it has been modified by another process (like tts_server.py modifying it while script.py is running). Compare the last modified time and reload if needed.

def __check_and_reload(self):
    """Check if config file has been modified and reload if needed"""
    current_time = time.time()
    if current_time - self.__last_read_time >= self.__file_check_interval:
        config_path = self.__this_dir / "confignew.json"
        try:
            mtime = config_path.stat().st_mtime
            if mtime > self.__last_read_time:
                self.__load_config()
                self.__last_read_time = current_time
        except Exception as e:
            print(f"Error checking config file: {e}")

Lock the config file when multiple processes might try to write at once, and maybe back up too?

    def save(self):
        """Save config with file locking to prevent race conditions"""
        config_path = self.__this_dir / "confignew.json"
        lock_path = config_path.with_suffix('.lock')
        backup_path = None

        with FileLock(lock_path):
            # Create backup if we think that's sensible
            if config_path.exists():
                backup_path = config_path.with_suffix('.backup')
                shutil.copy(config_path, backup_path)

            try:
                data = {
                    "branding": self.branding,
                    "delete_output_wavs": self.delete_output_wavs,
                    "rvc_settings": dataclasses.asdict(self.rvc_settings),
                    # ... other settings ...
                }

                with open(config_path, "w") as f:
                    json.dump(data, f, indent=4)

            except Exception as e:
                # Restore the backup (if one was made) before re-raising
                if backup_path and backup_path.exists():
                    shutil.copy(backup_path, config_path)
                raise Exception(f"Failed to save config: {e}")

Fuller example.....

from pathlib import Path
import json
import time
import shutil
import dataclasses
from filelock import FileLock
from dataclasses import dataclass

@dataclass
class AlltalkConfigRvcSettings:
    rvc_enabled: bool = False
    rvc_char_model_file: str = "Disabled"
    pitch: int = 0
    # ... other RVC settings ...

    def validate(self):
        """Validate RVC settings"""
        if not -24 <= self.pitch <= 24:
            raise ValueError("Pitch must be between -24 and 24")
        # ... other validations ...

class AlltalkConfig:
    __instance = None
    __last_read_time = 0  # Track when we last read the file
    __file_check_interval = 1  # Check file every second
    __this_dir = Path(__file__).parent.resolve()

    def __init__(self):
        self.branding = ""
        self.delete_output_wavs = ""
        self.rvc_settings = AlltalkConfigRvcSettings()
        # ... other settings ...

    @staticmethod
    def get_instance():
        if AlltalkConfig.__instance is None:
            AlltalkConfig.__instance = AlltalkConfig()
            AlltalkConfig.__instance.__load_config()
        else:
            # Check if config file has been modified
            AlltalkConfig.__instance.__check_and_reload()
        return AlltalkConfig.__instance

    def __check_and_reload(self):
        """Check if config file has been modified and reload if needed"""
        current_time = time.time()
        if current_time - self.__last_read_time >= self.__file_check_interval:
            config_path = self.__this_dir / "confignew.json"
            try:
                mtime = config_path.stat().st_mtime
                if mtime > self.__last_read_time:
                    self.__load_config()
                    self.__last_read_time = current_time
            except Exception as e:
                print(f"Error checking config file: {e}")

    def __load_config(self):
        """Load config from JSON file"""
        configfile_path = self.__this_dir / "confignew.json"
        try:
            with open(configfile_path, "r") as configfile:
                data = json.load(configfile)

            # Copy fields into typed classes
            self.branding = data.get("branding", "AllTalk ")
            self.delete_output_wavs = data.get("delete_output_wavs", "Disabled")

            # Load RVC settings into dataclass
            rvc_data = data.get("rvc_settings", {})
            self.rvc_settings = AlltalkConfigRvcSettings(
                rvc_enabled=rvc_data.get("rvc_enabled", False),
                rvc_char_model_file=rvc_data.get("rvc_char_model_file", "Disabled"),
                pitch=rvc_data.get("pitch", 0),
            )
            self.rvc_settings.validate()

        except Exception as e:
            print(f"Error loading config: {e}")

    def reload(self):
        """Force a reload of the config from file"""
        self.__load_config()
        self.__last_read_time = time.time()

    def save(self):
        """Save config with file locking to prevent race conditions"""
        config_path = self.__this_dir / "confignew.json"
        lock_path = config_path.with_suffix('.lock')
        backup_path = None

        with FileLock(lock_path):
            # Create backup
            if config_path.exists():
                backup_path = config_path.with_suffix('.backup')
                shutil.copy(config_path, backup_path)

            try:
                data = {
                    "branding": self.branding,
                    "delete_output_wavs": self.delete_output_wavs,
                    "rvc_settings": dataclasses.asdict(self.rvc_settings),
                    # ... other settings ...
                }

                with open(config_path, "w") as f:
                    json.dump(data, f, indent=4)

            except Exception as e:
                # Restore the backup (if one was made) before re-raising
                if backup_path and backup_path.exists():
                    shutil.copy(backup_path, config_path)
                raise Exception(f"Failed to save config: {e}")

I've never refactored in this way, so I'm fully open to your thoughts/guidance/code on this. I like the idea though; it would tidy up the code and help us move forward.

On a personal front, I am still backwards and forwards taking care of my family member, so will be on/off on here. That's going to remain an ongoing situation for a while.

Currently I'm working on adding new TTS engines from the Features Request list https://github.com/erew123/alltalk_tts/discussions/74 (F5-TTS is done apart from some documentation, but that won't affect anything you are doing or touch the tts_server.py/script.py files). Otherwise, when I can't code etc., I'm potentially working on adding to the WIKI, so again that shouldn't affect anything you are looking at. Finally, I will be tackling the update to a later version of PyTorch, which has been complicated a little by Microsoft changing how DeepSpeed is compiled/its requirements, so that's a slower work in progress, but again, it shouldn't affect anything you are looking at.

Not sure what you think of my suggestions? What would you like me to do re this merge, just pull it in so you can work on things?

Thanks again for your input and help working on AllTalk :) and sorry for a long, rambling response.

Paladinium commented 3 weeks ago

@erew123 : Thanks for your input, I like it! And thanks for the hint about the new config options.

I wasn't aware of what script.py is doing, but it's a lot. I will first try to refactor the code except for script.py and then reach out to you for the rest.

Paladinium commented 3 weeks ago

@erew123 : Just for my understanding: why are the new gradio_pages settings not visible in confignew.json?

erew123 commented 3 weeks ago

@Paladinium Ah good point. I was being semi-lazy (or clever, not sure which) when I did this. I just made it create them dynamically when someone first changed them.

https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L1674-L1686

I've dropped an update into the standard config file for you, so they will be there now https://github.com/erew123/alltalk_tts/commit/c2e1095bb86572f2de1530f54d1d8138a92a9247
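
To illustrate the dynamic-creation behaviour described above (a sketch only, not the actual script.py code; the helper name is made up), the pattern is roughly: create the gradio_pages block the first time a toggle is changed, then write the file back:

import json
from pathlib import Path

def set_gradio_page_visibility(config_path: Path, page_key: str, visible: bool) -> None:
    """Sketch only: create the gradio_pages section on first use, then store the toggle."""
    with open(config_path, "r") as f:
        data = json.load(f)
    pages = data.setdefault("gradio_pages", {})  # created dynamically on first change
    pages[page_key] = visible
    with open(config_path, "w") as f:
        json.dump(data, f, indent=4)

# Example: hide the TTS Generator page
# set_gradio_page_visibility(Path("confignew.json"), "TTS_Generator_page", False)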

Thanks

Paladinium commented 3 weeks ago

@erew123 : With the first three commits, I tried to incorporate all your suggestions. You may want to check whether this matches your ideas before I start rolling out the new classes to the code base.

Since you are busy, I don't expect a very long comment. Just let me know where you see room for improvements.

Paladinium commented 3 weeks ago

As a side note, I also wrote a unit test. This helped me a lot to avoid breaking the code when adding functionality...
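
For readers following along, here is a hypothetical sketch of what such a test might look like (the actual test in the PR is not shown here; it assumes the config classes live in config.py and that AlltalkConfig accepts a config path, as in the code later in this thread):

# test_config.py - hypothetical pytest sketch, not the test from the PR
import json
from config import AlltalkConfig

def test_missing_fields_fall_back_to_defaults(tmp_path):
    # Minimal config: most fields are deliberately missing
    config_file = tmp_path / "confignew.json"
    config_file.write_text(json.dumps({
        "branding": "AllTalk ",
        "theme": {"file": None, "class": ""},
        "rvc_settings": {"pitch": 3},
    }))

    config = AlltalkConfig(config_file)

    assert config.branding == "AllTalk "
    assert config.rvc_settings.pitch == 3
    # Fields absent from the JSON keep their dataclass defaults
    assert config.rvc_settings.rvc_enabled is False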

Paladinium commented 2 weeks ago

@erew123 : I am quite far along with refactoring tts_server.py. However, I stumbled over the backup logic implemented in:

From what I see, the backup logic is quite complex:

While I see the value of such behavior, I also wonder whether it is really necessary to handle all cases in such detail. For example, most products just fail if the config is invalid.

If it were up to me, I would create and use a backup only on save, which is exactly what you proposed in a previous comment. This deals with the most important case. However, it fails if the written file is invalid JSON (which I think is acceptable).

Do you have a strong opinion about this or can I move ahead using the simplified backup/restore mechanism instead?

erew123 commented 2 weeks ago

Hi @Paladinium Sorry for not getting back to you yet. I've been heavily working on the WIKI, dealing with general support requests and tightening up a few things in other bits of code, so I've not looked over the work you've done yet.

I've just written probably a bit too much information below, but there is the code for all the backups, annotated, and there is a full flow chart to go with it. I am easy with whatever you want to do, though below I will explain why the backups are complicated, plus some other information you may have figured out or be missing. The flow chart might be good to look at before the code.

Here is the explainer I wrote before writing the above

So the backup routine I only put in recently, in the last month or so. I put it in because some people complained that their config file(s) were being corrupted somehow, yet no one would give me a copy of their file, explain what they could have been doing, or anything like that. Likewise, I couldn't find any good reason that their files would be corrupted, so without anything to go on, I just decided to go overboard on config file backups/restoration etc. Let me try to fill you in on a couple of other bits:

Upgrade/downgrade confignew.json file & backup the file

Links to the two files are in this folder: https://github.com/erew123/alltalk_tts/tree/alltalkbeta/system/config

In theory this can be used to add/remove items from people's confignew.json files. confignew.json is excluded from being updated by a git pull, and I wanted a routine so that, should there ever be a need to add/remove something from people's config files, it could be done centrally by adding something to either at_configdowngrade.json or at_configupdate.json.

Though I've never used it yet, it avoids the possible issue where something is added to the code and there is a missing setting in a person's confignew.json file when they update.
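
As a purely hypothetical illustration of that mechanism (the keys below are made up), at_configupdate.json holds keys to add when missing and at_configdowngrade.json holds keys to remove, which is exactly how the merge code further down treats them:

at_configupdate.json:
{
    "some_new_setting": "default value"
}

at_configdowngrade.json:
{
    "some_obsolete_setting": ""
}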

Adding new TTS Engines & backing up its config.

So this logic uses these:

tts_engines_path = this_dir / "system" / "tts_engines" / "tts_engines.json"
new_engines_path = this_dir / "system" / "tts_engines" / "new_engines.json"
tts_engines_backup_path = tts_engines_path.with_suffix(".backup")  # Backup file path

and this is the more complicated one which would cause tts_server.py to fail.

Again, this was people telling me their config file was damaged....

So tts_engines.json manages the state of which TTS engine is currently loaded and which other TTS engines are available to be loaded, along with a default OR last-loaded model file: https://github.com/erew123/alltalk_tts/wiki/Configuration-File:-tts_engines.json. This file is not touched by a git pull and is protected, otherwise we would be changing people's settings every time they update.

new_engines.json is for adding a new tts_engine. So I used this the other day when I added the F5-TTS engine in. You will see it in here https://github.com/erew123/alltalk_tts/blob/alltalkbeta/system/tts_engines/new_engines.json

The way that logic works (https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L205) is: look at the two files and match up the names in them. If tts_engines.json already has that name within it, then just ignore it (which is why new_engines.json can have XTTS & Parler listed in it and not actually do anything to your tts_engines.json). But anything that is listed in new_engines.json and isn't listed in your tts_engines.json by name gets added into tts_engines.json and made available for use in tts_server.py etc.

As such, if tts_engines.json is corrupted completely, rebuilding it as a basic file without F5-TTS in it won't matter, as F5-TTS will get added back in when the code looks at new_engines.json.

Here is also an annotated breakdown of the code, which should show up nicely coloured in something like visual studio (or whatever you use to edit):

You can dump this in/replace this block of code if you want https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L69-L227

###########################################################
# SECTION 1: CONFIG FILE UPDATE SYSTEM FOR confignew.json
###########################################################
# Define paths for update configuration files for confignew.json
update_config_path = this_dir / "system" / "config" / "at_configupdate.json"     # Contains new settings to add
downgrade_config_path = this_dir / "system" / "config" / "at_configdowngrade.json"  # Contains settings to remove

def changes_needed(main_config, update_config, downgrade_config):
    """
    Determines if the main config needs updating by:
    1. Checking if any keys need to be removed (in downgrade_config)
    2. Checking if any new keys need to be added (in update_config)
    """
    # Check if any keys need to be removed
    for key in downgrade_config.keys():
        if key in main_config:
            return True
    # Check if any new keys need to be added
    for key, value in update_config.items():
        if key not in main_config:
            return True
    return False

def update_config(config_file_path, update_config_path, downgrade_config_path):
    try:
        # Load all configuration files
        with open(config_file_path, 'r') as file:
            main_config = json.load(file)          # The user's current config
        with open(update_config_path, 'r') as file:
            update_config = json.load(file)        # New settings to add
        with open(downgrade_config_path, 'r') as file:
            downgrade_config = json.load(file)     # Settings to remove

        # Only proceed if changes are actually needed
        if changes_needed(main_config, update_config, downgrade_config):
            # Create timestamped backup before making any changes
            timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
            backup_path = config_file_path.with_suffix(f".{timestamp}.bak")
            logging.info(f"Creating backup of the main config to {backup_path}")
            shutil.copy(config_file_path, backup_path)

            # Add new settings from update_config
            for key, value in update_config.items():
                if key not in main_config:
                    main_config[key] = value

            # Remove old settings listed in downgrade_config
            for key in downgrade_config.keys():
                if key in main_config:
                    del main_config[key]

            # Save the updated configuration
            with open(config_file_path, 'w') as file:
                json.dump(main_config, file, indent=4)

            print(f"[{branding}TTS] \033[92mConfig file update: \033[91mUpdates applied\033[0m")
        else:
            print(f"[{branding}TTS] \033[92mConfig file update: \033[93mNo Updates required\033[0m")

    except Exception as e:
        print(f"[{branding}TTS] \033[92mConfig file update: \033[91mError updating: {e}\033[0m")

# Execute the configuration update
update_config(config_file_path, update_config_path, downgrade_config_path)

###############################################################
# SECTION 2: TTS ENGINES MANAGEMENT SYSTEM for tts_engines.json
###############################################################
# Define paths for TTS engine configuration files
tts_engines_path = this_dir / "system" / "tts_engines" / "tts_engines.json"        # Current engine config
new_engines_path = this_dir / "system" / "tts_engines" / "new_engines.json"        # New engines to add
tts_engines_backup_path = tts_engines_path.with_suffix(".backup")                  # Backup location

def safe_load_json(file_path, backup_path=None):
    """
    Safely loads JSON files with error handling and backup restoration:
    1. Tries to load the original file
    2. If file is missing, tries to restore from backup
    3. If no backup, creates a new default configuration
    4. If file is corrupted, tries to restore from backup
    """
    try:
        # First attempt: Try to load the file normally
        with open(file_path, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        print(f"[{branding}TTS] File not found: {file_path}")

        if backup_path and os.path.exists(backup_path):
            # Second attempt: Try to restore from backup
            print(f"[{branding}TTS] Restoring from backup: {backup_path}")
            with open(backup_path, 'r') as f:
                data = json.load(f)
            # Save restored data back to original location
            with open(file_path, 'w') as f:
                json.dump(data, f, indent=4)
            print(f"[{branding}TTS] Restored {file_path} from backup.")
            return data
        else:
            # Third attempt: Create new default configuration
            print(f"[{branding}TTS] No backup found, creating a new default tts_engine JSON structure.")
            default_data = {
                "engine_loaded": "piper",              # Default engine to load
                "selected_model": "piper",             # Default model to use
                "engines_available": [                 # List of available engines
                    {
                        "name": "parler",
                        "selected_model": "parler - parler_tts_mini_v1"
                    },
                    {
                        "name": "piper",
                        "selected_model": "piper"
                    },
                    {
                        "name": "vits",
                        "selected_model": "vits - tts_models--en--vctk--vits"
                    },
                    {
                        "name": "xtts",
                        "selected_model": "xtts - xttsv2_2.0.3"
                    }
                ]
            }
            with open(file_path, 'w') as f:
                json.dump(default_data, f, indent=4)
            print(f"[{branding}TTS] Created new default file at: {file_path}")
            return default_data
    except json.JSONDecodeError as e:
        # Handle corrupted JSON files
        print(f"[{branding}TTS] JSON decoding error in file {file_path}: {e}")
        if backup_path and os.path.exists(backup_path):
            print(f"[{branding}TTS] Restoring from backup due to corrupted file: {backup_path}")
            with open(backup_path, 'r') as f:
                data = json.load(f)
            with open(file_path, 'w') as f:
                json.dump(data, f, indent=4)
            print(f"[{branding}TTS] Restored {file_path} from backup.")
            return data
        else:
            raise Exception(f"File {file_path} is corrupted and no backup is available.")

# Load the current TTS engines configuration
tts_engines_data = safe_load_json(tts_engines_path, tts_engines_backup_path)

# Load the list of new engines to potentially add
with open(new_engines_path, 'r') as f:
    new_engines_data = json.load(f)

# Get sets of current and new engines for comparison
current_engines = {engine['name'] for engine in tts_engines_data['engines_available']}
new_engines = new_engines_data['engines_available']

# Check each new engine to see if it should be added
for engine in new_engines:
    engine_name = engine['name']
    if engine_name not in current_engines:
        # Verify the engine's directory exists before adding
        engine_dir = this_dir / "system" / "tts_engines" / engine_name
        if engine_dir.is_dir():
            # Backup current configuration before modification
            shutil.copy(tts_engines_path, tts_engines_backup_path)

            # Add the new engine to the configuration
            tts_engines_data['engines_available'].append(engine)
            print(f"[{branding}TTS] \033[92mNew TTS Engine    : \033[91mAdded {engine_name}\033[0m")

            # Save the updated configuration
            with open(tts_engines_path, 'w') as f:
                json.dump(tts_engines_data, f, indent=4)
                print(f"{tts_engines_path} updated and saved successfully.")

####################################################
# SECTION 3: FINAL CONFIGURATION RELOAD
####################################################
# Reload the complete configuration with all updates
params = load_config(config_file_path)

Click the double arrow thing at the top right to show full screen.

flowchart TD
    Start[Program Start] --> ConfigCheck{Check Config Files}

    %% Config Update Process
    ConfigCheck --> LoadConfigs[Load confignew.json, update_config, and downgrade_config]
    LoadConfigs --> NeedChanges{Changes Needed?}
    NeedChanges -->|Yes| BackupConfig[Create Timestamped Backup.YYYYMMDDHHMMSS.bak]
    BackupConfig --> UpdateConfig[Apply Updates/Downgrades to confignew.json]
    NeedChanges -->|No| SkipConfig[Skip Config Update]

    %% TTS Engines Process
    ConfigCheck --> LoadTTSEngines{Load tts_engines.json}

    LoadTTSEngines -->|Success| CheckNew[Check new_engines.json for new engines]
    LoadTTSEngines -->|Fail| TryBackup{Backup Exists?}

    TryBackup -->|Yes| RestoreBackup[Restore from .backup file]
    TryBackup -->|No| CreateDefault[Create Default Configuration]

    CheckNew --> NewFound{New Engine Found?}
    NewFound -->|Yes| BackupTTS[Create .backup of tts_engines.json]
    BackupTTS --> AddEngine[Add New Engine to Configuration]
    NewFound -->|No| Complete

    RestoreBackup --> CheckNew
    CreateDefault --> CheckNew
    UpdateConfig --> Complete
    SkipConfig --> Complete
    AddEngine --> Complete

    Complete[Configuration Complete]

    subgraph "File Types"
        ConfigFiles["Configuration Files:
        - confignew.json (main config)
        - at_configupdate.json (additions)
        - at_configdowngrade.json (removals)
        - tts_engines.json (engine config)
        - new_engines.json (new engine definitions)"]
    end

    subgraph "Backup Types"
        BackupFiles["Backup Files:
        - .YYYYMMDDHHMMSS.bak (config changes)
        - .backup (TTS engine changes)"]
    end

Paladinium commented 2 weeks ago

@erew123 : Thanks for the explanations. I wonder where you find the time for all the things you are doing... Anyway, it really helped to see the full picture (e.g. I wasn't aware of new_engines.json).

Having thought about it for a while, I would like to make the following suggestion to simplify things while still avoiding the trouble some users had:

I would appreciate a quick response or just a thumbs up if you're fine with this.

erew123 commented 2 weeks ago

@Paladinium Literally burning the candle at both ends and not sleeping enough, is where I find time :/

Sorry this will be a messy answer, and apologies if I am somewhat re-covering ground. I guess there is a little more complexity/nuance you may need. This is the best I can recall at this time of day, without digging through all the code. Maybe have a look at this and a think and I should be available to respond quite quickly when you reply.

First off, your bit starting with "As a baseline: When saving a config file, always create a backup in case the save operation fails" sounds fine to me :)

As for "Regarding the TTS engines": I've written out a bit more of an explainer below covering some more detail on where/how tts_engines.json and new_engines.json are used... which may make them an absolute mess to re-work, and I have no firm conclusion on this at this moment in time. Other than to say, it would be an ass to re-work; not impossible, but an ass. It may well be easiest to keep them working as they are... but here's the detail:

So whatever is in engine_loaded and selected_model of the tts_engines.json file is what tts_server.py uses to know which TTS engine it is loading in and what model it is loading in.

Within tts_engines.json, this is used to tell script.py what additional pages it can load in the "TTS Engines Settings" area:

{
    "engine_loaded": "piper",
    "selected_model": "piper",
    "engines_available": [
        {
            "name": "parler",
            "selected_model": "parler - parler_tts_mini_v0.1"
        },
        {
            "name": "piper",
            "selected_model": "piper"
        },
        {
            "name": "vits",
            "selected_model": "vits - tts_models--en--vctk--vits"
        },
        {
            "name": "xtts",
            "selected_model": "xtts - xttsv2_2.0.3"
        },
        {
            "name": "f5tts",
            "selected_model": "f5tts - f5tts_v1"
        }
    ]
}

script.py's use of tts_engines.json

It loads in the full list of everything in the file for two purposes:

tts_engines.json loader https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L1327-L1339

It pulls in the following for use throughout Gradio:

        engines_available = [engine["name"] for engine in tts_engines_data["engines_available"]]
        engine_loaded = tts_engines_data["engine_loaded"]
        selected_model = tts_engines_data["selected_model"]

So let's say you create a directory for a new TTS engine called "NEWTTS" in the \alltalk_tts\system\tts_engines\NEWTTS folder.

When script.py looks at tts_engines.json, it uses the name field and:

https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L1385-L1397 https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L2245-L2251 https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L1897 https://github.com/erew123/alltalk_tts/blob/alltalkbeta/script.py#L1631

As I recall, we update tts_engines.json and then we push a reload command to tts_server.py, which in turn loads tts_engines.json and specifically uses engine_loaded and selected_model to decipher which TTS engine and model it is now to load/start up with.

    "engine_loaded": "piper",
    "selected_model": "piper",

So there is a reasonable amount that goes on in Gradio from using tts_engines.json.

tts_server.py's use of tts_engines.json

So as mentioned, tts_server.py loads in tts_engines.json on startup/reload and performs the same procedure, extending the paths used to reach model_engine.py for that TTS engine, e.g. xtts, piper, parler etc.

\alltalk_tts\system\tts_engines\NEWTTS\model_engine.py \alltalk_tts\system\tts_engines\NEWTTS\model_settings.json

https://github.com/erew123/alltalk_tts/blob/alltalkbeta/tts_server.py#L98-L102 https://github.com/erew123/alltalk_tts/blob/alltalkbeta/tts_server.py#L171-L192

model_engine.py is a class file that gets imported as part of the tts_server.py script and extends its features/abilities to work with the specific chosen TTS engine. model_settings.json tells AllTalk (API requests, the Gradio settings page and the Gradio generation page) what features the model is capable of supporting, and requests are allowed/denied accordingly.

The model_engine file for each TTS engine ALSO touches the tts_engines.json file... so perhaps this is the killer for changing these files around.

A quick summary of model_engine.py and what it's doing is:

If you want to understand a base model and its files, I suggest looking at the template folder for adding a new engine as the base code that is needed, including any JSON handling by a model_engine.py script is in there https://github.com/erew123/alltalk_tts/tree/alltalkbeta/system/tts_engines/template-tts-engine

The current process is something like this.....

flowchart TD
    classDef file fill:#e6f3ff,stroke:#333
    classDef ui fill:#f9f9f9,stroke:#333
    classDef process fill:#e6ffe6,stroke:#333

    tts_engines["tts_engines.json (Main Config)"]:::file
    new_engines["new_engines.json (Engine Addition Config)"]:::file

    subgraph script["script.py Functions"]
        direction TB
        load_engines["Load Engine List L1327-1339"]:::process
        build_ui["Build Generate UI L1897"]:::process
        build_settings["Build Settings Pages L2245-2251"]:::process
        handle_swap["Handle Engine Swaps L1631"]:::process
    end

    subgraph ui["Gradio Interface"]
        direction TB
        generate_page["Generate TTS Page - Engine Selection - Model Selection"]:::ui
        settings_tabs["TTS Engines Settings - Individual Engine Tabs - Engine-specific Settings"]:::ui
    end

    subgraph server["tts_server.py Functions"]
        direction TB
        server_load["Load Config L98-102"]:::process
        import_engine["Import model_engine.py L171-192"]:::process
    end

    tts_engines --> load_engines
    load_engines --> build_ui
    load_engines --> build_settings

    build_ui --> generate_page
    build_settings --> settings_tabs

    generate_page --> handle_swap

    tts_engines --> server_load
    server_load --> import_engine

    new_engines -.-> |"Add New Engines"| tts_engines

    %% Annotations
    note_generate["Build Generate Page: Engine dropdown, Model selection, Voice settings"]
    note_settings["Build Settings Pages: One tab per engine, Load engine settings, Configure capabilities"]
    note_server["Server Functions: Load current engine, Import engine class, Handle generation"]

    note_generate --- build_ui
    note_settings --- build_settings
    note_server --- import_engine

Paladinium commented 2 weeks ago

@erew123 : Got it, thanks for all those insights, this really helps.

As far as I can see, what I proposed would work. Maybe I have to be clearer:

tts_engines.json:

{
    "engine_loaded": "piper",
    "selected_model": "piper",
    "custom_engines_available": [
        {
            "name": "my new engine",
            "selected_model": "my new engine model"
        }
    ]
}

While default_tts_engines.json (currently new_engines.json) lists all models provided by you:

{
    "engines_available": [
        {
            "name": "parler",
            "selected_model": "parler - parler_tts_mini_v0.1"
        },
        {
            "name": "xtts",
            "selected_model": "xtts - xttsv2_2.0.3"
        },
        ...
    ]
}

The code loading and processing the configuration would then load both JSON files and merge custom_engines_available into engines_available in memory (and not by writing to any file). When accessing the TTS settings in code, you won't notice this merging, e.g.:

tts_config = AlltalkTTSEnginesConfig.get_instance()
print(tts_config.get_engine_names_available()) # returns ["parler", "xtts", ..., "my new engine"]
print(tts_config.engine_loaded) # returns "piper" in this example
print(tts_config.selected_model) # returns "piper" in this example

In other words: the application code still gets today's information about TTS engines as if it were directly read from tts_engines.json. I am pretty confident that this works properly for all usages in script.py, tts_server.py, model_engine.py, etc.
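
A rough sketch of that in-memory merge, under the assumptions in the JSON snippets above (load_merged_engines and default_tts_engines.json are illustrative names; nothing here is written back to disk):

import json
from pathlib import Path

def load_merged_engines(tts_engines_path: Path, default_engines_path: Path) -> dict:
    """Sketch only: merge the user's custom engines into the shipped defaults in memory."""
    with open(tts_engines_path) as f:
        user_config = json.load(f)      # engine_loaded, selected_model, custom_engines_available
    with open(default_engines_path) as f:
        defaults = json.load(f)         # engines_available shipped with AllTalk

    merged = dict(user_config)
    merged["engines_available"] = defaults["engines_available"] + user_config.get("custom_engines_available", [])
    merged.pop("custom_engines_available", None)
    return merged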

HOWEVER: it involves changing quite a lot of code at once in this MR, which I want to avoid. This means: as a first step and to move forward, I would keep all the JSON files as-is, including some additional logic as discussed in the previous comments.

erew123 commented 2 weeks ago

Hi @Paladinium Sorry for the 24-hour delay getting back to you. I was a little busy/caught up. Don't worry about the requirements file issues above; that was my doing. I've just replaced finetune with a completely new version I was working on and swapped out faster-whisper for openai-whisper, which meant I could also drop about 1GB off the install requirements, hence a couple of changes in the files.

Otherwise, should I be testing this now?

Thanks

Paladinium commented 2 weeks ago

@erew123 : The MR so far just contains the configuration classes without actual usage. Still working on it... I'll drop you a line when it's ready to be tested.

Paladinium commented 2 weeks ago

@erew123 : Ready. As mentioned, I only used the new config classes for tts_server.py as a first step. However, I will also change the other files you mentioned (script.py, firstrun.py, firstrun_tgwui.py, ...) in a separate MR (this one gets too big otherwise). Some notes:

But since you know the system in depth, you might have other testing ideas.

erew123 commented 1 week ago

@Paladinium I'm doing a full fresh install/re-validation today, so I'm aiming to test this at the end of that, when I know the current code is working! Will (hopefully) feed back to you within the next 24 hours.

erew123 commented 1 week ago

Hi @Paladinium

Happy Sunday etc! :) I am going through it now. Generally speaking all seems OK bar this on first load (below, with a code change). I'm working my way through updating the script.py file to work with your change. I've gotten the basic saving/loading of settings working; it's just updating everything else to use config.xxxx, then I'll test again etc. This is mostly just an interim update, just to say I'm looking at it :)

Traceback (most recent call last):
  File "E:\newtest\alltalk_tts\tts_server.py", line 66, in <module>
    load_config()
  File "E:\newtest\alltalk_tts\tts_server.py", line 55, in load_config
    config = AlltalkConfig.get_instance(force_reload)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\newtest\alltalk_tts\config.py", line 289, in get_instance
    AlltalkConfig.__instance = AlltalkConfig()
                               ^^^^^^^^^^^^^^^
  File "E:\newtest\alltalk_tts\config.py", line 283, in __init__
    self._load_config()
  File "E:\newtest\alltalk_tts\config.py", line 135, in _load_config
    self.__with_lock_and_backup(self.get_config_path(), False, __load)
  File "E:\newtest\alltalk_tts\config.py", line 165, in __with_lock_and_backup
    lock_path.unlink()
  File "E:\newtest\alltalk_tts\alltalk_environment\env\Lib\pathlib.py", line 1147, in unlink
    os.unlink(self)
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'E:\\newtest\\alltalk_tts\\confignew.lock'

So I changed this to have an "if FileNotFoundError then just keep going" type deal... hmm, is that the best way to deal with that?

def __with_lock_and_backup(self, path: Path, backup: bool, callable: Callable[[], None]):
    lock_path = path.with_suffix('.lock')
    backup_path = None
    try:
        with FileLock(lock_path):
            # Create backup:
            if path.exists() and backup:
                backup_path = path.with_suffix('.backup')
                shutil.copy(path, backup_path)

            try:
                callable()
            except Exception as e:
                if backup_path and backup_path.exists():
                    shutil.copy(backup_path, path)
                raise Exception(f"Failed to save config: {e}")
    finally:
        # Cleanup lock and backup files:
        if lock_path.exists():  # Only try to delete if it exists
            try:
                lock_path.unlink()
            except FileNotFoundError:
                pass  # Ignore if file doesn't exist

        if backup and backup_path and backup_path.exists():
            try:
                backup_path.unlink()
            except FileNotFoundError:
                pass  # Ignore if file doesn't exist

I might have a couple of other functions that probably should be pushed into your code, but I'll let you know when I've tested further and changed the main script.py. Of course, I'll be sure to post those up for you to get access to (theoretically within 24 hours!).

Thanks

Paladinium commented 1 week ago

@erew123 : Thanks for looking at this. I have now pushed firstrun.py, firstrun_tgwui.py and alltalk_welcome.py. I did not start to work on script.py, but I was going to. However, if you would like to do it yourself (which might be a good idea to get familiar with the new classes), go ahead.

Regarding your question ("So I changed this to have an 'if FileNotFoundError then just keep going' type deal... hmm, is that the best way to deal with that?"): I would just add the if lock_path.exists() and drop the except, since this should never happen. The reason why you got this problem and I don't is probably different OS implementations (I am using Linux, you are probably on Windows).
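
For reference, a minimal sketch of the simplified cleanup being suggested (the helper name is illustrative):

from pathlib import Path

def cleanup_lock_and_backup(lock_path: Path, backup_path: Path | None, backup: bool) -> None:
    """Sketch only: guard with exists() and drop the try/except around unlink()."""
    if lock_path.exists():
        lock_path.unlink()
    if backup and backup_path and backup_path.exists():
        backup_path.unlink()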

Here are some hints for you:

If you have trouble with script.py, just drop me a line and I'll take care of it in a separate MR. What matters to me most is that this change can be pushed to the beta branch before adding more code.

erew123 commented 1 week ago

@Paladinium It'll probably be tomorrow some time before I get back to you with this. I've got it possibly 100% working now, but I need to re-verify everything, and I might as well keep on with a bit of code tidy-up. I just spent 40 minutes figuring out the class/clazz thing, with multiple things trying to save/change the file and a "clazz" appearing in the config file. All corrected now, but I'll be sending you a small update to the config.py too!

Jeez, you've certainly done a good job with all this! Will get back to you when I can :)

Paladinium commented 1 week ago

@erew123 Oops, the clazz thing is probably this one. I was focused on reading the value, but forgot to write a test for writing it back as class instead of clazz.

Looking forward to your update for the config.py.

erew123 commented 1 week ago

@Paladinium Sorry it's taking a while, but I am getting there! I'm doing a code clean-up too because, well, some Gradio things have been annoying!

Part of that, I decided, is to add slightly better debug output and use a central print function. I will do tts_server.py and the other bits after we have your code imported. You will be able to flag a few other debug settings, and you can now see the last function the code was in too (if you want; shown in green), so you can easily decipher where your issue is probably occurring, which should hugely speed up debugging.
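
As an illustration only (not the actual implementation in script.py), a central debug print that reports the calling function could look roughly like this:

import inspect

def debug_print(message: str, show_func: bool = True) -> None:
    """Sketch only: print a message, optionally prefixed with the calling function's name in green."""
    if show_func:
        caller = inspect.stack()[1].function  # name of the function that called debug_print
        print(f"[AllTalk TTS] \033[92m{caller}\033[0m {message}")
    else:
        print(f"[AllTalk TTS] {message}")

def load_voices():
    debug_print("Loading voice list")

load_voices()  # -> [AllTalk TTS] load_voices Loading voice list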

What I was going to suggest is that I create a new branch on my repository, import your changes there, and apply my changes there. You can sync the branch to yourself if you like, and that will allow me/us to put changes in quickly and work on it. I'll be able to pull in any further changes you make immediately. When we are happy with whatever is in there, I'll push it up to the main branch. Does that sound like a good plan?

Thanks

erew123 commented 1 week ago

Second example (tts_server.py of course is not done yet), just showing script.py tracking function use on start-up:

erew123 commented 1 week ago

Re config.py: in short, I had a few minor cockups that occurred when I added in some new fields, e.g. debug_func, and I corrupted the config file. So I've made a change to it that:

- Added proper type hints and default values to all dataclass fields to ensure proper initialization
- Modified _handle_loaded_config to properly merge existing config data with default values
- Fixed the to_dict method to correctly convert dataclass instances when saving

So now behaviour is:

- If a field exists in confignew.json, its value is preserved
- If a field is missing, it's populated with the default value from the dataclass definition
- When saving, all fields (both existing and newly added) are properly written back to confignew.json

This also solves another issue I/we had with adding new fields to people's confignew.json when needed, as that logic was in script.py, which is now stripped out :)

You can find the main changes I made in the final _handle_loaded_config function and the to_dict function. I left a debug_me flag in both of those functions if you want to see what they are doing with the changes.

I'll drop it here, just so you can take a look... hopefully you think all is OK?

import os
import json
import time
import inspect
import shutil
from pathlib import Path
from filelock import FileLock
from types import SimpleNamespace
from dataclasses import dataclass
from abc import ABC, abstractmethod
from typing import Callable, Any, MutableSequence

@dataclass
class AlltalkConfigTheme:
    file: str | None = None
    # "class" is a reserved word in Python, so the JSON "class" key is held as "clazz"
    # and mapped back to "class" on save (see _handle_loaded_config and to_dict below).
    clazz: str = ""

@dataclass
class AlltalkConfigRvcSettings:
    rvc_enabled: bool = False
    rvc_char_model_file: str = "Disabled"
    rvc_narr_model_file: str = "Disabled" 
    split_audio: bool = True
    autotune: bool = False
    pitch: int = 0
    filter_radius: int = 3
    index_rate: float = 0.75
    rms_mix_rate: int = 1
    protect: float = 0.5
    hop_length: int = 130
    f0method: str = "fcpe"
    embedder_model: str = "hubert"
    training_data_size: int = 45000

@dataclass 
class AlltalkConfigTgwUi:
    tgwui_activate_tts: bool = True
    tgwui_autoplay_tts: bool = True
    tgwui_narrator_enabled: str = "false"
    tgwui_non_quoted_text_is: str = "character"
    tgwui_deepspeed_enabled: bool = False
    tgwui_language: str = "English"
    tgwui_lowvram_enabled: bool = False
    tgwui_pitch_set: int = 0
    tgwui_temperature_set: float = 0.75
    tgwui_repetitionpenalty_set: int = 10
    tgwui_generationspeed_set: int = 1
    tgwui_narrator_voice: str = "female_01.wav"
    tgwui_show_text: bool = True
    tgwui_character_voice: str = "female_01.wav"
    tgwui_rvc_char_voice: str = "Disabled"
    tgwui_rvc_char_pitch: int = 0
    tgwui_rvc_narr_voice: str = "Disabled"
    tgwui_rvc_narr_pitch: int = 0

@dataclass
class AlltalkConfigApiDef:
    api_port_number: int = 7851
    api_allowed_filter: str = "[^a-zA-Z0-9\\s.,;:!?\\-\\'\"$\\u0400-\\u04FF\\u00C0-\\u017F\\u0150\\u0151\\u0170\\u0171\\u011E\\u011F\\u0130\\u0131\\u0900-\\u097F\\u2018\\u2019\\u201C\\u201D\\u3001\\u3002\\u3040-\\u309F\\u30A0-\\u30FF\\u4E00-\\u9FFF\\u3400-\\u4DBF\\uF900-\\uFAFF\\u0600-\\u06FF\\u0750-\\u077F\\uFB50-\\uFDFF\\uFE70-\\uFEFF\\uAC00-\\uD7A3\\u1100-\\u11FF\\u3130-\\u318F\\uFF01\\uFF0c\\uFF1A\\uFF1B\\uFF1F]"
    api_length_stripping: int = 3
    api_max_characters: int = 2000
    api_use_legacy_api: bool = False
    api_legacy_ip_address: str = "127.0.0.1"
    api_text_filtering: str = "standard"
    api_narrator_enabled: str = "false"
    api_text_not_inside: str = "character"
    api_language: str = "en"
    api_output_file_name: str = "myoutputfile"
    api_output_file_timestamp: bool = True
    api_autoplay: bool = False
    api_autoplay_volume: float = 0.5

@dataclass
class AlltalkConfigDebug:
    debug_transcode: bool = False
    debug_tts: bool = False
    debug_openai: bool = False
    debug_concat: bool = False
    debug_tts_variables: bool = False
    debug_rvc: bool = False
    debug_func: bool = False

@dataclass
class AlltalkConfigGradioPages:
    Generate_Help_page: bool = True
    Voice2RVC_page: bool = True
    TTS_Generator_page: bool = True
    TTS_Engines_Settings_page: bool = True
    alltalk_documentation_page: bool = True
    api_documentation_page: bool = True

@dataclass
class AlltalkAvailableEngine:
    name: str = ""
    selected_model: str = ""

class AbstractJsonConfig(ABC):

    def __init__(self, config_path: Path | str, file_check_interval: int):
        self.__config_path = Path(config_path) if type(config_path) is str else config_path
        self.__last_read_time = 0  # Track when we last read the file
        self.__file_check_interval = file_check_interval

    def get_config_path(self):
        return self.__config_path

    def reload(self):
        self._load_config()
        return self

    def to_dict(self):
        # Remove private fields:
        without_private_fields = {}
        for attr, value in self.__dict__.items():
            if not attr.startswith("_"):
                without_private_fields[attr] = value
        return without_private_fields

    def _reload_on_change(self):
        # Check if config file has been modified and reload if needed
        if time.time() - self.__last_read_time >= self.__file_check_interval:
            try:
                most_recent_modification = self.get_config_path().stat().st_mtime
                if most_recent_modification > self.__last_read_time:
                    self.reload()
            except Exception as e:
                print(f"Error checking config file: {e}")

    def _load_config(self):
        self.__last_read_time = self.get_config_path().stat().st_mtime
        def __load():
            with open(self.get_config_path(), "r") as configfile:
                data = json.load(configfile, object_hook=self._object_hook())
            self._handle_loaded_config(data)
        self.__with_lock_and_backup(self.get_config_path(), False, __load)

    def _object_hook(self) -> Callable[[dict[Any, Any]], Any] | None:
        return lambda d: SimpleNamespace(**d)

    def _save_file(self, path: Path | None | str, default=None, indent=4):
        file_path = (Path(path) if type(path) is str else path) if path is not None else self.get_config_path()

        def custom_default(o):
            if isinstance(o, Path):
                return str(o)  # Convert Path objects to strings
            elif hasattr(o, '__dict__'):
                return o.__dict__  # Use the object's __dict__ if it exists
            else:
                raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")

        default = default or custom_default

        def __save():
            with open(file_path, 'w') as file:
                json.dump(self.to_dict(), file, indent=indent, default=default)

        self.__with_lock_and_backup(file_path, True, __save)

    def __with_lock_and_backup(self, path: Path, backup: bool, callable: Callable[[], None]):
        lock_path = path.with_suffix('.lock')
        backup_path = None
        try:
            with FileLock(lock_path):
                # Create backup:
                if path.exists() and backup:
                    backup_path = path.with_suffix('.backup')
                    shutil.copy(path, backup_path)

                try:
                    callable()
                except Exception as e:
                    if backup_path and backup_path.exists():
                        shutil.copy(backup_path, path)
                    raise Exception(f"Failed to save config: {e}")
        finally:
            # Cleanup lock and backup files:
            if lock_path.exists():  # Only try to delete if it exists
                try:
                    lock_path.unlink()
                except FileNotFoundError:
                    pass  # Ignore if file doesn't exist

            if backup and backup_path and backup_path.exists():
                try:
                    backup_path.unlink()
                except FileNotFoundError:
                    pass  # Ignore if file doesn't exist

    @abstractmethod
    def _handle_loaded_config(self, data):
        pass

class AlltalkNewEnginesConfig(AbstractJsonConfig):
    __instance = None
    __this_dir = Path(__file__).parent.resolve()

    def __init__(self, config_path: Path | str = os.path.join(__this_dir, "system", "tts_engines", "new_engines.json")):
        super().__init__(config_path, 5)
        self.engines_available: MutableSequence[AlltalkAvailableEngine] = []
        self._load_config()

    def get_engine_names_available(self):
        return [engine.name for engine in self.engines_available]

    @staticmethod
    def get_instance():
        if AlltalkNewEnginesConfig.__instance is None:
            AlltalkNewEnginesConfig.__instance = AlltalkNewEnginesConfig()
        AlltalkNewEnginesConfig.__instance._reload_on_change()
        return AlltalkNewEnginesConfig.__instance

    def _handle_loaded_config(self, data):
        self.engines_available = data.engines_available

    def get_engines_matching(self, condition: Callable[[AlltalkAvailableEngine], bool]):
        return [x for x in self.engines_available if condition(x)]

class AlltalkTTSEnginesConfig(AbstractJsonConfig):
    __instance = None
    __this_dir = Path(__file__).parent.resolve()

    def __init__(self, config_path: Path | str = os.path.join(__this_dir, "system", "tts_engines", "tts_engines.json")):
        super().__init__(config_path, 5)
        self.engines_available: MutableSequence[AlltalkAvailableEngine] = []
        self.engine_loaded = ""
        self.selected_model = ""
        self._load_config()

    def get_engine_names_available(self):
        return [engine.name for engine in self.engines_available]

    @staticmethod
    def get_instance(force_reload = False):
        if AlltalkTTSEnginesConfig.__instance is None:
            force_reload = False
            AlltalkTTSEnginesConfig.__instance = AlltalkTTSEnginesConfig()

        if force_reload:
            AlltalkTTSEnginesConfig.__instance.reload()
        else:
            AlltalkTTSEnginesConfig.__instance._reload_on_change()
        return AlltalkTTSEnginesConfig.__instance

    def _handle_loaded_config(self, data):
        # List of the available TTS engines:
        self.engines_available = self.__handle_loaded_config_engines(data)

        # The currently set TTS engine from tts_engines.json
        self.engine_loaded = data.engine_loaded
        self.selected_model = data.selected_model

    def __handle_loaded_config_engines(self, data):
        available_engines = data.engines_available
        available_engine_names = [engine.name for engine in available_engines]

        # Getting the engines that are not already part of the available engines:
        new_engines_config = AlltalkNewEnginesConfig.get_instance()
        new_engines = new_engines_config.get_engines_matching(lambda eng: eng.name not in available_engine_names)

        # Merge engines:
        return available_engines + new_engines

    def save(self, path: Path | str | None = None):
        self._save_file(path)

    def is_valid_engine(self, engine_name):
        return engine_name in self.get_engine_names_available()

    def change_engine(self, requested_engine):
        if requested_engine == self.engine_loaded:
            return self
        for engine in self.engines_available:
            if engine.name == requested_engine:
                self.engine_loaded = requested_engine
                self.selected_model = engine.selected_model
                return self
        return self

@dataclass
class AlltalkConfig(AbstractJsonConfig):
    __instance = None
    __this_dir = Path(__file__).parent.resolve()

    def __init__(self, config_path: Path | str = __this_dir / "confignew.json"):
        super().__init__(config_path, 5)
        self.branding = ""
        self.delete_output_wavs = ""
        self.gradio_interface = False
        self.output_folder = ""
        self.gradio_port_number = 0
        self.firstrun_model = False
        self.firstrun_splash = False
        self.launch_gradio = False
        self.transcode_audio_format = ""
        self.theme = AlltalkConfigTheme()
        self.rvc_settings = AlltalkConfigRvcSettings()
        self.tgwui = AlltalkConfigTgwUi()
        self.api_def = AlltalkConfigApiDef()
        self.debugging = AlltalkConfigDebug()
        self.gradio_pages = AlltalkConfigGradioPages()
        self._load_config()

    @staticmethod
    def get_instance(force_reload = False):
        if AlltalkConfig.__instance is None:
            force_reload = False
            AlltalkConfig.__instance = AlltalkConfig()

        if force_reload:
            AlltalkConfig.__instance.reload()
        else:
            AlltalkConfig.__instance._reload_on_change()
        return AlltalkConfig.__instance

    def get_output_directory(self):
        return self.__this_dir / self.output_folder

    def save(self, path: Path | str | None = None):
        self._save_file(path)

    def _handle_loaded_config(self, data):
        from dataclasses import fields, is_dataclass, asdict
        debug_me =False
        if debug_me:
            print("=== Loading Config ===")
            print(f"Initial data state: {vars(data)}")

        # Create new instances with defaults
        default_instances = {
            'debugging': AlltalkConfigDebug(),
            'rvc_settings': AlltalkConfigRvcSettings(),
            'tgwui': AlltalkConfigTgwUi(),
            'api_def': AlltalkConfigApiDef(),
            'theme': AlltalkConfigTheme(),
            'gradio_pages': AlltalkConfigGradioPages()
        }
        if debug_me:
            print("\nDefault values for each class:")
        for name, instance in default_instances.items():
            if debug_me:
                print(f"{name}: {asdict(instance)}")        
                # Show actual default values from dataclass
                print(f"Default values: {[(f.name, getattr(instance, f.name)) for f in fields(instance)]}")
            if hasattr(data, name):
                source = getattr(data, name)
                if debug_me:
                    print(f"Source data: {vars(source) if hasattr(source, '__dict__') else source}")

                for field in fields(instance):
                    if hasattr(source, field.name):
                        setattr(instance, field.name, getattr(source, field.name))
                    if debug_me:
                        print(f"Field {field.name}: {getattr(instance, field.name)}")

            setattr(self, name, instance)

        # Handle non-dataclass fields
        for n, v in inspect.getmembers(data):
            if hasattr(self, n) and not n.startswith("__") and not is_dataclass(type(getattr(self, n))):
                setattr(self, n, v)

        self.theme.clazz = data.theme.__dict__.get("class", data.theme.__dict__.get("clazz", ""))
        self.get_output_directory().mkdir(parents=True, exist_ok=True)

    def to_dict(self):
        from dataclasses import is_dataclass, asdict
        debug_me = False
        if debug_me:
            print("=== Converting to dict ===")
        result = {}

        for key, value in vars(self).items():
            if not key.startswith('_'):
                # print(f"\nProcessing {key}:")
                if is_dataclass(value):
                    # print(f"Dataclass value before conversion: {vars(value)}")
                    result[key] = asdict(value)
                    # print(f"Converted to dict: {result[key]}")
                elif isinstance(value, SimpleNamespace):
                    # print(f"SimpleNamespace value: {value.__dict__}")
                    result[key] = value.__dict__
                else:
                    # print(f"Regular value: {value}")
                    result[key] = value

        if 'theme' in result:
            if debug_me:
                print("\nProcessing theme:")
                print(f"Before class handling: {result['theme']}")
            result['theme']['class'] = self.theme.clazz
            result['theme'].pop('clazz', None)
            if debug_me:
                print(f"After class handling: {result['theme']}")
        if debug_me:
            print(f"\nFinal dict: {result}")           
        return result
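
For illustration only, here is a minimal sketch of how the engine-switching part of the above is meant to be used from elsewhere in the code. The engine name is just a placeholder, not an actual engine in the repo:

    from config import AlltalkTTSEnginesConfig

    # Grab the shared instance, check the requested engine exists, switch to it
    # and write the change back to tts_engines.json.
    tts_engines_config = AlltalkTTSEnginesConfig.get_instance()
    if tts_engines_config.is_valid_engine("example_engine"):  # placeholder engine name
        tts_engines_config.change_engine("example_engine").save()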

Also, the very first thing I am having script.py do after the imports is this:

# Configuration file management for confignew.json
from config import AlltalkConfig, AlltalkTTSEnginesConfig, AlltalkNewEnginesConfig
def initialize_configs():
    """Initialize all configuration instances"""
    config = AlltalkConfig.get_instance()
    tts_engines_config = AlltalkTTSEnginesConfig.get_instance()
    new_engines_config = AlltalkNewEnginesConfig.get_instance()
    return config, tts_engines_config, new_engines_config

# Load in configs
config, tts_engines_config, new_engines_config = initialize_configs()
config.save()  # Force the config file to save in case it was missing any new settings

The save just makes sure that, if any new fields had to be populated, the file gets written back out. So should any other bit of code want to look at confignew.json for something (which hopefully no other code will do, other than centrally, in future), at least everything would be in it and it's fully populated. I've also added a save just as you Ctrl+C too.
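
For clarity, the Ctrl+C save is nothing fancy. A minimal sketch of the idea (the handler name is illustrative, not the exact code in script.py):

    import signal
    import sys

    def _save_config_on_interrupt(signum, frame):
        # Persist any in-memory config changes before the process exits on Ctrl+C
        config.save()
        sys.exit(0)

    signal.signal(signal.SIGINT, _save_config_on_interrupt)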

Will build that branch soon. If you're happy with the above code and see this before I've done anything, and you just want to push the change into this PR, I'm more than happy with that :)

Thanks

Paladinium commented 1 week ago

@erew123 : Thanks for making those improvements. I see what you want to achieve. Here are my 2 cents:

Is there anything I need to do now?

erew123 commented 1 week ago

@Paladinium I'm just cracking through a few other bits, which will be X hours I guess. I'll create the branch in a moment and pull your changes into that. I'll finish the code I'm doing and dump that in there. I will give it a from-scratch download/test and confirm back to you to check/see what you think etc. and, pending where we get to from there, merge that into the alltalkbeta branch.

I just didn't want to put it into alltalkbeta yet because people are downloading from there at around 70 clones a day atm, so I'd rather keep it in a separate branch for now before getting 70x "X didn't work" emails. Obviously, I don't want to lose your PR/commit showing up in the history either.

So no, nothing for you to do just now, but I will confirm back here later once I've pushed everything in and it's in a stable state where we can properly test/do the final merge!

Thanks for bearing with me.

erew123 commented 1 week ago

@Paladinium Saw your post on the other thread. Just an update here... I have maybe 40-60 lines of code left to check through, then I just need to re-validate a few things. I'm hoping to have something up in a few hours (god willing etc). Thanks for your patience :)

I will have the script.py & tts_server.py, config.py (with some extra bits in the debug section), and I've managed to strip out all the code for the Text-gen-webui from the main script, so that it now imports from the Text-generation-webui remote extension (so 1x bit of code functioning in 2x different ways). And of course a massive code clean-up.

I still have some 100% validation checks to run on everything, and then I will also tidy things up in the actual tts engines so they have better debug reporting (as they are a bit of a mess currently). But debugging/error tracking should be pretty fast now across these scripts!! :)

Will get those bits of code uploaded as soon as I can and let you know!

Paladinium commented 1 week ago

@erew123 : Wow, amazing. Looking forward to it! I am not in a hurry though... do it at your own pace.

erew123 commented 1 week ago

@Paladinium Took a break, had some food and got back to it. I'm sure there will be something I have missed, but it all checks out at the moment, so I'll upload my changes, do a full from-scratch setup run and see how it goes. So far, though, it all checks out! (Yes, those errors are meant to be there; I updated the test package I have to test errors and make sure the user doesn't get back a block of code.) I'll upload that too so that people can use it to verify any code changes.

Will let you know how it all goes!

image

erew123 commented 1 week ago

Hi @Paladinium

As far as I can tell the new config.py is working great! :) Thanks so much for your work on it, it really has helped!

As far as I can tell, all is good!

Also my updates to script.py and tts_server.py seem to be good!

I'm just working on tidying up the documentation and gradio interface a bit now that things have been shifted around!

I assume you don't want the test folder in the main downloads anymore? As it was just for testing... right?

Unless you have anything else to add/change or wish to run your own tests, I think it would be ok to merge to alltalkbeta...

Oh, and re the gradio interface/docker thing: I will take a look at that and see if I can figure out what could be going wrong there!

Let me know!

Thanks again!

Paladinium commented 1 week ago

@erew123 : Well done, thanks! I ran some quick tests on the configrework branch and it looks good to me.

Regarding your question about the test folder: I would like to keep it! What I would like to do for you is make a suggestion for using GitHub Actions to automatically run the tests.

If you want to maintain this project for a longer time, make larger changes and embrace other contributors, you won't manage it without any tests. In fact, it would be nice if you would also start writing tests for functionality you add ;-). So if you don't mind, leave it for now and I'll try to provide the code to run the tests via GitHub Actions. If it doesn't work for some reason, we can still delete the folder.
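
To give an idea of what those automated tests could look like, here is a rough sketch only; the file name and test cases are made up for illustration, and the GitHub Actions workflow would essentially just install the requirements and run pytest:

    # tests/test_config.py (hypothetical location)
    from config import AlltalkConfig

    def test_get_instance_returns_singleton():
        # get_instance() should always hand back the same shared object
        assert AlltalkConfig.get_instance() is AlltalkConfig.get_instance()

    def test_save_writes_config_file(tmp_path):
        # Saving to a temporary path should produce a populated JSON file
        config = AlltalkConfig.get_instance()
        target = tmp_path / "confignew.json"
        config.save(target)
        assert target.exists()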

erew123 commented 1 week ago

@Paladinium I was actually just about to reply again! I am still adding/changing a few little bits in script.py to tidy up the documentation, the interface and that docker issue. tts_server.py should remain the same (as far as I can tell).

I did use to have a GitHub Actions workflow (someone else made it), but the install requirements for a build went over the free tier limit. It should now be under the size limit again, as I've cleaned up the requirements quite a bit! I managed to drop at least 1.5GB off the install when I re-wrote finetuning the other week (plus that will clear off a fair amount of cached bits during the install), so I think it will easily fit in the GitHub Actions testing again. It would be fantastic to have that again! :) Thanks so much for your help with this!

What I was going to ask you is: do you want any specific debugging flags that would help you with anything (docker etc.)? I did add a debug_gradio_ip flag earlier, which will tell you what the gradio interface thinks it's doing if it's in a docker or Google Colab build. Happy to add anything else if you think there's value. These are all the debug flags I've added: https://github.com/erew123/alltalk_tts/wiki/AllTalk-V2-Debug-Options
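
As a rough illustration of the pattern (the exact field access and message below are assumptions, not the real logging code), using one of those flags is just a matter of reading it off the debugging section of the config:

    from config import AlltalkConfig

    config = AlltalkConfig.get_instance()
    # Only emit the extra Gradio/IP diagnostics when the flag is enabled in confignew.json
    if config.debugging.debug_gradio_ip:
        print("[AllTalk DEBUG] Gradio interface address details ...")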

Thanks

Paladinium commented 1 week ago

@erew123 I think the debug flags are fine for now, thanks.