AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd party software via JSON calls.
You'll see two scripts: compare_and_merge.py and expand_xtts.py.
I didn't do any integration with AllTalk, so these scripts can be run as is, standalone.
Steps to use:
Run the AllTalk finetune and check the BPE tokenizer box to train a new tokenizer during transcription.
Begin transcription.
When transcription is complete, you will have a bpe_tokenizer-vocab.json.
Open compare_and_merge.py and fill in the file paths for the base model files and the new vocab.
Run compare_and_merge.py.
You now have an expanded_vocab.json.
Open expand_xtts.py and fill in the file paths.
Run expand_xtts.py.
You now have an expanded base XTTSv2 model, expanded_model.pth, and its paired expanded_vocab.json.
Remove the base XTTSv2 model from /alltalk_tts/models/xtts/xttsv2_2.0.3/model.pth.
Remove the base vocab.json from /alltalk_tts/models/xtts/xttsv2_2.0.3/vocab.json.
Place expanded_model.pth and expanded_vocab.json in place of the removed base files at /alltalk_tts/models/xtts/xttsv2_2.0.3/ and rename them to model.pth and vocab.json.
That's it. You can now begin fine-tuning as normal.
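For anyone who would rather script the last three steps than move files by hand, here is a rough sketch of the swap. It is not part of this PR. The function name `install_expanded_files` and the `.base` backup suffix are made up for illustration, and it keeps a backup of the base files instead of deleting them outright, which is my own assumption about what is safest.

```python
import shutil
from pathlib import Path


def install_expanded_files(model_dir, expanded_model, expanded_vocab,
                           backup_suffix=".base"):
    """Swap the expanded model/vocab into model_dir as model.pth / vocab.json.

    Hypothetical helper: the originals are renamed with backup_suffix
    (an assumption, the steps above simply remove them).
    """
    model_dir = Path(model_dir)
    targets = {
        Path(expanded_model): model_dir / "model.pth",
        Path(expanded_vocab): model_dir / "vocab.json",
    }

    # Fail before touching anything if an expanded file is missing.
    for src in targets:
        if not src.is_file():
            raise FileNotFoundError(f"expanded file not found: {src}")

    installed = []
    for src, dst in targets.items():
        backup = dst.with_name(dst.name + backup_suffix)
        # Only back up once, so a second run never overwrites the
        # original base files with an already-expanded copy.
        if dst.exists() and not backup.exists():
            shutil.move(str(dst), str(backup))
        shutil.copy2(str(src), str(dst))
        installed.append(str(dst))
    return installed
```

Usage would look something like `install_expanded_files("/alltalk_tts/models/xtts/xttsv2_2.0.3", "expanded_model.pth", "expanded_vocab.json")`, with the last two paths pointing at wherever expand_xtts.py wrote its output.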
You'll find each file commented with more detail about what's going on. I also switched the script to use a rotating port, because on cloud instances in particular it's very common that you exit the script and the port stays open for a while, causing an open port issue. Rotating the ports avoids having to manually go in and change the port each time. To be honest, I accidentally pushed with that change in there. Feel free to toss it out if it's beyond the scope of this PR or not something you wish to include.
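To give an idea of what I mean by rotating the port, here is a minimal sketch of the general technique. It is not the exact code in the pushed script. The name `find_free_port`, the starting port, and the number of attempts are all placeholders chosen for illustration.

```python
import socket


def find_free_port(start=7860, attempts=20, host="127.0.0.1"):
    """Return the first port in [start, start + attempts) that can be bound.

    Hypothetical helper: start and attempts are illustrative defaults,
    not values taken from the finetune script.
    """
    for port in range(start, start + attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # bind() raises OSError if the port is still held.
                sock.bind((host, port))
            except OSError:
                continue
            # The probe socket is closed on return, freeing the port
            # for the real server to claim.
            return port
    raise RuntimeError(
        f"no free port found in range {start}-{start + attempts - 1}"
    )
```

The returned port would then be passed to whatever launches the web UI. There is a small window between the probe socket closing and the server binding in which another process could take the port, so this is a convenience, not a guarantee.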
I just realized I pushed some scripts that only had changes in them from running the env. I had just woken up, lol. Since you haven't seen this yet, I'm going to close it and send a clean PR.