311-code closed 8 months ago
Hi @brentjohnston
I'll move this over to the Discussions area.
With the 2.0.3 model, when they first released it, my god did it sound terrible: https://github.com/coqui-ai/TTS/discussions/3306
I'm not sure if they messed something up in the model's configuration file. However, I believe they did correct this... eventually.
I'd welcome people to play with it and give feedback. If enough people report that it's better overall, I can swap the base model used in the initial setup of AllTalk.
You can actually download the model into a \extensions\alltalk_tts\models\xttsv2_2.0.3
folder and change the path in the settings page.
If people want to test it out and give feedback, I'm happy to listen.
Thanks
My results are a bit different from erew123's chat findings from back in November here, and I'm not sure why. I just tried training over the Coqui v2 2.0.3 model and got less of that "feeling they are reading from a script" when the voice talks.
Accuracy was similar or better, and the flow was better. I'd recommend trying it out again with the method below just to make sure. Link here: https://huggingface.co/coqui/XTTS-v2/tree/v2.0.3
For anyone here who wants to try: back up the files in
\extensions\alltalk_tts\models\xttsv2_2.0.2
then download all the new files and put them in there. Train over the base model again. Not sure if this was the difference: I merged 3 of the most accurate samples from the previous trainings in the voices folder into one .wav file and placed it in the
\extensions\alltalk_tts\finetune\put-voice-samples-in-here
folder, used that for finetuning (in addition to the original dataset), then used that same merged .wav file for the voice selection.
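If anyone wants to script the backup and the sample merge rather than doing them by hand, here is a rough sketch using only the Python standard library. It is not part of AllTalk: the function names `backup_model_dir` and `merge_wavs` are made up for illustration. It assumes the samples are plain PCM .wav files with the same channel count, sample width, and sample rate (the `wave` module cannot read compressed audio or resample).

```python
import shutil
import wave
from pathlib import Path


def backup_model_dir(model_dir, backup_dir):
    """Copy the whole model folder to backup_dir before overwriting any files."""
    model_dir, backup_dir = Path(model_dir), Path(backup_dir)
    if backup_dir.exists():
        # Refuse to clobber an earlier backup.
        raise FileExistsError(f"backup already exists: {backup_dir}")
    shutil.copytree(model_dir, backup_dir)
    return backup_dir


def merge_wavs(sample_paths, merged_path):
    """Concatenate PCM WAV files that share the same audio format into one file."""
    params = None
    chunks = []
    for path in sample_paths:
        with wave.open(str(path), "rb") as src:
            fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = fmt
            elif fmt != params:
                # Mixing formats would produce a garbled file, so stop here.
                raise ValueError(f"{path} has format {fmt}, expected {params}")
            chunks.append(src.readframes(src.getnframes()))
    if params is None:
        raise ValueError("no samples given")
    with wave.open(str(merged_path), "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        for chunk in chunks:
            dst.writeframes(chunk)
    return Path(merged_path)
```

Usage would look something like `backup_model_dir(r"extensions\alltalk_tts\models\xttsv2_2.0.2", r"extensions\alltalk_tts\models\xttsv2_2.0.2_backup")` followed by `merge_wavs(["sample1.wav", "sample2.wav", "sample3.wav"], r"extensions\alltalk_tts\finetune\put-voice-samples-in-here\merged.wav")`. The backup folder name and the sample file names are placeholders, so swap in your own.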