erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with third-party software via JSON calls.
GNU Affero General Public License v3.0
1.15k stars 118 forks

Coqui v2 2.0.3 sounding better somehow. #122

Closed 311-code closed 8 months ago

311-code commented 8 months ago

This is a bit different from erew123's chat findings from back in November here, and I'm not sure why. I just tried training over the Coqui v2 2.0.3 model, and the voice has less of a "reading from a script" feel when it talks.

Accuracy is similar or better, and the flow is better. I'd recommend trying it out again with the method below just to make sure. Link here: https://huggingface.co/coqui/XTTS-v2/tree/v2.0.3

For anyone here who wants to try: back up the files in \extensions\alltalk_tts\models\xttsv2_2.0.2, download all the new files, and put them in that folder. Then train over the base model again.
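The backup-and-replace steps above could be scripted roughly like this. It is only a sketch: the function names and the `_backup` suffix are my own, and the download part assumes you have the `huggingface_hub` package installed and that the `v2.0.3` revision from the link above is what you want. Adjust the path to wherever your AllTalk install lives.

```python
import shutil
from pathlib import Path


def backup_model_dir(model_dir, backup_suffix="_backup"):
    """Copy the existing model folder next to itself before overwriting it.

    The "_backup" suffix is an arbitrary choice for this sketch.
    """
    model_dir = Path(model_dir)
    backup_dir = model_dir.with_name(model_dir.name + backup_suffix)
    if backup_dir.exists():
        # Refuse to clobber an earlier backup.
        raise FileExistsError(f"Backup already exists: {backup_dir}")
    shutil.copytree(model_dir, backup_dir)
    return backup_dir


def download_xtts_v203(model_dir):
    """Fetch the v2.0.3 files from the coqui/XTTS-v2 repo into model_dir.

    Assumes huggingface_hub is installed; imported here so the backup
    helper above still works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="coqui/XTTS-v2",
        revision="v2.0.3",
        local_dir=str(model_dir),
    )


# Example (path is the one mentioned above, relative to the webUI folder):
# model_dir = Path("extensions") / "alltalk_tts" / "models" / "xttsv2_2.0.2"
# backup_model_dir(model_dir)
# download_xtts_v203(model_dir)
```

If anything sounds worse afterwards, copying the backup folder back over the model folder restores the 2.0.2 files.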

Not sure if this was the difference: I took 3 of the most accurate samples from the previous trainings in the voices folder and merged them into one .wav file. I placed that file in the \extensions\alltalk_tts\finetune\put-voice-samples-in-here folder and used it for finetuning (in addition to the original dataset), then used that same merged .wav file for the voice selection.
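For anyone who would rather merge the samples with a script than an audio editor, here is a minimal sketch using only Python's standard `wave` module. The function name and file names are made up for illustration, and it assumes all the samples are plain PCM .wav files with the same channel count, sample width, and sample rate (it raises an error otherwise rather than resampling).

```python
import wave


def merge_wavs(input_paths, output_path):
    """Concatenate PCM .wav files end to end into a single .wav file."""
    if not input_paths:
        raise ValueError("No input files given")

    fmt = None      # (channels, sample width in bytes, frame rate)
    chunks = []     # raw audio frames from each input file
    for path in input_paths:
        with wave.open(str(path), "rb") as src:
            this_fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                # Mixing formats would produce garbled audio, so stop here.
                raise ValueError(f"{path} has a different format: {this_fmt} vs {fmt}")
            chunks.append(src.readframes(src.getnframes()))

    with wave.open(str(output_path), "wb") as dst:
        dst.setnchannels(fmt[0])
        dst.setsampwidth(fmt[1])
        dst.setframerate(fmt[2])
        for chunk in chunks:
            dst.writeframes(chunk)
    return output_path


# Example (hypothetical file names):
# merge_wavs(["sample1.wav", "sample2.wav", "sample3.wav"], "merged.wav")
```

The merged file can then go in the put-voice-samples-in-here folder as described above.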

erew123 commented 8 months ago

Hi @brentjohnston

I'll move this over to the Discussions area.

When they first released the 2.0.3 model, my god did it sound terrible: https://github.com/coqui-ai/TTS/discussions/3306

I'm not sure if they messed something up with the model's configuration file. However, I believe they did correct this... eventually.

I'd welcome people to play with it and give feedback. If enough people report that it's better overall, I can swap the base model used on initial setup of AllTalk.

You can actually download the model into a \extensions\alltalk_tts\models\xttsv2_2.0.3 folder and change the path in the settings page.

If people want to test it out and give feedback, I'm happy to listen.

Thanks