erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd party software via JSON calls.
GNU Affero General Public License v3.0
816 stars, 91 forks

Config page provides no way to load a fine-tuned model #171

Closed: Artem-B closed this issue 4 months ago

Artem-B commented 4 months ago

https://github.com/erew123/alltalk_tts/blob/fff01107c89cea9e7f0405e247da0f39637dc957/system/at_admin/at_settings.html#L249

Similarly, there's no way to select a finetuned model via the SillyTavern extension.

If there's an existing way to load a finetuned model and make it available in SillyTavern, I didn't manage to find it. Updating the documentation would be helpful.

erew123 commented 4 months ago

Hi @Artem-B

I've had a quick look and it must be something that has changed. I know the way variables are saved changed in ST, so maybe it's related. AT is certainly telling ST that a finetuned model is available:

image

It's just not showing the option in ST now. As I say, I'll have to take a deeper dive on this one and try to see what's occurring with that.

As a temporary workaround, as long as you have the finetuned model stored in the trainedmodel folder, it should be detected on AllTalk startup:

image

In a command prompt/terminal, you can manually force a load of the model with:

curl -X POST "http://127.0.0.1:7851/api/reload?tts_method=XTTSv2%20FT"

That would at least get you 80% of the way there.
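
For anyone scripting this rather than typing it into a terminal, the same request can be sketched in JavaScript. This is only a minimal sketch: the names `ALLTALK_BASE_URL`, `buildReloadUrl` and `reloadModel` are made up for illustration, and only the `/api/reload` endpoint, the `tts_method` parameter and the default address come from the curl command above. It assumes a runtime with a global `fetch` (a modern browser or Node 18+).

```javascript
// Hypothetical helper names; only the endpoint, the tts_method parameter and
// the default address are taken from the curl command in this thread.
const ALLTALK_BASE_URL = "http://127.0.0.1:7851";

// Build the reload URL. encodeURIComponent turns the space in "XTTSv2 FT"
// into %20, matching the hand-written curl URL.
function buildReloadUrl(ttsMethod, baseUrl = ALLTALK_BASE_URL) {
  return `${baseUrl}/api/reload?tts_method=${encodeURIComponent(ttsMethod)}`;
}

// Send the same POST request as the curl command.
// Assumes a global fetch is available (modern browsers, Node 18+).
async function reloadModel(ttsMethod) {
  const response = await fetch(buildReloadUrl(ttsMethod), { method: "POST" });
  if (!response.ok) {
    throw new Error(`Reload failed with HTTP status ${response.status}`);
  }
  return response;
}
```

Calling `reloadModel("XTTSv2 FT")` should then be equivalent to the curl command, provided AllTalk is running on the default address.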

As I say, I'll have to take a look at what ST is or isn't doing that's stopping it from presenting the option in the dropdown list.

Thanks

Artem-B commented 4 months ago

The finetuned model is detected, but it does not get loaded via the ST UI.

[AllTalk Startup] AllTalk Github updated : 6th April 2024 at 10:45
[AllTalk Startup] Finetuned model        : Detected
[AllTalk Startup] TTS Subprocess         : Starting up
[AllTalk Startup]
[AllTalk Startup] AllTalk Settings & Documentation: http://127.0.0.1:7851
[AllTalk Startup]
[AllTalk Model] XTTSv2 Local Loading xttsv2_2.0.2 into cuda
[AllTalk Model] Coqui Public Model License
[AllTalk Model] https://coqui.ai/cpml.txt
[AllTalk Model] Model Loaded in 7.87 seconds.
[AllTalk Model] Ready

Manually loading the finetuned model with curl works. Thank you for the hint.

AFAICT, the ST extension never sets tts_method=XTTSv2%20FT. All I can see in the Chrome console is http://localhost:7851/api/reload?tts_method=XTTSv2%20Local.

erew123 commented 4 months ago

Hi @Artem-B

I managed to figure out what it was and have fixed it:

image

Go here: https://github.com/erew123/alltalk_tts/blob/main/system/st_files/alltalk.js and click the download button:

image

Then save that file over the top of alltalk.js in \SillyTavern\public\scripts\extensions\tts

image

All should be working! I'll have to send a PR over to ST to get it updated there at some point.

Thanks

Artem-B commented 4 months ago

https://github.com/erew123/alltalk_tts/blob/fff01107c89cea9e7f0405e247da0f39637dc957/system/st_files/alltalk.js#L284

Looks like we're comparing with a string 'true' instead of a boolean true here.

Update: looks like it's exactly what you've fixed. :-)
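
For readers who haven't hit this class of bug before, here is a minimal sketch of how a string-versus-boolean comparison can silently hide an option. The `finetuned_model` field name and both helper functions are hypothetical and are not copied from alltalk.js; the sketch assumes the value arrives as a boolean while the check expects the string 'true'.

```javascript
// Hypothetical settings object; the real field name in alltalk.js may differ.
const settingsFromServer = { finetuned_model: true };

// Buggy check: with strict equality, a boolean true never equals the string
// 'true', so the finetuned option would never be offered.
function hasFinetunedModelBuggy(settings) {
  return settings.finetuned_model === 'true';
}

// Tolerant check: accepts either a boolean true or the string 'true'.
function hasFinetunedModel(settings) {
  return settings.finetuned_model === true || settings.finetuned_model === 'true';
}

console.log(hasFinetunedModelBuggy(settingsFromServer)); // false
console.log(hasFinetunedModel(settingsFromServer));      // true
```

Accepting both forms is just one defensive option; comparing against the boolean alone is enough if the value is always sent as a real boolean.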

Artem-B commented 4 months ago

I can confirm that XTTSv2 FT is now visible in ST, and that the finetuned model gets loaded when it's selected.