JarodMica / StyleTTS-WebUI

MIT License

train.txt issue (Phonemizer false positive) #31

Closed: shakenbake15 closed this issue 1 month ago

shakenbake15 commented 2 months ago

Jarod,

I had trouble training a dataset after I had already successfully trained one. The Phonemizer step reported success immediately without actually running. I found that the lines in the train.txt file did not have the "|0" at the end. When I added it to each line, the step ran fine. I'm wondering whether this is unique to me, possibly the result of something I did to my install, or whether you have a guess as to why it happened with my most recent training data but not my first training file.

P.S. If you have time to respond, I'd appreciate it. I have a crudely modified version of your audiobook maker working with this webui. First results were good, but I do notice that the speaker sometimes slows down or speeds up randomly. Is that a StyleTTS thing, or does my fine-tuned model possibly need some work? I am still investigating; it could be due to needing stricter sentence-length rules. Note: with StyleTTS I'm getting no hallucinations or repeated words like with Tortoise, so that's a positive.

JarodMica commented 1 month ago

If for some reason the speakerID checkbox is off, it won't add the "|0" to each line.

It should be checked by default. I kinda just left it there from the Tortoise GUI, but since StyleTTS needs it, I guess I'll actually hide it.
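For anyone who already has a train.txt that was generated with the checkbox off, here is a minimal sketch of the manual fix described above: appending "|0" to every line that lacks it. It assumes the file is pipe-delimited with an integer speaker ID as the last field; the function names `add_speaker_id` and `fix_file` are hypothetical and are not part of the webui.

```python
import re
from pathlib import Path

# Assumption: a line already has a speaker ID if it ends in "|<digits>".
# A transcript that itself ends in "|<digits>" would be misread as complete.
_HAS_SPEAKER_ID = re.compile(r"\|\d+$")


def add_speaker_id(lines, speaker_id="0"):
    """Return the lines with "|<speaker_id>" appended where it is missing."""
    fixed = []
    for line in lines:
        stripped = line.rstrip()  # drop trailing whitespace and newlines
        # Leave blank lines alone; only touch lines that lack the final field.
        if stripped and not _HAS_SPEAKER_ID.search(stripped):
            stripped = f"{stripped}|{speaker_id}"
        fixed.append(stripped)
    return fixed


def fix_file(path, speaker_id="0"):
    """Rewrite the file at `path` in place (hypothetical helper)."""
    p = Path(path)
    lines = p.read_text(encoding="utf-8").splitlines()
    p.write_text("\n".join(add_speaker_id(lines, speaker_id)) + "\n", encoding="utf-8")
```

Calling `fix_file("train.txt")` on a copy of the file first is the safer route, since the rewrite is done in place.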

As for slowing down and speeding up, that seems to be a quirk of StyleTTS. You can play around with the sigma value in the config file to kinda help with that.

shakenbake15 commented 1 month ago

Thanks. After training a new voice and tweaking the text splitting, it's working much better. I'm not exactly sure which change fixed the speed-up and slow-down issues I was having. The long sentences were probably the cause, but I don't know for certain.
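As a rough illustration of the kind of sentence-length rule mentioned above, here is a minimal sketch that splits text into sentences and then breaks overly long sentences at clause punctuation. This is not the audiobook maker's actual splitter; the function name `split_text` and the 200-character default are assumptions chosen for the example.

```python
import re


def split_text(text, max_chars=200):
    """Split text into chunks of at most max_chars where punctuation allows."""
    # First pass: split after sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    for sentence in sentences:
        if len(sentence) <= max_chars:
            chunks.append(sentence)
            continue
        # Second pass: the sentence is too long, so break it after commas,
        # semicolons, or colons and pack the clauses greedily.
        current = ""
        for clause in re.split(r"(?<=[,;:])\s+", sentence):
            candidate = f"{current} {clause}".strip()
            if len(candidate) <= max_chars or not current:
                # A single clause longer than max_chars is kept whole.
                current = candidate
            else:
                chunks.append(current)
                current = clause
        if current:
            chunks.append(current)
    return chunks
```

Each returned chunk would then be synthesized separately. A smaller `max_chars` gives shorter, more uniform inputs at the cost of more joins between audio segments.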