DigitalPhonetics / IMS-Toucan

Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
Apache License 2.0
1.38k stars, 155 forks

Finetune_example_simple doesn't work!!! #191

Open AlexSteveChungAlvarez opened 5 days ago

AlexSteveChungAlvarez commented 5 days ago

In recipes/finetune_example_simple: when you use this script as a base for finetuning on only one language, it prints "EPOCH COMPLETE" to the terminal in an infinite loop but never actually trains. (Two screenshots attached.)

Finetuning usually takes me about 20 minutes on the same computer with a lot more data. This was a test on a very small, simple dataset.

Flux9665 commented 5 days ago

It does work, I just tried it. You probably have fewer samples in your dataset than are required to build one batch. Reduce the batch size until it is smaller than the number of samples in your dataset.
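To make the batch-size point concrete, here is a minimal sketch. It assumes the data loader only yields complete batches and discards a trailing partial one; that is an assumption about the training loop, not something confirmed in this thread. `full_batches_per_epoch` is a hypothetical helper and not part of IMS-Toucan.

```python
def full_batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Count the complete batches in one epoch, assuming a partial last batch is dropped."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_samples // batch_size


# Fewer samples than the batch size: no batch is ever produced,
# so an epoch would finish immediately without any training step.
print(full_batches_per_epoch(8, 12))   # 0

# 40 samples with a batch size of 12: three complete batches per epoch.
print(full_batches_per_epoch(40, 12))  # 3
```

Under that assumption, an epoch with zero batches would complete instantly, which would match a loop that only prints "EPOCH COMPLETE".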

AlexSteveChungAlvarez commented 4 days ago

There were 40 datapoints (English) and the batch size was the default, 12. I also tried it before on a larger dataset (Spanish) and it didn't work either. At that time I thought it was because of the device, since it didn't have many resources, but this time I tried on a good computer.

Flux9665 commented 4 days ago

I will try again; maybe I was on a different branch while testing. I will update you soon.

Flux9665 commented 4 days ago

I made a fresh clone and ran a test dataset with 500 datapoints. It works with no problems. I cannot reproduce your problem, so I need more details.

AlexSteveChungAlvarez commented 4 days ago

So I tried again and it runs now, but the output seems to degrade at some point. Step 3: (screenshot attached). Step 375: (screenshot attached).

AlexSteveChungAlvarez commented 4 days ago

Should I let it run the 5k steps and see what it gives?

Flux9665 commented 4 days ago

So there was no problem with the finetune_example_simple?

If you have only 40 datapoints, then 5000 steps is too many. The more datapoints you have and the cleaner they are, the longer you can run.
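As a rough illustration of why 5000 steps is a lot for 40 datapoints, this sketch estimates how many times each sample would be seen on average. It assumes every step consumes one batch of the default size 12 mentioned earlier in the thread; `passes_over_data` is a hypothetical helper, not part of IMS-Toucan.

```python
def passes_over_data(steps: int, num_samples: int, batch_size: int) -> float:
    """Average number of times each sample is seen, assuming one batch per step."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    return steps * batch_size / num_samples


# 40 datapoints, batch size 12, 5000 steps
print(passes_over_data(5000, 40, 12))   # 1500.0

# 500 datapoints with the same settings, for comparison
print(passes_over_data(5000, 500, 12))  # 120.0
```

With so many repetitions of the same few samples, stopping much earlier is a reasonable thing to try.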

AlexSteveChungAlvarez commented 4 days ago

I don't know what the problem may have been, but in previous runs the message was just printed in an infinite loop. Now it works. I will try stopping it before step 375, which is where the first degraded spectrogram appears.

AlexSteveChungAlvarez commented 4 days ago

The audio files are high quality, though, so I don't know why the model gets worse instead of getting clearer or staying clear.