riproskaie closed this issue 2 years ago
@ZDisket can you help him? It seems a 50-minute dataset is not enough for Tacotron?
Hello, I've been working on an Esperanto fork of TensorFlowTTS, but for the past few days I haven't had much success. My model, trained for 6k steps, spat out random garbled sounds (about 46 seconds long) from my audio input. I gave the model a bigger dataset, and it now talks nonsense for about 5 to 10 seconds for my 15-character input string. The output is much shorter, but I'm not sure I can call it an improvement.
My current Esperanto dataset is 50 minutes long. I made sure the cleaner processes my strings correctly, and there are no typos in my metadata. Should I keep training the model for another n thousand steps, or do I need more audio recordings?
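For anyone who wants to double-check a dataset's total length before deciding whether to record more, here is a minimal sketch. It assumes the recordings are uncompressed PCM `.wav` files under a single folder; the function name `total_wav_minutes` is made up for illustration and is not part of TensorFlowTTS.

```python
import wave
from pathlib import Path


def total_wav_minutes(wav_dir):
    """Sum the duration of every .wav file under wav_dir, in minutes.

    Assumption: files are PCM WAV, which is all the stdlib wave module reads.
    """
    total_seconds = 0.0
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration of one file = number of frames / frames per second
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 60.0
```

Calling `total_wav_minutes("wavs")`, where `wavs` is a placeholder folder name, returns the total number of minutes of audio found there.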
Here is my TensorBoard: