Closed neurlang closed 1 year ago
Hello, I've made the changes required to integrate TTS-Cube with the MelGAN vocoder.
Inference is really fast: with WaveNet it took 1.5 hours to vocode a few sentences, and now it takes literally seconds.
I trained the MelGAN vocoder for 325 epochs (106170 iterations). It took about a day, I think, and the output is already understandable.
The problem is that the encoder takes very long to train: I've been training for days and days, and it still says whatever it wants. I wish I had a faster GPU.
It speaks, just not exactly what is in the text file.
The datasets are Japanese and Russian. I want to build a common (multilingual) model in the future (just for fun).
Is there any interest from others in reproducing my experiment on your own dataset? I can share my code.
This no longer applies