AppalachianWine closed this 4 years ago
Sounds good! How long does inference take for one wav?
The problems above are probably caused by insufficient training on my side, but there are still some other issues. And I think MelGAN is extremely fast! It took 0.0079 s to synthesize 2 s of audio on my GPU.
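For anyone comparing vocoder speeds, the timing above can be turned into a real-time factor (RTF). This is only a small arithmetic sketch using the two numbers quoted in the comment; the function name `real_time_factor` is made up for illustration and is not part of this repository.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by audio duration (lower is faster)."""
    return synthesis_seconds / audio_seconds


# Numbers quoted in the comment: 0.0079 s to synthesize 2 s of audio.
rtf = real_time_factor(0.0079, 2.0)
speedup = 1.0 / rtf  # how many times faster than real time

print(f"RTF = {rtf:.5f}, about {speedup:.0f}x faster than real time")
```

An RTF below 1 means synthesis is faster than playback; here it is roughly 0.004.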
It's faster than G&L!
I see the same thing with LJSpeech. There is a sci-fi-like noise in the silent segments.
Yes, that's the kind of noise I mean, and it may not be possible to clean it up completely.
But I only get it with predicted spectrograms, meaning when using MelGAN with Tacotron. Maybe adding some noise during training would correct it.
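One way the noise idea could be tried is sketched below. It is not taken from this repository: the helper name `add_mel_noise`, the Gaussian noise model, and the default `noise_std=0.05` are all assumptions for illustration. The idea is to perturb the ground-truth mel spectrograms fed to the vocoder during training so they look a little more like imperfect Tacotron predictions.

```python
import numpy as np


def add_mel_noise(mel, noise_std=0.05, rng=None):
    """Return a copy of `mel` with zero-mean Gaussian noise added.

    `mel` is assumed to be a (n_mels, n_frames) array of log-mel values.
    `noise_std` is a hypothetical default, not a tuned value.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=noise_std, size=mel.shape)
    # Keep the input dtype so the result can be used like the original.
    return (mel + noise).astype(mel.dtype)


# Example: a dummy 80-band, 100-frame spectrogram of zeros.
rng = np.random.default_rng(0)
mel = np.zeros((80, 100), dtype=np.float32)
noisy = add_mel_noise(mel, noise_std=0.05, rng=rng)
```

The noise would be applied only to the training inputs, never at inference time. Whether this actually removes the artifact in silent segments is untested here.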
That's right. For now I'm not sure whether it's caused by the data or the model. Perhaps this is a solution.
@AppalachianWine Have you solved the issue above? I trained this MelGAN on the Biaobei dataset and got the same kind of noise in silent segments.
Hello, thank you very much for the good work! I ran experiments with Chinese datasets and found some noise in the gaps between speech. May I ask if this is the best result to expect? Here are the samples: syn.zip