Closed LEECHOONGHO closed 2 years ago
Hi, thanks for your interest.
To address the issue in this demo, you could try training for more steps with more GPUs (end-to-end TTS models typically require a large batch size for better convergence). Also, for better quality, sampling with more denoising steps is recommended.
Thank you for your reply.
I'm sharing my theta loss for reference.
Hello, thank you for sharing your code with the community. I'm trying to implement the FastDiff-TTS model with my own dataset.
My model pronounces words well after 120k training steps, but the sound quality is not good yet, so I have some questions about how FastDiff-TTS tends to behave during training.
The audio sample from my model is at the following URL: https://lime-honeycrisp-5e3.notion.site/Multi-speaker-FastDiff-TTS-5bae38d4562144059bf84651f603ff28
Thank you.