Moon-sung-woo closed this 2 months ago
In the paper they say they use a mel spectrogram, while vits1 uses a linear spectrogram. Am I missing something? 🧐
So is there a big difference in results between these two options?
I think it is better to use the linear spectrogram, but it's not a big deal.
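For anyone comparing the two inputs, here is a rough sketch of how they relate. It is not code from this repo: the function names and the values of `sample_rate`, `n_fft`, and `n_mels` are made up for illustration, and the triangular filterbank is a simplified textbook version. It only shows that a mel frame is a linear-spectrogram frame projected onto fewer, perceptually spaced bands, so the encoder sees a smaller input (for example, a 1024-point FFT gives 513 linear bins per frame).

```python
import cmath
import math


def linear_spectrogram_frame(frame):
    """Magnitude of the one-sided DFT of one Hann-windowed frame."""
    n = len(frame)
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Simplified triangular filters, equally spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_mels + 2 edge frequencies: each filter spans three neighbours.
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, center, hi = edges[m], edges[m + 1], edges[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (center - lo)
            falling = (hi - f) / (hi - center)
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def mel_spectrogram_frame(linear_frame, bank):
    """Project one linear-spectrogram frame onto the mel filters."""
    return [sum(w * v for w, v in zip(row, linear_frame)) for row in bank]


# Illustrative values only (assumptions, not this repo's config).
sample_rate = 22050
n_fft = 256
n_mels = 20

# One frame of a 1 kHz sine as a stand-in for real audio.
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n_fft)]

linear = linear_spectrogram_frame(frame)
mel = mel_spectrogram_frame(linear, mel_filterbank(n_mels, n_fft, sample_rate))

print(len(linear), len(mel))  # 129 20
```

The linear frame keeps every FFT bin, while the mel frame trades frequency detail for a smaller input. Whether that changes TTS quality would have to be checked by training with both settings.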
@p0p4k Oh! I'm sorry. I misread the paper. Thank you.
Hi, I'm Sungwoo Moon. First of all, thank you for sharing your code.
I'm looking at your code and I'm wondering why you use 'use_mel_posterior_encoder'. As I read the paper, it uses the same spectrogram as vits1, so I wonder whether this makes a difference in TTS performance.
Thank you.