Open bondio77 opened 3 weeks ago
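Sorry for the late reply. I think you should train from scratch, for the following reason: in the HiFi-GAN generator, the product of the upsampling scales has to equal the hop length, so switching the hop length from 300 to 256 means changing the upsampling layers, and the pre-trained weights for those layers would generally no longer fit.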
Example: hop length = 300 -> 5 * 5 * 4 * 3 https://github.com/kan-bayashi/ParallelWaveGAN/blob/86740373ec609cb9fb192d472d2aea125041491a/egs/vctk/voc1/conf/hifigan.v1.yaml#L40
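The constraint in the example can be checked with a few lines of Python. This is only a rough sketch: the helper name `scales_match_hop` is hypothetical (it is not part of ParallelWaveGAN), the factors 5, 5, 4, 3 are the ones from the example above, and 8, 8, 2, 2 is just one possible factorization of 256, not a recommended setting.

```python
from math import prod


def scales_match_hop(upsample_scales, hop_length):
    """Return True if the upsampling factors multiply to the hop length.

    Hypothetical helper for illustration only.
    """
    return prod(upsample_scales) == hop_length


# Factors from the VCTK example: 5 * 5 * 4 * 3 = 300.
vctk_scales = [5, 5, 4, 3]
print(scales_match_hop(vctk_scales, 300))  # True
print(scales_match_hop(vctk_scales, 256))  # False: 300 != 256

# One possible factorization of 256 (an assumption, not a tuned config).
candidate_scales = [8, 8, 2, 2]
print(scales_match_hop(candidate_scales, 256))  # True
```

Since the factors have to change for hop length 256, the upsampling layers change with them, which is why fine-tuning from the hop-length-300 checkpoint is unlikely to work as-is.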
Thank you so much for answering. I will try your recommendation. Thank you!
Hi, first of all, thank you so much for providing pre-trained models from so many experiments. I would like to fine-tune the pre-trained VCTK model on my own multi-speaker dataset. The VCTK config file uses fft_size = 2048, hop_length = 300, win_length = 1024, but the TTS model I trained uses fft_size = 1024, hop_length = 256, win_length = 1024. Will fine-tuning work if I change the config file to 1024, 256, 1024 to match my TTS model? The sampling rate is 24000 in both cases. Thank you!