Closed by G-Thor 2 weeks ago
First, I apologize for not sharing the training process and checkpoint for bigV_16k. At present it is difficult for us to make the vocoder publicly available. Instead, you can download a pre-trained HiFi-GAN checkpoint and adjust the parameters accordingly, which should let you proceed without much trouble.
We plan to either release the checkpoint for bigV_16k in the future or modify the code to support training with HiFi-GAN directly.
Thank you for your understanding.
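As a rough illustration of what "adjusting the parameters" could involve, the sketch below compares the audio settings of a vocoder config against the settings a training run expects and lists every mismatch. The key names follow the official HiFi-GAN `config_v1.json` as best recalled (check them against the file you actually download); the 16 kHz target is only inferred from the name bigV_16k, and `find_mismatches` is a hypothetical helper, not part of EmoSphere-TTS or HiFi-GAN.

```python
# Audio-related keys as named in the official HiFi-GAN config_v1.json
# (assumption: verify against the config you actually download).
AUDIO_KEYS = ("sampling_rate", "num_mels", "n_fft", "hop_size", "win_size", "fmin", "fmax")


def find_mismatches(vocoder_cfg, target_cfg, keys=AUDIO_KEYS):
    """Return {key: (vocoder_value, target_value)} for every key whose values differ."""
    return {
        k: (vocoder_cfg.get(k), target_cfg.get(k))
        for k in keys
        if vocoder_cfg.get(k) != target_cfg.get(k)
    }


if __name__ == "__main__":
    # Values mirroring the published 22.05 kHz HiFi-GAN V1 config (from memory).
    hifigan_cfg = {
        "sampling_rate": 22050,
        "num_mels": 80,
        "n_fft": 1024,
        "hop_size": 256,
        "win_size": 1024,
        "fmin": 0,
        "fmax": 8000,
    }
    # Hypothetical target: identical except for a 16 kHz sampling rate.
    target_cfg = dict(hifigan_cfg, sampling_rate=16000)

    for key, (have, want) in find_mismatches(hifigan_cfg, target_cfg).items():
        print(f"{key}: vocoder has {have}, training config expects {want}")
```

Any key this reports would need to be reconciled, either in the training config or by fine-tuning the vocoder, before the mel-spectrograms produced during training match what the vocoder was trained on.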
Thanks for your reply! I'll try it out with HiFi-GAN instead.
I am attempting to train a model using your code.
In the base config file used in model training (e.g. when calling `sh train_run.sh`), a pretrained vocoder model seems to be required: https://github.com/Choddeok/EmoSphere-TTS/blob/88c237bc039e3d2e08476b9961bbf3fb94db89ec/egs/egs_bases/tts/base.yaml#L44-L45 However, the README doesn't mention this, and I can't find any explanation of where this model should come from. I could not find any reference to a 16 kHz pretrained model in the official BigVGAN repo. Could you shed some light on where this pretrained model can be found, or how it was trained in the first place? Thanks in advance!