Closed — unilight closed this issue 4 years ago
It would be nice to integrate PWG/MelGAN. Actually, I have already discussed it with @kan-bayashi. He said he can create a recipe and a pre-trained model after releasing the vcc2020 dataset.
For the structure, I think the following is nice:

- `egs/vaevc/template/run.sh`
- `egs/vaevc/<recipe>/local/download_pretrained_neuralvocoder.sh` to download pre-trained models
- `crank/bin/generate_wav_{pwg,melgan}.py` to generate wav files with a pre-trained model and the generated h5 in stage 5

I see. The structure looks okay to me. How about a recipe for training the neural vocoders? Also, if it's okay, I can implement the recipes first and have you and kan-bayashi revise them.
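At its core, a `generate_wav_*` script loads a pre-trained vocoder (via the `parallel_wavegan` package in the real script), runs inference on the stage-5 h5 features, and writes the output array to disk. A minimal, self-contained sketch of just that final array-to-wav step, using a sine tone as a stand-in for vocoder output and an assumed 24 kHz sample rate:

```python
import wave
import numpy as np

def write_wav(path, waveform, sample_rate=24000):
    """Write a float waveform in [-1, 1] to a 16-bit PCM mono wav file."""
    pcm = np.clip(waveform, -1.0, 1.0)
    pcm = (pcm * 32767.0).astype("<i2")  # little-endian int16 samples
    with wave.open(path, "wb") as f:
        f.setnchannels(1)      # mono output
        f.setsampwidth(2)      # 2 bytes per sample (16-bit PCM)
        f.setframerate(sample_rate)
        f.writeframes(pcm.tobytes())

# A 100 ms, 440 Hz sine tone standing in for the vocoder's output array.
sr = 24000
t = np.arange(int(0.1 * sr)) / sr
write_wav("demo.wav", 0.5 * np.sin(2 * np.pi * 440 * t), sr)
```

In the actual script, the array passed to `write_wav` would come from the neural vocoder's forward pass on the generated mel features.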
Training a neural vocoder is out of the scope of this repository. You can either contribute to kan-bayashi/ParallelWaveGAN or train and upload a pre-trained model anywhere. I think he will help you contribute a PWG recipe. Of course, I can revise the source code for the crank repo.
I see. I will train vocoders in kan-bayashi/ParallelWaveGAN and just provide pretrained model links in this repo. I will work on it next.
Let me know when you have trained the neural vocoder.
I have trained a PWG for VCC2018. I will send a PR later.
Let's discuss neural vocoder support. Now we can simply `pip install -U parallel_wavegan` and use `voc_expdir` to load the pretrained model. In addition, with PWG, kan-bayashi has also packed the training code in the package, so we can provide recipes for users to train their own vocoders if they want. One example design can be like `egs/pwg/vcc2018`.
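Since `voc_expdir` points at a ParallelWaveGAN experiment directory, the crank side mainly needs to locate the newest checkpoint and its config inside it. A hedged sketch, assuming the package's usual `checkpoint-<N>steps.pkl` / `config.yml` file naming; the helper name `find_vocoder_files` is ours for illustration, not part of the package:

```python
import re
from pathlib import Path

def find_vocoder_files(voc_expdir):
    """Return (newest checkpoint, config path) in a PWG experiment dir.

    Assumes kan-bayashi/ParallelWaveGAN conventions: checkpoints named
    checkpoint-<N>steps.pkl with a config.yml alongside them.
    """
    d = Path(voc_expdir)
    ckpts = sorted(
        d.glob("checkpoint-*steps.pkl"),
        # Sort numerically by step count, not lexicographically.
        key=lambda p: int(re.search(r"(\d+)steps", p.name).group(1)),
    )
    if not ckpts:
        raise FileNotFoundError(f"no PWG checkpoint found in {d}")
    return ckpts[-1], d / "config.yml"
```

The resulting paths would then be handed to the `parallel_wavegan` loading utilities to build the vocoder for stage-5 waveform generation.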