Official Implement of MSMC-TTS System of papers
The latest MSMC-TTS (MSMC-TTS-v2) is optimized with a MSMC-VQ-GAN based autoencoder combining MSMC-VQ-VAE and HifiGAN. The multi-stage predictor is still applied as the acoustic model to predict MSMCRs for TTS synthesis.
[2024.04.10] (In Progress) The implementation of QS-TTS is available at examples/csmsc/configs
[2022.10.20] We release the latest version of MSMC-TTS (MSMC-TTS-v2) based on MSMC-VQ-GAN. Please refer to our latest paper "Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Representation"
[2022.10.18] We will release the code of all versions of MSMC-TTS in this repo. And anyone interested in this work is welcome to join us to explore more useful speech representations for speech synthesis.
[2022.9.22] "A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS" is published at INTERSPEECH 2022.
# Install
pip -r requirements.txt
# Train (Take the example of CSMSC, please refer to the example of CSMSC to prepare your training data)
python train.py -c examples/csmsc/configs/msmc_vq_gan.yaml
python train.py -c examples/csmsc/configs/msmc_vq_gan_am.yaml
# Multi-GPU Training
python train_dist.py -c examples/csmsc/configs/msmc_vq_gan.yaml
python train_dist.py -c examples/csmsc/configs/msmc_vq_gan_am.yaml
# Test -- Analysis-Synthesis
python infer.py -c examples/csmsc/configs/msmc_vq_gan.yaml -m examples/csmsc/checkpoints/msmc_vq_gan/model_800000 -t examples/csmsc/data/test_ae.yaml -o analysis_synthesis
# Test -- TTS-Synthesis
python infer.py -c examples/csmsc/configs/msmc_vq_gan_am.yaml -m examples/csmsc/checkpoints/msmc_vq_gan_am/model_200000 -t examples/csmsc/data/test_tts.yaml -o tts
Help you better train your models!
@inproceedings{guo2022msmc,
title={A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS},
author={Guo, Haohan and Xie, Fenglong and Soong, Frank K and Wu, Xixin and Meng, Helen},
booktitle={Proc. INTERSPEECH},
year={2022}
}