rishikksh20 / AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice
Apache License 2.0

AdaSpeech: Adaptive Text to Speech for Custom Voice [WIP]

Unofficial PyTorch implementation of AdaSpeech.

Note:

Citations

@misc{chen2021adaspeech,
      title={AdaSpeech: Adaptive Text to Speech for Custom Voice}, 
      author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
      year={2021},
      eprint={2103.00993},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

Requirements:

All code is written in Python 3.6.2.

For preprocessing:

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining preprocessing, run the following command:

python nvidia_preprocessing.py -d path_of_wavs

To find the min and max of F0 and energy:

python compute_statistics.py

Update the following values in hparams.py with the min and max of F0 and energy reported by the script:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
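
As a rough illustration of what these four values represent, the sketch below takes the global min and max over frame-level F0 and energy values from several utterances. It is not the code in compute_statistics.py: the function name global_min_max and the toy numbers are hypothetical, and the real script may filter frames (for example unvoiced ones) differently.

```python
from itertools import chain


def global_min_max(per_utterance_values):
    """Return (min, max) over all frames of all utterances.

    per_utterance_values: iterable of sequences, one sequence of
    frame-level values (F0 or energy) per utterance.
    """
    flat = list(chain.from_iterable(per_utterance_values))
    if not flat:
        raise ValueError("no frames to summarize")
    return min(flat), max(flat)


# Hypothetical toy data: frame-level F0 (Hz) and energy for two utterances.
f0_tracks = [[110.0, 220.5, 180.0], [95.0, 300.0]]
energy_tracks = [[0.02, 1.5, 0.7], [0.1, 2.25]]

p_min, p_max = global_min_max(f0_tracks)
e_min, e_max = global_min_max(energy_tracks)

# Print in the same "name = value" form used in hparams.py above.
for name, value in [("p_min", p_min), ("p_max", p_max),
                    ("e_min", e_min), ("e_max", e_max)]:
    print(f"{name} = {value}")
```

The statistics are taken over the whole dataset rather than per utterance, so that a single range covers every training sample.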

For training

python train_fastspeech.py --outdir etc -c configs/default.yaml -n "name"

Note