wqt2019 / tacotron-2_melgan

Tacotron-2 (TensorFlow) + MelGAN (PyTorch) Chinese TTS
MIT License

Tacotron-2 (TensorFlow) + MelGAN (PyTorch) Chinese TTS:

MelGAN is much faster than other vocoders, and its quality is acceptable. Changes in this repo: re-implemented the split_func in Tacotron-2, which TensorFlow Serving does not support; re-implemented nn.ReflectionPad1d, which TensorRT does not support; changed MelGAN's input range from [-12, 2] to [-4, 4] to match Tacotron-2's output.
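For readers unfamiliar with reflection padding, here is a minimal, framework-free sketch of the index arithmetic behind nn.ReflectionPad1d. It is not the replacement used in this repo (that code is not shown here); the function name reflect_pad_1d is hypothetical, and it works on a plain Python list rather than a tensor. In PyTorch the same idea would presumably be expressed with slicing, flipping and concatenation, which is an assumption about how a TensorRT-friendly version could look.

```python
def reflect_pad_1d(seq, pad):
    """Pad a 1-D sequence on both sides by mirroring it around its end points.

    The edge sample itself is not repeated, which matches the behaviour of
    reflection padding: [1, 2, 3, 4] with pad=2 becomes [3, 2, 1, 2, 3, 4, 3, 2].
    """
    seq = list(seq)
    if pad < 0 or pad >= len(seq):
        # Reflection needs at least pad + 1 samples to mirror from.
        raise ValueError("pad must satisfy 0 <= pad < len(seq)")
    if pad == 0:
        return seq
    # Samples 1..pad, reversed, go in front of the sequence.
    left = seq[1:pad + 1][::-1]
    # The pad samples just before the last one, reversed, go at the end.
    right = seq[-pad - 1:-1][::-1]
    return left + seq + right


if __name__ == "__main__":
    print(reflect_pad_1d([1, 2, 3, 4], 2))  # [3, 2, 1, 2, 3, 4, 3, 2]
```

Because the padding is built only from slices and a concatenation, it avoids depending on a dedicated padding operator in the export target.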

Python 3.7, Biaobei Chinese dataset. Tacotron-2 supports training on Chinese pinyin or on Chinese phones + rhythm (the default is phones + rhythm); to change this, edit symbols.py and text.py.

pinyin:
000001,ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1
000002,jia2 yu3 cun1 yan2 bie2 zai4 yong1 bao4 wo3
000003,bao2 ma3 pei4 gua4 bo3 luo2 an1 diao1 chan2 yuan4 zhen3 dong3 weng1 ta4

phone + rhythm(dictionary.txt):
000001,k a2 er2 p u3 #2 p ei2 uai4 s uen1 #1 uan2 h ua2 t i1 #4 。
000002,j ia2 v3 c uen1 ian2 #2 b ie2 z ai4 #1 iong1 b ao4 uo3 #4 。
000003,b ao2 m a3 #1 p ei4 g ua4 #1 b o3 l uo2 an1 #3 , d iao1 ch an2 #1 van4 zh en3 #2 d ong3 ueng1 t a4 #4 。
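The two transcript formats above share the same layout: an utterance ID, a comma, then space-separated tokens. A minimal sketch of reading such lines follows. The helper names parse_line and split_rhythm are hypothetical and are not taken from text.py, and treating every token that starts with "#" as a rhythm mark is an assumption based on the examples.

```python
def parse_line(line):
    """Split 'ID,token token ...' into (utterance_id, list_of_tokens)."""
    utt_id, sep, text = line.strip().partition(",")
    if not sep:
        raise ValueError("expected 'ID,tokens' but found no comma")
    return utt_id, text.split()


def split_rhythm(tokens):
    """Separate rhythm marks (#1..#4 in the examples) from all other symbols.

    Punctuation such as '。' stays with the non-rhythm symbols.
    """
    symbols = [t for t in tokens if not t.startswith("#")]
    marks = [t for t in tokens if t.startswith("#")]
    return symbols, marks


if __name__ == "__main__":
    line = "000001,k a2 er2 p u3 #2 p ei2 uai4 s uen1 #1 uan2 h ua2 t i1 #4 。"
    utt_id, tokens = parse_line(line)
    symbols, marks = split_rhythm(tokens)
    print(utt_id)  # 000001
    print(marks)   # ['#2', '#1', '#4']
```

The same parse_line call works on the pinyin format, where each token is a whole syllable with its tone digit and there are no rhythm marks.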

tacotron2

melgan

Training and Inference:

gta:

real mel:

Also, run inference_melgan.py if you are only interested in the vocoder.

reference:

https://github.com/Rayhane-mamah/Tacotron-2
https://github.com/seungwonpark/melgan