TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Where are some of the procedures described in the FastSpeech2 paper? #770

Closed: binbinxue closed this issue 2 years ago

binbinxue commented 2 years ago

I read the paper and have been going through the code here. In the paper, they use the Montreal Forced Aligner (MFA) to extract the phoneme durations, but I did not see this alignment being performed, or the durations being extracted for training, anywhere in the code. The paper also describes techniques such as the continuous wavelet transform (CWT) for extracting pitch. Is that implemented anywhere? The preprocessing code here simply uses pyworld to extract the pitch.
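To make the duration step concrete, here is a minimal sketch of how forced-alignment output could be turned into per-phoneme frame counts. It is not taken from this repository. The function name `intervals_to_frame_durations`, the example intervals, and the `sample_rate=22050` / `hop_size=256` values are all assumptions for illustration. The intervals stand in for what one would read from an aligner's output file.

```python
def intervals_to_frame_durations(intervals, sample_rate=22050, hop_size=256):
    """Convert (phoneme, start_sec, end_sec) intervals to per-phoneme frame counts.

    sample_rate and hop_size are assumed values; they must match whatever
    settings are used to compute the mel spectrogram.
    """
    frames_per_second = sample_rate / hop_size
    durations = []
    for phoneme, start, end in intervals:
        # Round each boundary (not each length) to the frame grid, so that
        # contiguous intervals produce durations summing to the total frame count.
        start_frame = int(round(start * frames_per_second))
        end_frame = int(round(end * frames_per_second))
        durations.append((phoneme, end_frame - start_frame))
    return durations


# Hypothetical alignment for a short utterance (times in seconds).
example = [
    ("HH", 0.00, 0.08),
    ("AH", 0.08, 0.20),
    ("L", 0.20, 0.29),
    ("OW", 0.29, 0.50),
]
durations = intervals_to_frame_durations(example)
print(durations)
```

Rounding the boundaries rather than the individual lengths keeps the total number of frames consistent with the utterance length, which matters when the durations are used to expand phoneme-level features to frame level.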