ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
MIT License

How do I align my data #208

Open hkeliang opened 10 months ago

hkeliang commented 10 months ago

How do I generate alignments for my own data?

melodyze-ai commented 9 months ago

https://montreal-forced-aligner.readthedocs.io/en/latest/

Lakhjeet1082 commented 1 week ago

Use MFA or phonemizer to generate a pronunciation dictionary. Then validate your corpus and check for out-of-vocabulary (OOV) words with: mfa validate ~/mfa_data/my_corpus ~/mfa_data/my_dictionary.txt

If no pretrained acoustic model is available for your language, train one with: mfa train ~/mfa_data/my_corpus ~/mfa_data/my_dictionary.txt ~/mfa_data/new_acoustic_model.zip

Then generate the TextGrids with: mfa align ~/mfa_data/my_corpus english_us_arpa english_us_arpa ~/mfa_data/my_corpus_aligned

In that command the first english_us_arpa is the dictionary and the second is the acoustic model. If you built your own dictionary or trained your own model, pass ~/mfa_data/my_dictionary.txt and ~/mfa_data/new_acoustic_model.zip in those positions instead.

After that you are ready to use FastSpeech2. For more information, see the documentation: https://montreal-forced-aligner.readthedocs.io/en/latest/first_steps/index.html
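If you want to script those steps, here is a minimal sketch in Python that only assembles the same three commands and, by default, prints them instead of running them. The function names build_mfa_commands and run_all are made up for this example and are not part of MFA or FastSpeech2. The paths are the placeholder ones from the comment above, and it assumes the mfa executable is on your PATH if you turn the dry run off.

```python
import os
import shlex
import subprocess


def build_mfa_commands(corpus_dir, dictionary, acoustic_model, output_dir, train=False):
    """Assemble the validate / (optional) train / align commands as argument lists.

    `dictionary` and `acoustic_model` can be file paths or the name of a
    pretrained MFA model such as "english_us_arpa".
    """
    # subprocess does not expand "~" on its own, so do it here.
    # Plain model names (no "~") pass through unchanged.
    corpus_dir, dictionary, acoustic_model, output_dir = (
        os.path.expanduser(p)
        for p in (corpus_dir, dictionary, acoustic_model, output_dir)
    )

    commands = [["mfa", "validate", corpus_dir, dictionary]]
    if train:
        # Only needed when no pretrained acoustic model exists for the language.
        commands.append(["mfa", "train", corpus_dir, dictionary, acoustic_model])
    commands.append(["mfa", "align", corpus_dir, dictionary, acoustic_model, output_dir])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths taken from the comment above; replace with your own.
    run_all(
        build_mfa_commands(
            "~/mfa_data/my_corpus",
            "~/mfa_data/my_dictionary.txt",
            "~/mfa_data/new_acoustic_model.zip",
            "~/mfa_data/my_corpus_aligned",
            train=True,
        )
    )
```

With train=False and "english_us_arpa" passed for both the dictionary and the acoustic model, the list reduces to the validate and align commands for the pretrained English model. Check the printed commands first, then call run_all with dry_run=False to run them.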