NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

[FastPitch] train non-english #975

Open rave974 opened 3 years ago

rave974 commented 3 years ago

For training on a non-English language, is a non-English pretrained Tacotron model required?

alancucki commented 3 years ago

Hi @rave974, sorry for the late reply.

For the best results with FastPitch v1.0, you'll need a Tacotron 2 model for every speaker.

With FastPitch v1.1 you no longer have the dependency on Tacotron 2. Keep in mind that by default it uses phoneme input and ships with only an English pronunciation dictionary.
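To make "phoneme input with a pronunciation dictionary" concrete, here is a minimal sketch of the general idea, not the FastPitch code itself. It assumes a CMUdict-style text format (`WORD  PH1 PH2 ...`) and the Tacotron-style convention of wrapping phoneme sequences in curly braces. The function names `parse_pronunciation_dict` and `to_phoneme_input` and the sample entries are hypothetical, chosen only for illustration.

```python
import re

# Toy CMUdict-style entries (hypothetical sample data): WORD  PH1 PH2 ...
SAMPLE_DICT = """\
;;; comment lines start with three semicolons
HELLO  HH AH0 L OW1
WORLD  W ER1 L D
"""


def parse_pronunciation_dict(text):
    """Parse 'WORD  PH1 PH2 ...' lines into a {WORD: [phonemes]} mapping."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        # Keep the first pronunciation if a word is listed more than once.
        entries.setdefault(word.upper(), phonemes)
    return entries


def to_phoneme_input(sentence, entries):
    """Replace known words with '{PH1 PH2 ...}'; unknown words stay as graphemes."""
    def convert(match):
        word = match.group(0)
        phonemes = entries.get(word.upper())
        return "{" + " ".join(phonemes) + "}" if phonemes else word

    # [^\W\d_]+ matches runs of Unicode letters, so non-English words are handled too.
    return re.sub(r"[^\W\d_]+", convert, sentence)


entries = parse_pronunciation_dict(SAMPLE_DICT)
print(to_phoneme_input("Hello, world of żółw!", entries))
# prints: {HH AH0 L OW1}, {W ER1 L D} of żółw!
```

Words missing from the dictionary fall through unchanged, which is why a transcript in another language would mostly reach the model as plain characters unless a dictionary for that language is supplied. The sketch splits on any non-letter, so contractions such as "don't" would need extra handling.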

vonLeebpl commented 3 years ago

With FastPitch v1.1 you no longer have the dependency on Tacotron 2. Keep in mind that by default it uses phoneme input and ships with only an English pronunciation dictionary.

Does this mean the dataset transcripts need to be written in ARPAbet? Or, to train it for a Polish voice, for instance, do I need to add a Polish phoneme dictionary to the FastPitch 1.1 scripts (if that is possible), like the ones included in the common/text subdirectory?

darkalfx commented 2 years ago

Would also like to know about this in FastPitch 1.1