Open RaymondTsao opened 3 years ago
Hi,
We are facing the same issue with Arabic. Is there an update on this?
Thanks, Muhammad Ajmal Siddiqui
Hi,
We are facing the same issue with Vietnamese.
Thanks, Thuy Tran
Describe the bug
I have already trained an English and Chinese bilingual Tacotron model on my own data, using the following source: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2
Inference output from that model is fine. My inference phrases look like this: ji2-jiang1 wei4-nin2 bo1-fang4 pau yi4-shu4 wu3-dao4 pau gu3-ba1 dang1-dai4 wu3-dao4-tuan2 pau DH-AX0 S-AE1-K-R-AO0-L D-AE1-N-S and I can understand the speech in the output file.
To speed up inference, I followed the steps in the trtis_cpp folder to export my Tacotron model to ONNX and TensorRT: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp The model converted successfully.
However, I can't understand the inference output from the converted model; it sounds like an alien language. :( I found that I can add the possible syllables in model-config/tacotron2waveglow/mapping.txt, but after I added all the syllables and rebuilt, the output audio still sounds like nonsense.
So, is it possible to support non-English languages such as Chinese in trtis_cpp? Which files would I need to modify to do this?
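One way to narrow this down is to check whether every symbol in an inference phrase actually has an entry in the mapping. The sketch below is only an illustration under stated assumptions: it assumes the mapping is plain text with one whitespace-separated symbol and integer ID per line, and that symbols are the units between "-" and spaces in the phrase. Neither assumption is confirmed by the trtis_cpp documentation, and the function names are hypothetical.

```python
def phrase_symbols(phrase):
    """Split a phrase such as 'wu3-dao4 pau' into its individual symbols."""
    symbols = []
    for word in phrase.split():
        symbols.extend(word.split("-"))
    return symbols


def parse_mapping(text):
    """Parse mapping text into a dict of symbol -> ID.

    ASSUMPTION: one '<symbol> <id>' pair per line; the real file format
    used by trtis_cpp may differ.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0]] = int(parts[1])
    return mapping


def missing_symbols(phrase, mapping):
    """Return the symbols in the phrase that have no mapping entry, in order."""
    missing = []
    for symbol in phrase_symbols(phrase):
        if symbol not in mapping and symbol not in missing:
            missing.append(symbol)
    return missing


if __name__ == "__main__":
    mapping = parse_mapping("wu3 0\ndao4 1\npau 2\n")
    print(missing_symbols("wu3-dao4 pau gu3-ba1", mapping))
```

If nothing is reported missing, it may also be worth comparing the IDs in the mapping with the symbol table used during training, since an ID mismatch could produce unintelligible audio as well.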
@RaymondTsao
I would like to train an English and Chinese bilingual TTS model. What format are your datasets in?
Part of your inference phrase is wu3-dao4. Does that mean 舞蹈 in Chinese?
If so, how do I convert all the text into this form?
And is DH-AX0 S-AE1-K-R-AO0-L D-AE1-N-S the English phoneme sequence?
@R7788380
Hi, I use my own Chinese and English parser and dictionary for this. You can convert Mandarin text to symbols like these with the Python module named pinyin, but its output will not include the parser information.
Yes, DH-AX0 S-AE1-K-R-AO0-L D-AE1-N-S is the English phoneme sequence; you can get it from CMUdict.
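As a rough illustration of the phrase format discussed here, the sketch below builds a phrase from a mix of Mandarin words, English words, and pauses. The two small lexicons are hypothetical stand-ins for a real pinyin dictionary and CMUdict, with entries copied from the example phrase in this thread; the function names and the (language, text) token format are assumptions, not the author's actual parser.

```python
# Toy lexicons: hypothetical stand-ins for a real pinyin dictionary and
# CMUdict. Transcriptions follow the example phrase in this thread.
PINYIN = {"舞": "wu3", "蹈": "dao4"}
ARPABET = {"the": ["DH", "AX0"], "dance": ["D", "AE1", "N", "S"]}


def mandarin_word(word):
    """Join the tone-numbered pinyin of each character with '-'."""
    return "-".join(PINYIN[ch] for ch in word)


def english_word(word):
    """Join the phonemes of an English word with '-'."""
    return "-".join(ARPABET[word.lower()])


def to_phrase(tokens):
    """Build a phrase from (lang, text) pairs; lang is 'zh', 'en', or 'pau'."""
    parts = []
    for lang, text in tokens:
        if lang == "pau":
            parts.append("pau")
        elif lang == "zh":
            parts.append(mandarin_word(text))
        elif lang == "en":
            parts.append(english_word(text))
        else:
            raise ValueError(f"unknown language tag: {lang}")
    return " ".join(parts)


if __name__ == "__main__":
    tokens = [("zh", "舞蹈"), ("pau", ""), ("en", "the"), ("en", "dance")]
    print(to_phrase(tokens))
```

A real pipeline would still need word segmentation and handling of characters with multiple readings, which this lookup does not attempt.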
Thank you very much for your reply! I will try it.