anonymous-pits / pits

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
https://anonymous-pits.github.io/pits/
MIT License

Phonemes for training #27

Open JoanisTriandafilidi opened 11 months ago

JoanisTriandafilidi commented 11 months ago

Hello! If I understand correctly, the g2p tool that produces the symbol set specified in your repository is not available, right? You say that a different phoneme set can be used for training, which makes sense. What is not clear is how the pretrained model can be used to train other speakers.

I planned to do the following (maybe this is a stupid idea, please correct me if this is the case):

I have three speakers that I want to train. I took the pretrained model and renamed the last three speakers in the config (p374, p376, s5) to my own. I then processed the training data with g2p_en to get phonemic text. However, the g2p_en symbol set turned out to be very different from the one in the repository. So if I change the symbol set in text/symbols.py to one that fits my data, I can no longer use your pretrained model, since it was trained on different phonemes, and my only option is to train the model from scratch. Did I understand correctly? Or is there some other way?

junjun3518 commented 7 months ago

If you want to use g2p_en, you just need to remove the stress markers from its output, e.g. AH2 -> AH. Once stress is removed, the repository's phoneme set is identical to that of g2p_en or cmudict.
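
A minimal sketch of the stress removal described above, assuming the phonemes arrive as a list of ARPAbet strings in the form g2p_en returns (vowels with a trailing 0, 1 or 2, plus spaces and punctuation). The helper name `strip_stress` and the sample input are illustrative and not taken from the repository:

```python
import re

# ARPAbet vowels from g2p_en / cmudict end in a stress digit: 0, 1 or 2.
# Consonants, spaces and punctuation do not match and pass through unchanged.
_STRESSED = re.compile(r"^([A-Z]+)[012]$")


def strip_stress(phonemes):
    """Return a new list with the trailing stress digit removed from each phoneme."""
    return [_STRESSED.sub(r"\1", p) for p in phonemes]


# Hand-written sample in g2p_en's output format (assumed, not generated here).
# With g2p_en installed, a list like this would come from G2p()("hello world!").
sample = ["HH", "AH0", "L", "OW1", " ", "W", "ER1", "L", "D", "!"]
print(strip_stress(sample))
# ['HH', 'AH', 'L', 'OW', ' ', 'W', 'ER', 'L', 'D', '!']
```

The stripped symbols would still have to be checked against the entries in text/symbols.py before fine-tuning, since how the repository represents spaces and punctuation is not stated in this thread.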