yl4579 / PL-BERT

Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
MIT License

2 Questions #32

Closed francqz31 closed 9 months ago

francqz31 commented 9 months ago

1- I wanted to add PL-BERT to https://github.com/huawei-noah/Speech-Backbones/tree/main/Grad-TTS. How can I do that? Which files should I modify?

2- I also wanted to ask if there is a better dataset than Wikipedia available anywhere. Thanks in advance!

yl4579 commented 9 months ago
  1. You will need to replace the text encoder at https://github.com/huawei-noah/Speech-Backbones/blob/main/Grad-TTS/model/text_encoder.py#L305 with the PL-BERT model.
  2. Since it is trained on text, any text corpus closely related to your downstream TTS datasets would be suitable. Because most publicly available TTS corpora are audiobook readings, Wikipedia is the best publicly available corpus for training, but you can definitely train on another text corpus if you know what you want for your downstream TTS tasks.
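
The swap in point 1 could be sketched roughly as below. This is only an illustration, not code from either repository: `PLBERTEncoder` is a hypothetical stand-in (a tiny `nn.TransformerEncoder` over phoneme IDs) for the actual pretrained PL-BERT, whose contextual phoneme embeddings would then feed Grad-TTS's prior/duration projections in place of the original `TextEncoder` output. The dimensions are made up for the example.

```python
import torch
import torch.nn as nn

class PLBERTEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained PL-BERT phoneme encoder.

    In practice you would load the real pretrained PL-BERT checkpoint and
    return its hidden states; a small TransformerEncoder is used here only
    so the sketch is self-contained and runnable.
    """
    def __init__(self, n_phonemes=178, hidden=192, n_layers=2, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(n_phonemes, hidden)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, phoneme_ids):
        # (batch, seq) -> (batch, seq, hidden): contextual phoneme
        # embeddings, used where Grad-TTS's TextEncoder output went before
        return self.encoder(self.embed(phoneme_ids))

# Sketch of the integration point: Grad-TTS projects the encoder output to
# the mel prior (mu) and durations; those projections would now consume the
# PL-BERT hidden states instead.
encoder = PLBERTEncoder()
phonemes = torch.randint(0, 178, (2, 10))   # dummy batch of phoneme IDs
hidden_states = encoder(phonemes)           # shape (2, 10, 192)
```

Whether you fine-tune the PL-BERT weights jointly with Grad-TTS or freeze them is a design choice; the PL-BERT paper fine-tunes the encoder with the downstream TTS model.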