Since it is trained on text, any text corpus that is closely related to your downstream TTS datasets would be suitable. Since most publicly available TTS corpora are audiobook readings, Wikipedia is the best publicly available corpus for training, but you can definitely train on other text corpora if you know what you want for your downstream TTS tasks.
1- I want to add PL-BERT to Grad-TTS (https://github.com/huawei-noah/Speech-Backbones/tree/main/Grad-TTS). How can I do that? Which files should I modify?
2- Is there a better dataset than Wikipedia available anywhere? Thanks in advance!