huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Text2Speech classes #16159

Closed by avacaondata 2 years ago

avacaondata commented 2 years ago

Hey, it would be very cool if there were ready-made classes for Text2Speech tasks. We have many classes for Speech2Text, but none for the opposite task, even though Text2Speech models are starting to appear on the Hub: https://huggingface.co/facebook/tts_transformer-es-css10

It would be great if instead of using fairseq for loading those models, we could load them and fine-tune them further with Transformers.

LysandreJik commented 2 years ago

Pinging @anton-l for knowledge :)

anton-l commented 2 years ago

Hi @alexvaca0, thanks for your interest! We're currently working on implementing FastSpeech2 natively: https://github.com/huggingface/transformers/pull/15773, and we're looking into FastPitch as well: https://github.com/huggingface/transformers/issues/16349

Feel free to subscribe to those PRs/issues to follow the progress :)