Open BirgerMoell opened 3 years ago
@patrickvonplaten This is only a suggestion, but since several models are available, I think the best first step would be to get a Text-To-Speech model working.
I explored Real-Time-Voice-Cloning the other day and noticed it had several issues (the project is no longer maintained), so it might be good to look into other speech models.
Here are some examples of repos that might be useful.
Hey @BirgerMoell - thanks a lot for the links, I will take a look soon :-)
@BirgerMoell Thank you for sharing these resources. I would also like to add TransformerTTS to the list, since it makes more sense to me to have transformers involved :P
I'd love to see this addition to huggingface though
I think it'd make a lot of sense to add FastSpeech2 to the library - happy to help with a PR if someone is interested. See: https://github.com/huggingface/transformers/pull/11135
Also, we started integrating https://github.com/as-ideas/TransformerTTS into the model hub so that people have easier access to TensorFlowTTS models :-)
https://huggingface.co/tensorspeech/tts-fastspeech2-baker-ch
Hello. To avoid duplication, I just wanted to check whether anyone is working on this and whether it is still relevant. If someone is still needed, I would be interested in taking this up.
🌟 New model addition
Model description
Generalized End-To-End Loss for Speaker Verification is used in Real-Time Voice Cloning, a way to generate a Text-To-Speech model adapted to a certain speaker from a short audio sample. The model implements the following paper: https://arxiv.org/pdf/1806.04558.pdf. The code is available on GitHub:
https://github.com/CorentinJ/Real-Time-Voice-Cloning
Open source status
The model can be run through Colaboratory. Here is an example of a generated voice: https://soundcloud.com/birger-mo-ll/generated-voice
encoder.load_model(project_name / Path("encoder/saved_models/pretrained.pt"))
synthesizer = Synthesizer(project_name / Path("synthesizer/saved_models/logs-pretrained/taco_pretrained"))
vocoder.load_model(project_name / Path("vocoder/saved_models/pretrained/pretrained.pt"))
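The loading calls above depend on three pretrained checkpoints being present under the project directory. As a rough sketch, the paths could be resolved and checked before any model is loaded. The helper names `resolve_checkpoints` and `missing_checkpoints` are hypothetical and not part of Real-Time-Voice-Cloning; only the relative paths are taken from the snippet, and the project root is assumed to be whatever `project_name` points to in the notebook.

```python
from pathlib import Path

# Relative checkpoint locations, copied from the loading snippet in the issue.
CHECKPOINTS = {
    "encoder": Path("encoder/saved_models/pretrained.pt"),
    "synthesizer": Path("synthesizer/saved_models/logs-pretrained/taco_pretrained"),
    "vocoder": Path("vocoder/saved_models/pretrained/pretrained.pt"),
}


def resolve_checkpoints(project_root):
    """Hypothetical helper: join each relative checkpoint path onto the project root."""
    root = Path(project_root)
    return {stage: root / rel for stage, rel in CHECKPOINTS.items()}


def missing_checkpoints(project_root):
    """Hypothetical helper: list the stages whose checkpoint does not exist on disk."""
    resolved = resolve_checkpoints(project_root)
    return [stage for stage, path in resolved.items() if not path.exists()]


if __name__ == "__main__":
    # Assumed project root; replace with the directory the repository was cloned into.
    absent = missing_checkpoints("Real-Time-Voice-Cloning")
    if absent:
        print("Missing checkpoints for:", ", ".join(absent))
    else:
        print("All three checkpoints found.")
```

Running a check like this before the `load_model` calls may make a missing download easier to spot than a failure raised from inside the encoder, synthesizer, or vocoder.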