huggingface / parler-tts

Inference and training library for high-quality TTS models.
Apache License 2.0
2.6k stars 265 forks

Zero-Shot Voice Cloning #31

Open fakerybakery opened 4 weeks ago

fakerybakery commented 4 weeks ago

Hi, I know this library is primarily for text -> voice, but do you know if it would be possible to modify it to accept a speaker embedding and perform zero-shot voice cloning? Thanks!

digisomni commented 3 weeks ago

Yep, this is what I am after as well. Bark did this. If we had something like that, using this for an assistant would become 10x easier, since any personality could be inserted into it.

ylacombe commented 3 weeks ago

Hey @fakerybakery, thanks for opening the discussion! The current design is a choice, and we're currently discussing internally if adding zero-shot voice cloning makes sense!

johnwick123f commented 2 days ago

+1, it would be very useful for many things. Parler-TTS sounds very good, and it would be great to support cloning voices.