Closed blueprintparadise closed 10 months ago
Hello! Thank you for your interest in 🍵 Matcha-TTS.
This is a great idea; however, I do not currently have a straightforward interface for it. What you can do is pass two additional lists: one of speaker IDs, and one of tuples giving each speaker's phone boundaries. Then, once you have passed the speaker IDs through the speaker encoder and obtained the speaker representations, instead of broadcasting a single representation over all of x, broadcast each one only over its own word boundaries here
and in the decoder.
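A rough sketch of the bookkeeping this implies, in plain Python rather than against the actual Matcha-TTS code. The function name `per_phone_speaker_ids`, the half-open `(start, end)` span convention, and the toy lookup table standing in for the speaker encoder are all assumptions made for illustration; none of them are part of the repository.

```python
from typing import List, Sequence, Tuple


def per_phone_speaker_ids(
    num_phones: int,
    speaker_ids: Sequence[int],
    boundaries: Sequence[Tuple[int, int]],
) -> List[int]:
    """Expand (speaker, span) pairs into one speaker ID per phone.

    Assumption: each boundary is a half-open (start, end) range of
    phone indices, and the spans together cover every phone once.
    """
    if len(speaker_ids) != len(boundaries):
        raise ValueError("need exactly one boundary tuple per speaker ID")
    per_phone = [-1] * num_phones  # -1 marks "not yet assigned"
    for spk, (start, end) in zip(speaker_ids, boundaries):
        if not 0 <= start < end <= num_phones:
            raise ValueError(f"invalid span ({start}, {end})")
        for i in range(start, end):
            if per_phone[i] != -1:
                raise ValueError(f"phone {i} is covered by two spans")
            per_phone[i] = spk
    if -1 in per_phone:
        raise ValueError("every phone must be covered by a span")
    return per_phone


# Hypothetical stand-in for the speaker encoder: speaker ID -> embedding.
toy_table = {0: [1.0, 0.0], 1: [0.0, 1.0]}

# Speaker 0 says phones 0-1, speaker 1 says phones 2-4.
ids = per_phone_speaker_ids(5, [0, 1], [(0, 2), (2, 5)])

# One embedding per phone, instead of one embedding broadcast over all of x.
per_phone_emb = [toy_table[s] for s in ids]
```

In a real model the lookup would be the speaker encoder applied to the per-phone ID sequence, giving a tensor with one speaker vector per phone. If the decoder runs at mel-frame rate rather than phone rate, the per-phone sequence would presumably also need to be expanded with the same durations used to upsample the encoder output.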
Hope this helps :)
Hello, I am assuming you are happy with the answer for now. Please feel free to reopen the issue and continue the discussion if you have any further points.
Could you provide some resources or code for long-form speech generation that allows switching between multiple speakers within the same text, similar to what you did in the YouTube video?