mediatechlab / tts-wrapper

TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface.
MIT License

[WIP] Add get_voices, ElevenLabs, Refactor synth, speak method, streaming and various bug fixes #25

Closed willwade closed 3 months ago

willwade commented 1 year ago

This is a monster, mega, possibly too large PR.

The key aim is NOT to break functionality in the original wrapper, but we have added a LOT to it. From here on we are only going to fix bugs rather than look at new features.

Types of changes

willwade commented 6 months ago

Note: there is something really annoying about the synth method as documented. I'm not sure I'm dealing with it right.

The docs state:

tts.synth('<speak>Hello, world!</speak>', 'hello.mp3', format='mp3')

This actually isn't possible. I think it was a typo of sorts, because I think it's meant to be:

tts.synth_to_file('<speak>Hello, world!</speak>', 'hello.mp3', format='mp3')

So what I've done is add a synth method to the abstract class, so it can be used the way it was documented:

    def synth(self, text: str, filename: str, format: Optional[FileFormat] = "wav"):
        """
        Synthesizes text to speech and saves it directly to a file.
        Alias for synth_to_file.
        """
        self.synth_to_file(text, filename, format)

I'll be honest, this grates the hell out of me, because of course most of the engines have a synth method themselves, and I worry about confusion. It's one of the reasons I have introduced speak and speak_streamed, to help move away from this. Thoughts welcome.
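
For anyone trying to picture the alias, here is a minimal, self-contained sketch of the idea. It is not the wrapper's actual code: `AbstractTTS`, `FakeTTS`, `synth_to_bytes` and `demo` are hypothetical names made up for illustration, and `format` is a plain string rather than the real `FileFormat` type. It only shows that `synth` delegating to `synth_to_file` produces the same file either way.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Tuple


class AbstractTTS(ABC):
    """Hypothetical stand-in for the wrapper's abstract base class."""

    @abstractmethod
    def synth_to_bytes(self, text: str, format: str = "wav") -> bytes:
        """Engine-specific synthesis (assumed hook, not the real API)."""

    def synth_to_file(self, text: str, filename: str, format: str = "wav") -> None:
        # Write whatever the engine produced to the requested file.
        Path(filename).write_bytes(self.synth_to_bytes(text, format))

    def synth(self, text: str, filename: str, format: str = "wav") -> None:
        # Alias kept only so the documented call keeps working.
        self.synth_to_file(text, filename, format)


class FakeTTS(AbstractTTS):
    """Fake engine that returns tagged text instead of real audio."""

    def synth_to_bytes(self, text: str, format: str = "wav") -> bytes:
        return f"{format}:{text}".encode("utf-8")


def demo() -> Tuple[bytes, bytes]:
    # Call the alias and the real method, then return both files' contents.
    tts = FakeTTS()
    ssml = "<speak>Hello, world!</speak>"
    with tempfile.TemporaryDirectory() as tmp:
        via_alias = Path(tmp) / "alias.mp3"
        via_direct = Path(tmp) / "direct.mp3"
        tts.synth(ssml, str(via_alias), format="mp3")
        tts.synth_to_file(ssml, str(via_direct), format="mp3")
        return via_alias.read_bytes(), via_direct.read_bytes()
```

In this sketch the alias lives only on the base class, so the confusion risk shows up when an engine subclass defines its own `synth` with a different signature and silently overrides it.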

willwade commented 3 months ago

Sorry, totally forgot I left this PR open. Closing it. If I were mediatechlab I wouldn't accept this PR, as it's too massive. Drop me a line if you want to pick it back up, though.