Open nick-harder opened 4 months ago
Thanks @nick-harder, this will require some abstraction to switch between and configure the models (and to add or remove them dynamically alongside model providers, since one could have multiple OpenAI configurations set up, for instance).
Code is welcome here. I'm not sure I can tackle this in the short term, as I'm implementing the full multimodal pipeline now.
Why Having to register separately with ElevenLabs for voice generation is a hassle. Also, they don't provide "pay as you go" plans. OpenAI has a pretty good speech generation model, TTS-1, which can be used directly with the OpenAI API key, simplifying setup and making the app more comfortable to use.
Description Use the OpenAI TTS model by default if an OpenAI API key is provided. Make ElevenLabs optional, enabled only if its API key is added, and allow the user to select which one to use (similar to the image generation selection).
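To make the requested behaviour concrete, here is a minimal sketch of the selection logic in Python. It is only an illustration under assumptions: the provider identifiers and the function names `available_tts_providers` and `select_tts_provider` are hypothetical and not taken from the project's code, and the real implementation would likely live in whatever settings layer already handles the image generation selection.

```python
from typing import List, Optional

# Hypothetical provider identifiers; the project's real names may differ.
OPENAI = "openai"
ELEVENLABS = "elevenlabs"


def available_tts_providers(
    openai_key: Optional[str], elevenlabs_key: Optional[str]
) -> List[str]:
    """Return the providers whose API key is set, OpenAI first."""
    providers = []
    if openai_key:
        providers.append(OPENAI)
    if elevenlabs_key:
        providers.append(ELEVENLABS)
    return providers


def select_tts_provider(
    openai_key: Optional[str],
    elevenlabs_key: Optional[str],
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Pick the TTS provider to use.

    The user's preference wins when that provider has a key configured.
    Otherwise fall back to the first available provider, which is OpenAI
    whenever an OpenAI key is present. Returns None if no key is set.
    """
    providers = available_tts_providers(openai_key, elevenlabs_key)
    if preferred in providers:
        return preferred
    return providers[0] if providers else None
```

With only an OpenAI key configured, `select_tts_provider("sk-...", None)` picks OpenAI. ElevenLabs is used only when its key is present and either the user selects it or no OpenAI key exists.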