twilio-labs / call-gpt

Generative AI phone call toolkit using Twilio Media Streams.
MIT License
307 stars, 132 forks

Realtime API OpenAI Speech to Speech Support #62

Open badereddineqodia opened 1 month ago

badereddineqodia commented 1 month ago

Realtime API OpenAI Speech to Speech Support

kevinmershon commented 3 weeks ago

For what it's worth, Deepgram is currently ~18x cheaper than OpenAI's realtime speech implementation.

badereddineqodia commented 3 weeks ago

But it depends on the use case. If we're talking about standard languages like English, French, or Spanish, it's fine to go with Deepgram or Azure STT. However, when dealing with dialects such as Moroccan Darija or Egyptian Arabic, it makes a big difference to opt for OpenAI's Realtime API instead of relying solely on Deepgram. It's important to note that Deepgram's TTS doesn't fully support these dialects, and its STT is not good at all for them.