Open · RuchirB opened this issue 9 months ago
You could use the gpt-4-turbo-preview (or 3.5) GPT model for a small boost.
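If the LLM leg is part of the delay, streaming the completion also helps, since TTS can start on the first sentence instead of waiting for the full reply. A minimal sketch, assuming the openai v1 Python client rather than this project's abstractions:

```python
# Minimal sketch: stream the chat completion so TTS can start speaking on
# the first sentence instead of waiting for the full reply. Assumes the
# openai v1 Python client; not tied to this project's code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-4-turbo-preview",  # or "gpt-3.5-turbo" for lower latency
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # hand each piece to TTS here
```

Streaming doesn't change total generation time, but time-to-first-audio is what callers actually perceive.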
I am seeing the same latency. I am using Groq for faster inference, but total latency is still around 4s. STT and TTS are taking the most time: STT is taking about 1.5s and TTS about 1.2s. Please help me out with the best configuration for Deepgram and the rest of the pipeline; how are you getting 1s?
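For the Deepgram side, the streaming parameters that usually matter for latency can be set on the websocket URL. A minimal sketch, assuming the raw streaming endpoint rather than this project's own config:

```python
# Minimal sketch: low-latency Deepgram streaming settings over the raw
# websocket endpoint. Parameter names are from Deepgram's streaming docs;
# the model choice here is an assumption, not this project's config.
import os
import websockets  # pip install websockets

DG_URL = (
    "wss://api.deepgram.com/v1/listen"
    "?model=nova-2"          # fast general-purpose model
    "&encoding=mulaw"        # Twilio media streams send 8 kHz mu-law
    "&sample_rate=8000"
    "&interim_results=true"  # emit partials instead of waiting for finals
    "&endpointing=300"       # fire end-of-speech after ~300 ms of silence
)

async def connect_deepgram():
    # websockets<=13 uses extra_headers=; newer releases use additional_headers=
    return await websockets.connect(
        DG_URL,
        extra_headers={"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"},
    )
```

Endpointing is the usual trade-off: lower values cut dead air before the response, but risk splitting utterances mid-sentence.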
Tried out the project, very impressed. Thanks for open sourcing. Quick question on latency.
Noticed a minimum latency of at least 3-4s. I am measuring latency as the delay between when the human speaks and when the AI responds. This was with everything deployed on Fly.io in Ashburn using the exact demo as instructed.
Looks like the biggest bottleneck is the round trip from Twilio -> Fly.io and Fly.io -> Twilio. The second biggest bottleneck looks like transcription via Deepgram.
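One way to confirm where the time goes is to timestamp each stage of the pipeline. A rough sketch; the mark() call sites below are placeholders, not this project's APIs:

```python
# Minimal sketch: timestamp each stage of the pipeline to see where the
# 3-4 s actually goes. The suggested call sites are placeholders.
import time

marks: dict[str, float] = {}

def mark(stage: str) -> None:
    marks[stage] = time.monotonic()

# Suggested call sites in the pipeline:
#   mark("speech_end")       when the endpointer detects end of speech
#   mark("transcript")       when the final Deepgram transcript arrives
#   mark("llm_first_token")  on the first streamed LLM token
#   mark("tts_first_audio")  when the first TTS chunk goes back to Twilio

def report() -> None:
    order = ["speech_end", "transcript", "llm_first_token", "tts_first_audio"]
    for a, b in zip(order, order[1:]):
        if a in marks and b in marks:
            print(f"{a} -> {b}: {(marks[b] - marks[a]) * 1000:.0f} ms")
```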
The README suggests a latency of 1s; can you clarify the definition of latency here? Is that just the GPT response + TTS?
Any ideas on how to reduce latency? Is there a roadmap for this project we can follow somewhere?
@RuchirB Same here... I am using Groq instead of OpenAI GPT models, but I am still experiencing some delay between the human speaking and receiving audio packets back from Twilio...
I think OpenAI's Realtime API is now the way to go, as it eliminates the extra middle services that add latency.
I built an application using the OpenAI Realtime API with function calling and Twilio integration, but it feels too rigid and robotic. It is also too expensive to be feasible for real-world use, at least for now.
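For anyone curious, the shape of that bridge is roughly the following. A minimal sketch, assuming the beta Realtime API event names (session.update, input_audio_buffer.append, response.audio.delta) and Twilio's media-stream message format; error handling, barge-in, and most session config are omitted:

```python
# Minimal sketch: bridge a Twilio media stream to the OpenAI Realtime API.
# Event names follow OpenAI's beta docs and Twilio's media-stream format;
# this is not this project's implementation.
import asyncio, json, os
import websockets
from fastapi import FastAPI, WebSocket

app = FastAPI()
OPENAI_WS = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

@app.websocket("/media")
async def media(twilio: WebSocket):
    await twilio.accept()
    async with websockets.connect(
        OPENAI_WS,
        extra_headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "OpenAI-Beta": "realtime=v1",
        },
    ) as oai:
        # Twilio sends 8 kHz mu-law; ask the Realtime API to match, so no
        # transcoding step sits in the middle.
        await oai.send(json.dumps({
            "type": "session.update",
            "session": {"input_audio_format": "g711_ulaw",
                        "output_audio_format": "g711_ulaw"},
        }))
        stream_sid = None

        async def twilio_to_openai():
            nonlocal stream_sid
            async for raw in twilio.iter_text():
                msg = json.loads(raw)
                if msg["event"] == "start":
                    stream_sid = msg["start"]["streamSid"]
                elif msg["event"] == "media":
                    # Payload is already base64 mu-law; forward it as-is.
                    await oai.send(json.dumps({
                        "type": "input_audio_buffer.append",
                        "audio": msg["media"]["payload"],
                    }))

        async def openai_to_twilio():
            async for raw in oai:
                event = json.loads(raw)
                if event.get("type") == "response.audio.delta":
                    await twilio.send_text(json.dumps({
                        "event": "media",
                        "streamSid": stream_sid,
                        "media": {"payload": event["delta"]},
                    }))

        await asyncio.gather(twilio_to_openai(), openai_to_twilio())
```

Even with this, you still pay the Twilio <-> server and server <-> OpenAI network legs, which matches the regional latency observed above.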