twilio-labs / call-gpt

Generative AI phone call toolkit using Twilio Media Streams.
MIT License

1s Latency Definition #14

Open RuchirB opened 9 months ago

RuchirB commented 9 months ago

Tried out the project, very impressed. Thanks for open-sourcing it. Quick question on latency.

Noticed a minimum latency of at least 3-4s. I am measuring latency as the delay between when the human speaks and when the AI responds. This was with everything deployed on Fly.io in Ashburn, using the exact demo as instructed.

The biggest bottleneck looks like the round trips Twilio -> Fly.io and Fly.io -> Twilio. The second biggest looks like transcription via Deepgram.

The README suggests a latency of 1s. Can you clarify the definition of latency there? Is it just GPT response + TTS?

Any ideas on how to reduce latency? Is there a roadmap for this project we can follow somewhere?
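A minimal way to measure this end to end (a sketch only; the event names are my assumptions about how the repo's transcription and TTS services are wired, so adapt to your setup):

```javascript
// Sketch: log the gap between the final transcript (human done talking)
// and the first synthesized audio chunk headed back to Twilio.
// Event names ('transcription', 'speech') are assumptions -- check app.js.
let speechEndedAt = null;

transcriptionService.on('transcription', () => {
  speechEndedAt = Date.now(); // Deepgram produced a final transcript
});

ttsService.on('speech', () => {
  if (speechEndedAt !== null) {
    console.log(`human -> AI audio latency: ${Date.now() - speechEndedAt} ms`);
    speechEndedAt = null; // only time the first audio chunk per turn
  }
});
```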

ansario commented 8 months ago

You could use the gpt-4-turbo-preview (or gpt-3.5-turbo) model for a small boost.
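In this repo that should be a one-line change where the chat completion stream is created (a sketch, assuming the standard openai Node SDK; variable names illustrative):

```javascript
// Sketch: a smaller/faster model shortens time-to-first-token, which is
// what drives perceived latency when the response is streamed into TTS.
const stream = await openai.chat.completions.create({
  model: 'gpt-3.5-turbo', // or 'gpt-4-turbo-preview'; was e.g. 'gpt-4'
  stream: true,           // essential: start speaking as tokens arrive
  messages: userContext,  // the running conversation history
});
```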

ANIL-KADURKA commented 4 months ago

I am achieving the same latency. I am using Groq for faster access, but the latency is still around 4s. STT and TTS are taking the most time: STT about 1.5s and TTS about 1.2s. Please help me out with the best configuration for Deepgram (or whatever else applies). How are you getting 1s?
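For reference, these are the kinds of Deepgram live options I have been tuning (a sketch; values are illustrative, and the exact call depends on your Deepgram SDK version):

```javascript
// Sketch: low-latency Deepgram live-transcription settings.
// Lower endpointing means Deepgram finalizes transcripts sooner after
// silence, at the cost of more mid-sentence cutoffs.
const dgSocket = deepgram.transcription.live({
  encoding: 'mulaw',      // Twilio Media Streams send 8kHz mu-law audio
  sample_rate: 8000,
  model: 'nova-2',        // newer models tend to be faster and more accurate
  interim_results: true,  // act on partial transcripts instead of waiting
  endpointing: 200,       // ms of silence before a transcript is finalized
});
```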

devsalman247 commented 3 months ago

> Noticed a minimum latency of at least 3-4s. I am measuring latency as the delay between when the human speaks and when the AI responds.

@RuchirB Same case here... I am using Groq instead of OpenAI GPT models, but I am still seeing a delay between the human speaking and the audio packets coming back from Twilio...

badereddineqodia commented 1 month ago

I think using OpenAI's Realtime API now is perfect, as it eliminates the need for the additional middle services that add more latency.
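The shape of that setup is roughly the following (a sketch; the model name and session fields follow OpenAI's published Realtime API docs, so verify against the current API):

```javascript
import WebSocket from 'ws';

// Sketch: bridge Twilio Media Streams straight to the Realtime API and
// configure it for 8kHz G.711 mu-law, Twilio's native format, so no
// transcoding and no separate STT/TTS hop is needed.
const openaiWs = new WebSocket(
  'wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview',
  {
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'OpenAI-Beta': 'realtime=v1',
    },
  }
);

openaiWs.on('open', () => {
  openaiWs.send(JSON.stringify({
    type: 'session.update',
    session: {
      input_audio_format: 'g711_ulaw',
      output_audio_format: 'g711_ulaw',
      turn_detection: { type: 'server_vad' }, // let the API detect end of speech
    },
  }));
});
```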

boxed-dev commented 3 weeks ago

> I think using OpenAI's Realtime API now is perfect, as it eliminates the need for the additional middle services that add more latency.

I built an application using the OpenAI Realtime API with function calling and Twilio integration, but it feels too rigid and robotic. I also find it too expensive to be feasible for real-world use, at least for now.