twilio-samples / live-translation-openai-realtime-api

Integrate AI-powered voice translation into a Twilio Flex contact center using our prebuilt starter app, enabling live conversations between agents and customers speaking different languages.
MIT License
40 stars, 14 forks

Can not figure out turn detection and interruption. #6

Open Surebob opened 1 month ago

Surebob commented 1 month ago

Are there plans to implement a (correctly) working interruption mechanism?

jme783 commented 1 month ago

Hi @Surebob -- can you please describe the desired behavior vs. the behavior you're currently observing? Currently, the app uses OpenAI's automatic server-side turn detection, which is documented here.
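For readers unfamiliar with server-side turn detection, here is a minimal sketch of the kind of `session.update` event the Realtime API accepts to enable it. This is not the repo's code: `buildTurnDetectionUpdate` is a hypothetical helper and the numeric values are illustrative starting points, so check the OpenAI documentation before relying on them.

```typescript
// Subset of the Realtime API `session.update` payload that controls turn detection.
interface ServerVadConfig {
  type: "server_vad";
  threshold: number;           // 0.0 to 1.0; higher needs louder audio to count as speech
  prefix_padding_ms: number;   // audio kept from just before speech was detected
  silence_duration_ms: number; // silence required before the turn is treated as finished
}

interface SessionUpdateEvent {
  type: "session.update";
  session: { turn_detection: ServerVadConfig | null };
}

// Hypothetical helper (an assumption, not part of the sample app):
// builds the event, letting the caller override individual VAD settings.
function buildTurnDetectionUpdate(
  overrides: Partial<Omit<ServerVadConfig, "type">> = {}
): SessionUpdateEvent {
  return {
    type: "session.update",
    session: {
      turn_detection: {
        type: "server_vad",
        threshold: 0.5,           // illustrative value
        prefix_padding_ms: 300,   // illustrative value
        silence_duration_ms: 500, // illustrative value
        ...overrides,
      },
    },
  };
}

// The event is sent as JSON over the Realtime WebSocket connection.
const payload: string = JSON.stringify(
  buildTurnDetectionUpdate({ silence_duration_ms: 800 })
);
```

A longer `silence_duration_ms` makes the model wait longer before deciding a speaker has finished, which trades latency for fewer premature cut-offs.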

With translation between two humans and an AI, turn detection and interruption are a bit more nuanced than with one human and one AI.

Think of the analogy of a human translator in a live conversation: often Person A speaks their full utterance, the translator translates once Person A has finished, then Person B speaks, and the translator translates back into Person A's language once Person B is done. This is the behavior we modeled the app on.
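The consecutive-interpreter model described above can be sketched as a small state machine. This is only an illustration under stated assumptions, not how the app is implemented: the `TurnEvent` names, the `TurnState` phases, and the `step` function are all hypothetical.

```typescript
type Party = "A" | "B";

// Hypothetical, simplified events; a real integration would derive them
// from the audio pipeline and the translation model's responses.
type TurnEvent =
  | { kind: "speech_started"; from: Party }
  | { kind: "speech_stopped"; from: Party }
  | { kind: "translation_done" };

type TurnState =
  | { phase: "idle" }
  | { phase: "listening"; speaker: Party }
  | { phase: "translating"; speaker: Party; listener: Party };

function other(p: Party): Party {
  return p === "A" ? "B" : "A";
}

// Pure transition function: one party speaks to completion, the translation
// plays to the other party, and only then does the floor open again.
function step(state: TurnState, event: TurnEvent): TurnState {
  switch (state.phase) {
    case "idle":
      return event.kind === "speech_started"
        ? { phase: "listening", speaker: event.from }
        : state;
    case "listening":
      if (event.kind === "speech_stopped" && event.from === state.speaker) {
        return {
          phase: "translating",
          speaker: state.speaker,
          listener: other(state.speaker),
        };
      }
      return state;
    case "translating":
      // No barge-in in this model: speech during translation is ignored.
      return event.kind === "translation_done" ? { phase: "idle" } : state;
  }
}
```

Supporting interruption would mean adding a transition out of `translating` when a `speech_started` event arrives, which is where the design choices (whose speech may interrupt, and what happens to the unfinished translation) come in.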

If you could give us a better sense of what you're looking for, perhaps that could help determine what's possible.