Closed kjjd84 closed 1 week ago
OpenAI provides a few options for `server_vad` that you might try first (see `turn_detection`). Check the activation threshold (`threshold`) first, but `silence_duration_ms` will factor in as well.
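As a rough illustration of that tuning, the sketch below builds a `session.update` event that raises the VAD threshold and lengthens the silence window. The field names under `turn_detection` (`threshold`, `prefix_padding_ms`, `silence_duration_ms`) follow the Realtime API reference as best understood here and should be checked against the current docs; the helper name `build_session_update` and the specific values are illustrative assumptions, not recommended settings.

```python
import json


def build_session_update(threshold=0.7, silence_duration_ms=700, prefix_padding_ms=300):
    """Build a session.update event that tunes server-side VAD.

    Hypothetical helper: the values are starting points for experimentation.
    A higher threshold should make the VAD less likely to trigger on noise.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if silence_duration_ms < 0 or prefix_padding_ms < 0:
        raise ValueError("durations must be non-negative")
    return {
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                "threshold": threshold,
                "prefix_padding_ms": prefix_padding_ms,
                "silence_duration_ms": silence_duration_ms,
            }
        },
    }


if __name__ == "__main__":
    # The resulting JSON string is what would be sent over the open WebSocket.
    print(json.dumps(build_session_update(threshold=0.8, silence_duration_ms=900)))
```

Sending this once after the session is created, then adjusting one value at a time, makes it easier to tell which setting actually reduces the false interruptions.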
Beyond that, VAD is complicated, and it's not something we could robustly address in this code. If you turn off the OpenAI version, you might try a third-party provider, or start by manually interrupting based on the Google speech-to-text transcript (use `response.cancel` and the new code which demonstrates conversation truncation).
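A minimal sketch of that manual interruption path, assuming server VAD is turned off and the caller's transcript arrives from a separate speech-to-text service. The event shapes (`response.cancel`, and `conversation.item.truncate` with `item_id`, `content_index`, `audio_end_ms`) follow the Realtime API reference as best understood here; `should_interrupt`, `build_interrupt_events`, and the word-count heuristic are hypothetical and are not the repository's actual truncation code.

```python
def should_interrupt(transcript, min_words=2):
    """Hypothetical heuristic: treat the caller as speaking only when the
    transcript contains at least `min_words` words, so that a single
    spurious word caused by line noise does not cut the assistant off."""
    return len(transcript.split()) >= min_words


def build_interrupt_events(item_id, audio_played_ms, content_index=0):
    """Build the two client events used to interrupt the assistant.

    item_id:          ID of the assistant audio item currently playing.
    audio_played_ms:  how much of that audio the caller has actually heard.
    """
    if audio_played_ms < 0:
        raise ValueError("audio_played_ms must be non-negative")
    return [
        # Stop the response that is still being generated.
        {"type": "response.cancel"},
        # Trim the stored assistant turn to what was actually played.
        {
            "type": "conversation.item.truncate",
            "item_id": item_id,
            "content_index": content_index,
            "audio_end_ms": audio_played_ms,
        },
    ]


if __name__ == "__main__":
    transcript = "wait hold on"  # example text from the speech-to-text service
    if should_interrupt(transcript):
        for event in build_interrupt_events("item_123", audio_played_ms=1500):
            print(event)  # in a real app, send each event as JSON over the WebSocket
```

Requiring a couple of words before interrupting trades a little latency for fewer false triggers; the right cutoff depends on how noisy the line is.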
Closing as this is out of scope.
The recent VAD patch is nearly unusable
The AI is constantly interrupted by nothing, and the call quality is now extremely static-ridden.
I am using Google speech-to-text to get a transcript of what the person is saying, and it often picks up words that were never said, as if the static is making the app think someone is talking.
I'm not sure what's going on, but I can no longer even consider this for a real application.