Open ricburton opened 6 months ago
Just brainstorming some thoughts -
The challenge here is to stop the audio as fast as possible while still distinguishing whether the user is saying words as part of a new prompt. Background noise, or the user making sounds like "mmhmm", "Yeah", "huh", or "ok", could interrupt the output when that isn't intended.
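One rough way to think about the "mmhmm" problem, as a sketch only: if a transcript of the user's sound is available, skip the interruption when it contains nothing but filler words. The word list and the function names `is_backchannel` and `should_interrupt` are made up for illustration and are not part of any API mentioned here. Waiting for a transcript also costs latency, so this trades speed for fewer false interruptions.

```python
import re

# Hypothetical filler/backchannel words; this list is an assumption and
# would need tuning against real usage.
BACKCHANNELS = {"mmhmm", "mhm", "yeah", "huh", "ok", "okay", "uh-huh"}


def _words(transcript: str) -> list[str]:
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z\-]+", transcript.lower())


def is_backchannel(transcript: str) -> bool:
    """True if the transcript is non-empty and made up only of filler words."""
    words = _words(transcript)
    return bool(words) and all(w in BACKCHANNELS for w in words)


def should_interrupt(transcript: str) -> bool:
    """Interrupt only when the user said something beyond filler words."""
    return bool(_words(transcript)) and not is_backchannel(transcript)
```

For example, `should_interrupt("Yeah, ok")` is false, while `should_interrupt("wait, stop")` is true.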
Another challenge would be detecting whether a sound that would silence the output is coming from the app or from the user. The API itself is handling this right now. Trying to do it on the client side might be surprisingly difficult.
I think it would be easier to tune this experience if vapi can update their API. You'd want:
I've added a parallel audio input to demo an audio wave in branch wave-function-test. This could be used to cut the speech almost instantly when the amplitude is > 0.8.
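To make the amplitude idea concrete, here is a minimal sketch. It is not the code in the branch. It assumes the microphone input arrives as frames of samples normalized to [-1.0, 1.0]. The names `peak_amplitude` and `first_interrupt_frame`, and the `min_frames` debounce parameter, are assumptions added for illustration.

```python
from typing import Iterable, Optional, Sequence


def peak_amplitude(frame: Sequence[float]) -> float:
    """Largest absolute sample value in a frame (0.0 for an empty frame)."""
    return max((abs(s) for s in frame), default=0.0)


def first_interrupt_frame(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.8,   # the 0.8 cutoff mentioned above
    min_frames: int = 1,      # assumed debounce: loud frames needed in a row
) -> Optional[int]:
    """Return the index of the frame at which speech should be cut, or None."""
    consecutive = 0
    for i, frame in enumerate(frames):
        if peak_amplitude(frame) > threshold:
            consecutive += 1
            if consecutive >= min_frames:
                return i
        else:
            # A quiet frame resets the run of loud frames.
            consecutive = 0
    return None
```

With `min_frames=1` this cuts on the first loud frame, which is fastest but most sensitive to a single click or bump. Raising `min_frames` requires the loudness to persist, at the cost of a slightly later cut.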
Idea from Faraaz: when you receive voice from the microphone, just cut the in-app volume to 0, then restore it to its previous value afterwards.
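A small sketch of that idea, assuming the app exposes a volume value it can set and some signal for when microphone voice starts and ends. The `VolumeDucker` class and its method names are hypothetical and are not taken from the app or from any SDK.

```python
from typing import Optional


class VolumeDucker:
    """Mutes in-app volume while the user speaks, then restores it."""

    def __init__(self, volume: float = 1.0) -> None:
        self.volume = volume
        self._saved: Optional[float] = None  # volume to restore, if ducked

    def on_voice_start(self) -> None:
        # Save the current volume only once, so repeated voice events
        # while already muted do not overwrite it with 0.
        if self._saved is None:
            self._saved = self.volume
            self.volume = 0.0

    def on_voice_end(self) -> None:
        # Restore the previous value if we are currently ducked.
        if self._saved is not None:
            self.volume = self._saved
            self._saved = None
```

The saved value is guarded so that two voice-start events in a row still restore the original volume rather than 0.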