Cut the audio the SECOND a person speaks

ricburton commented 6 months ago

Idea from Faraaz: when you receive voice from the microphone, just cut the in-app volume to 0, then go back to the previous value.
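A rough sketch of that idea, written in Python purely for illustration (the app itself is iOS). The class name `VolumeDucker` and the `on_voice_start` / `on_voice_end` callbacks are hypothetical; they stand in for whatever voice-detection events the app actually receives.

```python
class VolumeDucker:
    """Mutes playback while the mic hears a voice, then restores the old volume."""

    def __init__(self, volume: float = 1.0):
        self.volume = volume   # current in-app playback volume (0.0 to 1.0)
        self._saved = None     # volume to restore once the voice stops

    def on_voice_start(self) -> None:
        # Hypothetical callback: fired when the microphone picks up a voice.
        if self._saved is None:          # ignore repeated start events
            self._saved = self.volume
            self.volume = 0.0

    def on_voice_end(self) -> None:
        # Hypothetical callback: fired when the voice stops.
        if self._saved is not None:
            self.volume = self._saved
            self._saved = None


ducker = VolumeDucker(volume=0.7)
ducker.on_voice_start()
muted = ducker.volume       # volume while the user is talking
ducker.on_voice_end()
restored = ducker.volume    # volume after the user stops
```

Saving the previous value only on the first start event means a burst of repeated events can't overwrite it with 0.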

mikejonas commented 5 months ago

Just brainstorming some thoughts -

The challenge here is to stop the audio as fast as possible while still distinguishing whether the user is saying words as part of a new prompt. Background noise, or the user making sounds like "mmhmm", "yeah", "huh", or "ok", could interrupt the output when that's not intended.

Another challenge would be detecting whether the sound that is meant to silence the output is coming from the app or from the user. The API itself is handling this. Trying to do this on the client side might be surprisingly difficult.

I think it would be easier to tune this experience if Vapi can update their API. You'd want:

An interruption threshold parameter: 0 could be conservative and 1 could be aggressive, stopping the voice output almost immediately.
The ability to pause the current output while still listening to the user for new prompts. That way, if there's noise that may be the start of a new prompt, you can pause the output for a brief moment rather than silencing it outright, so the output can continue normally if the noise turns out not to be a new prompt.
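To make the two wishes above concrete, here is a minimal sketch of how a threshold and a pause state could interact. It is not the Vapi API. Everything in it is an assumption for illustration: the `InterruptionController` name, the `MAX_HOLD_S` constant, the linear mapping from threshold to wait time, and the per-frame `is_speech` flag. It is in Python only to keep the logic short.

```python
from enum import Enum


class State(Enum):
    PLAYING = "playing"
    PAUSED = "paused"
    STOPPED = "stopped"


class InterruptionController:
    MAX_HOLD_S = 1.0  # assumed longest wait before speech counts as a new prompt

    def __init__(self, threshold: float):
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        # 0 = conservative (wait MAX_HOLD_S), 1 = aggressive (stop at once).
        self.hold_s = (1.0 - threshold) * self.MAX_HOLD_S
        self.state = State.PLAYING
        self._speech_s = 0.0  # length of the current run of speech, in seconds

    def on_frame(self, is_speech: bool, frame_s: float) -> State:
        """Feed one audio frame; is_speech comes from some external detector."""
        if self.state is State.STOPPED:
            return self.state
        if is_speech:
            self._speech_s += frame_s
            # Pause first; only stop once the speech has lasted long enough.
            if self._speech_s >= self.hold_s:
                self.state = State.STOPPED
            else:
                self.state = State.PAUSED
        else:
            # Simplification: a single non-speech frame resumes playback.
            self._speech_s = 0.0
            self.state = State.PLAYING
        return self.state


ctrl = InterruptionController(threshold=0.0)
for _ in range(15):                 # about 0.3 s of "mmhmm" in 20 ms frames
    ctrl.on_frame(True, 0.02)
state_during = ctrl.state           # paused, not stopped
ctrl.on_frame(False, 0.02)
state_after = ctrl.state            # playback resumes
```

With a threshold of 1 the hold time is zero, so the first speech frame stops the output, which matches the "aggressive" end of the scale described above.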

lantos1618 commented 4 months ago

I've added a parallel audio input to demo an audio wave in branch wave-function-test. This could be used to cut the speech almost instantly when the amplitude is > 0.8.
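For reference, the amplitude check itself could look something like the sketch below. How the branch actually measures amplitude isn't stated here, so this assumes the peak absolute value of a frame of float samples normalized to [-1.0, 1.0]. The function names `peak_amplitude` and `should_cut` are made up for the example, and it is in Python only for brevity.

```python
def peak_amplitude(samples):
    """Largest absolute sample value in a frame normalized to [-1.0, 1.0]."""
    return max((abs(s) for s in samples), default=0.0)


def should_cut(samples, threshold=0.8):
    """True when the frame's peak amplitude is strictly above the threshold."""
    return peak_amplitude(samples) > threshold


loud_frame = [0.1, -0.9, 0.2]
quiet_frame = [0.1, -0.3, 0.05]
cut_loud = should_cut(loud_frame)
cut_quiet = should_cut(quiet_frame)
```

A plain amplitude gate like this can't tell the user's voice from the app's own output picked up by the microphone, which is the client-side difficulty raised earlier in the thread.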

rescomputer / res-ios

Cut the audio the SECOND a person speaks #7