rescomputer / res-ios

Res is real computer speech: talk with an AI that listens and can hold a conversation with you.
https://res.computer

Cut the audio the SECOND a person speaks #7

Open ricburton opened 6 months ago

ricburton commented 6 months ago

Idea from Faraaz: when voice comes in from the microphone, cut the in-app volume to 0, then restore it to its previous value afterwards.
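A rough sketch of that idea, in case it helps the discussion. `VolumeDucker` and its method names are hypothetical and not part of res-ios; it only tracks the volume value, and wiring it to the real player and to whatever detects voice is left open.

```swift
// Hypothetical helper, not part of res-ios: remembers the volume when the
// user starts speaking, mutes, and restores the old value afterwards.
struct VolumeDucker {
    private(set) var volume: Float
    private var savedVolume: Float?

    init(volume: Float) {
        self.volume = volume
    }

    /// Call when voice is detected on the microphone.
    /// Repeated calls while already muted are ignored, so the saved
    /// volume is never overwritten with 0.
    mutating func voiceStarted() {
        guard savedVolume == nil else { return }
        savedVolume = volume
        volume = 0
    }

    /// Call when the voice stops; puts the previous volume back.
    mutating func voiceEnded() {
        guard let previous = savedVolume else { return }
        volume = previous
        savedVolume = nil
    }
}
```

The guard in `voiceStarted()` matters because a voice detector will typically fire many times during one utterance.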

mikejonas commented 5 months ago

Just brainstorming some thoughts -

The challenge here is to stop the audio as fast as possible while still distinguishing whether the user is saying words as part of a new prompt. Background noise, or the user making sounds like "mmhmm", "yeah", "huh", or "ok", could interrupt the output when that isn't intended.
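One possible way to make that trade-off tunable is a hold-off: only interrupt once speech has lasted longer than some minimum duration, so a short "mmhmm" passes through. This is just a sketch; `InterruptionGate`, the `isSpeech` flag, and the 0.5 s value in the usage below are all assumptions, and the longer the hold-off, the slower the cut.

```swift
// Hypothetical gate, not part of res-ios: reports "interrupt" only after
// speech has been continuous for at least `minSpeechDuration` seconds.
struct InterruptionGate {
    let minSpeechDuration: Double
    private var speechStart: Double?

    init(minSpeechDuration: Double) {
        self.minSpeechDuration = minSpeechDuration
    }

    /// Feed one voice-activity reading with its timestamp in seconds.
    /// Returns true when the output should be cut.
    mutating func update(isSpeech: Bool, at time: Double) -> Bool {
        guard isSpeech else {
            // Silence resets the timer, so short sounds never add up.
            speechStart = nil
            return false
        }
        let start = speechStart ?? time
        speechStart = start
        return time - start >= minSpeechDuration
    }
}

// Usage: a 0.25 s "yeah" is ignored, 0.5 s of continuous speech cuts.
var gate = InterruptionGate(minSpeechDuration: 0.5)
let shortSound = gate.update(isSpeech: true, at: 0.25)
```

This does not tell words apart from noise; it only filters by duration, which is the cheapest thing to try on the client.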

Another challenge would be detecting whether a sound that is meant to silence the output is coming from the app or from the user. The API itself is handling this. Trying to do this on the client side might be surprisingly difficult.

I think it would be easier to tune this experience if Vapi could update their API. You'd want:

lantos1618 commented 4 months ago

I've added a parallel audio input to demo an audio wave in the branch wave-function-test. This could be used to cut the speech almost instantly when the amplitude is > 0.8.
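For reference, the threshold check itself could look roughly like this. It is a sketch and not the code in the branch: it assumes the microphone samples arrive as floats normalized to the range -1...1 and uses the peak of each buffer as "the amplitude". The function names are made up.

```swift
// Hypothetical helpers, not taken from the wave-function-test branch.

/// Largest absolute sample value in the buffer (0 for an empty buffer).
func peakAmplitude(of samples: [Float]) -> Float {
    return samples.map { abs($0) }.max() ?? 0
}

/// True when the buffer is loud enough that the speech should be cut.
/// Assumes samples are normalized to -1...1.
func shouldCutSpeech(samples: [Float], threshold: Float = 0.8) -> Bool {
    return peakAmplitude(of: samples) > threshold
}
```

A single loud sample trips a peak check, so background noise or the app's own output through the speaker could trigger it too.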