run-llama / openai_realtime_client

A simple client and utils for interacting with OpenAI's Realtime API in Python
MIT License

feature request: examples += --list-devices | --input-device | --output-device flags #2

Open gwpl opened 1 week ago

gwpl commented 1 week ago

feature request: examples += --list-devices | --input-device | --output-device flags

When trying the streaming example, the script seems to "detect speech" after each chunk of audio it plays back, interrupting itself, even when I run it with headphones. So I suspect it may be using the wrong input device, or something else is going on. Manual (text) input also seems confused and does not work correctly.

Platform: Linux, in fresh venv.
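The requested flags could be wired up roughly as below. This is only a sketch: the flag names come straight from the request, the helper names are hypothetical, and PyAudio is assumed as the audio backend for device enumeration (adjust if the client uses a different one).

```python
import argparse


def build_parser():
    # Hypothetical CLI flags for the examples, matching the feature request.
    p = argparse.ArgumentParser(description="Realtime API audio example")
    p.add_argument("--list-devices", action="store_true",
                   help="print available audio devices and exit")
    p.add_argument("--input-device", type=int, default=None,
                   help="device index to use for microphone capture")
    p.add_argument("--output-device", type=int, default=None,
                   help="device index to use for playback")
    return p


def list_devices():
    # Device enumeration via PyAudio (an assumption about the backend);
    # guarded so the sketch degrades gracefully without PortAudio.
    try:
        import pyaudio
    except ImportError:
        print("pyaudio not installed")
        return
    pa = pyaudio.PyAudio()
    for i in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(i)
        print(f"{i}: {info['name']} "
              f"(in={info['maxInputChannels']}, out={info['maxOutputChannels']})")
    pa.terminate()


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.list_devices:
        list_devices()
```

The parsed `--input-device` / `--output-device` indices would then be passed to whatever opens the capture and playback streams, instead of relying on the system defaults.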

WikiLucas00 commented 1 week ago

Same issue here; it seems no echo cancellation is done (to cancel the output signal on the input one).

A threshold could be used to avoid sending input chunks with low-intensity audio (which in most cases would be the output signal picked up by the microphone).
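The thresholding idea could be sketched as a simple RMS gate over 16-bit PCM chunks before they are sent. The function names and the threshold value are hypothetical and would need tuning per setup; this is not the repo's actual code.

```python
import array


def chunk_rms(pcm16: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit little-endian PCM audio."""
    samples = array.array("h", pcm16)
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5


def should_send(pcm16: bytes, threshold: float = 500.0) -> bool:
    # Drop low-intensity chunks (likely speaker bleed into the mic)
    # instead of forwarding them to the Realtime API.
    # 500.0 is an arbitrary starting point on the int16 scale (max 32767).
    return chunk_rms(pcm16) >= threshold
```

A gate like this only suppresses quiet echo; loud playback leaking into the microphone would still pass, so it is a mitigation rather than a substitute for real echo cancellation.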

Did you try lowering the output volume on your headset to make sure the microphone doesn't pick anything up, @gwpl?

I managed to chat with the model by muting my microphone whenever I'm not speaking, but that's not a viable solution.

tanvithakur94 commented 3 days ago

In addition to the microphone workaround, this change has also helped fine-tune the audio response and avoid interruptions: https://github.com/run-llama/openai_realtime_client/pull/5.