openai / openai-realtime-api-beta

Node.js + JavaScript reference client for the Realtime API (beta)
MIT License
704 stars, 178 forks

[QUESTION] Is it possible to make the WS start replying before the audio input finishes? #24

Open git-mb-back opened 1 month ago

git-mb-back commented 1 month ago

Hi, I'm experimenting with the application and trying to build a real-time translation agent. By real-time, I mean truly real-time.

As soon as someone starts talking, I need to get the translation of what they are saying immediately, without waiting for the end of the input stream (for both push-to-talk and VAD).

I already have the translation prompt working.

As far as you know, can this be achieved with the OpenAI Realtime API?

Thanks.

PS: I know that starting the translation immediately could hurt its quality, but imagine a real conversation between people: waiting for the other person to finish talking, and then going through this ping-pong of waits, is terrible in terms of UX.

khorwood-openai commented 1 month ago

Hey! I've flagged this to the Realtime team. Will respond with updates if / when they're available.