openai / openai-realtime-api-beta

Node.js + JavaScript reference client for the Realtime API (beta)
MIT License

Request: Enhance audio-text synchronization for RESPONSE_AUDIO_TRANSCRIPT_DELTA and RESPONSE_AUDIO_DELTA events #25

Open opchronatron opened 1 month ago

opchronatron commented 1 month ago

Objective: Improve the ability to align text and audio deltas for smoother playback and interruption handling. Proposed solutions (in order of preference):

Benefits: Enables graceful sentence completion before cutting off buffered audio from the previous turn. Improves overall user experience with more natural speech flow and interruptions.

This would make life much easier :)
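Until something like this exists in the API, the "finish the sentence, then cut" behaviour can only be approximated on the client. Below is a minimal sketch of that idea, not anything the reference client provides. `cutPointSamples` is a hypothetical helper, and it assumes a roughly uniform speaking rate across the transcript, which is only an approximation.

```javascript
// Hypothetical helper (not part of the reference client).
// Estimates the sample offset at which playback can stop so that the
// sentence currently being spoken is allowed to finish.
// ASSUMPTION: speech is spread evenly over the transcript characters.
function cutPointSamples(transcript, totalSamples, playedSamples) {
  if (transcript.length === 0 || totalSamples <= 0) return playedSamples;

  const samplesPerChar = totalSamples / transcript.length;
  const playedChars = Math.floor(playedSamples / samplesPerChar);

  // Look for the first sentence-ending punctuation at or after the
  // estimated playback position.
  const rest = transcript.slice(playedChars);
  const match = rest.match(/[.!?]/);

  // No sentence boundary buffered yet: fall back to cutting immediately.
  if (!match) return playedSamples;

  const endChar = playedChars + match.index + 1;
  return Math.min(Math.ceil(endChar * samplesPerChar), totalSamples);
}
```

On interruption, the player would keep draining its buffer up to the returned offset and drop everything after it. Because the estimate is proportional, the cut can land slightly before or after the real sentence end.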

khorwood-openai commented 1 month ago

I've forwarded this to the Realtime team! They can sync if there are any updates here.

Stevenic commented 1 month ago

Nice feature request... I suspect it'll be difficult for them to do, as it doesn't look like there's a 1:1 mapping of audio to text deltas. They're close, but I've seen different lengths...
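Since the deltas don't pair up one to one, one workaround is to ignore the pairing entirely and only track running totals for each stream. The sketch below does that and estimates the text for a given playback position proportionally. `DeltaTimeline` is a hypothetical class, not part of the reference client, and it assumes the audio payloads are base64-encoded PCM16 mono at 24 kHz.

```javascript
// Hypothetical helper (not part of the reference client).
// ASSUMPTIONS: audio deltas are base64 PCM16, mono, 24 kHz.
const SAMPLE_RATE = 24000;
const BYTES_PER_SAMPLE = 2;

class DeltaTimeline {
  constructor() {
    this.samples = 0; // total audio samples received so far
    this.text = "";   // transcript text received so far
  }

  // Call with the base64 payload of each audio delta.
  addAudio(base64Delta) {
    const bytes = Buffer.from(base64Delta, "base64").length;
    this.samples += bytes / BYTES_PER_SAMPLE;
  }

  // Call with the text of each transcript delta.
  addTranscript(textDelta) {
    this.text += textDelta;
  }

  // Duration of the audio received so far, in seconds.
  seconds() {
    return this.samples / SAMPLE_RATE;
  }

  // Proportional estimate of the text spoken up to `playedSamples`.
  // This is a guess based on totals, not a true alignment.
  textAt(playedSamples) {
    if (this.samples === 0) return "";
    const fraction = Math.min(Math.max(playedSamples / this.samples, 0), 1);
    return this.text.slice(0, Math.round(this.text.length * fraction));
  }
}
```

The estimate is most trustworthy once both streams for a response have finished. Mid-stream, the transcript may be ahead of or behind the audio, so the result will drift.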