Azure-Samples / aoai-realtime-audio-sdk

Azure OpenAI code resources for using gpt-4o-realtime capabilities.
MIT License

Will this SDK be unified with the "non-realtime" API in the future? #25

Closed by hayescode 2 weeks ago

hayescode commented 2 weeks ago

First of all, this repo is extremely helpful. It's clear this is a whole new paradigm using websockets. I assume as this matures towards GA, many of the normal "non-realtime" features like Assistants API, image modality, etc. will be added.

For us on the outside, it would help to know the plan so we can shape our product roadmap for this eventuality. Should we expect this realtime websocket approach to continue being separate (and therefore add toggles to our apps to switch between this realtime SDK and the existing API), or is the plan to unify these so that in the future we can create Assistant threads where the message content could be text, image, or audio, for example?

Thanks again for the team's work on this repo; I would be very lost without it!

fajri-droid commented 2 weeks ago

🔥🔥🔥

trrwilson commented 2 weeks ago

Thanks, @hayescode -- this is a great question that belongs in the FAQ (in fact, I'll go update it shortly).

The specifics and timeline for full feature convergence (with respect to the official libraries at https://github.com/openai/openai-python and https://github.com/openai/openai-node) are still TBD. As discussed in the "What's next" section of the Realtime API announcement post, integrated library support in Python and JS is on the roadmap, but an ETA is still pending. The .NET library (https://github.com/openai/openai-dotnet) already includes preview support in the latest release.

In the interim, the standalone libraries provided in this repository for Python and JavaScript will continue to be improved and maintained. Whether that lasts a week or six months, we want JS and Python customers using /realtime to have an easier entry point into the complexities of the protocol than a simple WebSocket wrapper can provide. We have no plans to proactively merge other endpoints into these libraries, since the official libraries are committed to subsuming the need for standalone ones, but we'll continually reevaluate how things are shaping up, where the gaps are, and how quickly we can converge the separate libraries.

hayescode commented 2 weeks ago

Thank you @trrwilson for the detailed response! Makes sense!