Open clemlesne opened 4 months ago
I'm also interested in this question.
What about response time? What about costs? Can you stream data?
I know I know :) OpenAI APIs are not yet available:
Plus, the Communication Services APIs cannot yet be used with a raw audio stream.
If you have ideas, don't hesitate!
Audio streaming is now available with Communication Services!
OpenAI's Realtime API now supports speech-to-speech: https://platform.openai.com/docs/guides/realtime/overview
I would like to explore this further and add the feature to this project, @clemlesne.
We're working on it!
OpenAI's GPT-4o model supports text, image, and audio as both input and output. Its understanding is finer than with the usual STT > model > TTS approach, because the model has direct access to the user's behavior, emotions, etc.
Is there a way to use Communication Services to receive the raw audio stream, bypassing the STT step?
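For what it's worth, here is a minimal sketch of what the "bypass STT" bridge could look like, assuming Communication Services delivers audio over a WebSocket as JSON messages with a `kind` of `AudioData` and a base64 payload under `audioData.data`, and that the Realtime API accepts base64 audio through `input_audio_buffer.append` events. The message shape and the helper name `acs_to_realtime_event` are assumptions for illustration, not something confirmed in this thread; check both services' docs before relying on it. WebSocket handling and any resampling are left out.

```python
import base64
import json
from typing import Optional


def acs_to_realtime_event(acs_message: str) -> Optional[str]:
    """Turn one (assumed) Communication Services audio message into a
    Realtime API "append audio" event, or return None if there is nothing to send."""
    payload = json.loads(acs_message)

    # Assumption: audio frames are tagged with kind == "AudioData";
    # anything else (for example stream metadata) is ignored.
    if payload.get("kind") != "AudioData":
        return None

    audio = payload.get("audioData") or {}

    # Assumption: a "silent" flag marks frames without speech; skip them.
    if audio.get("silent"):
        return None

    data = audio.get("data")
    if not data:
        return None

    # Fail early on a corrupted payload instead of forwarding it.
    base64.b64decode(data, validate=True)

    # The audio stays base64-encoded; no STT step is involved.
    return json.dumps({"type": "input_audio_buffer.append", "audio": data})
```

Each non-empty result would be sent as-is on the model's WebSocket. Both sides have to agree on the audio format (sample rate, PCM encoding), otherwise a resampling step is needed in between.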