Open clemlesne opened 1 month ago
I'm also interested in this question.
What about response time? What about costs? Can you stream data?
I know, I know :) OpenAI APIs are not yet available:
Plus, Communication Services APIs cannot yet be used with a raw audio stream.
If you have ideas, don't hesitate!
OpenAI's GPT-4o model supports text, image, and audio as both input and output. Its understanding is finer than with the usual STT > model > TTS approach because the model has direct access to the user's behavior, emotions, etc.
Is there a way to use Communication Services to receive the raw audio stream, bypassing the STT step?