Hi,
I'm playing with the application and trying to create a real-time translation agent.
By real-time I mean truly real-time.
When someone starts talking, I need to get the translation of what they are saying immediately, without waiting for the end of the input stream (for both Push To Talk and VAD).
I already have the translation prompt working.
From your knowledge, do you think this could be achieved with the OpenAI Realtime API?
Thanks.
PS: I know that starting the translation immediately could be very bad for its quality, but imagine a real conversation between people: waiting for the other person to finish talking, and the resulting ping-pong of waits, is terrible in terms of UX.