Closed: AdamSobieski closed this issue 5 years ago.
I am trying to understand what new functionality is required. For example, in WebRTC 1.0 it is already possible to send an audio stream to a remote peer and to receive a translated audio stream and/or a transcription over the WebRTC data channel.
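For illustration, here is a minimal sketch of that WebRTC 1.0 flow. Signaling (the offer/answer exchange) is omitted, and the remote peer is assumed to be a translation service that returns translated audio as a media track and the transcription over a data channel.

```typescript
// Minimal sketch of the WebRTC 1.0 flow described above.
const pc = new RTCPeerConnection();

// Send the local microphone audio to the remote peer.
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
for (const track of stream.getTracks()) {
  pc.addTrack(track, stream);
}

// Play the translated audio stream received from the remote peer.
pc.ontrack = (event) => {
  const audio = new Audio();
  audio.srcObject = event.streams[0];
  void audio.play();
};

// Receive the transcription over a WebRTC data channel.
const channel = pc.createDataChannel("transcription");
channel.onmessage = (event) => {
  console.log("Transcription:", event.data);
};
```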
With the merge of PR https://github.com/w3c/webrtc-nv-use-cases/pull/47, closing this issue.
Introduction
Real-time translation is both an interesting and important use case for a next version of WebRTC.
Speech Recognition, Translation and Speech Synthesis
Approaches to real-time speech-to-speech machine translation include those which interconnect speech recognition, translation, and speech synthesis components and services. These components and services may be client-side, on-premises, server-side, third-party, or cloud-based, and may be free or priced.
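As a rough sketch of such a pipeline, the three stages could be interconnected as below. The stage functions here (recognizeSpeech, translateText, synthesizeSpeech) are hypothetical placeholders, not existing APIs; each could be backed by any of the deployment options above.

```typescript
// Hypothetical stage functions; each could be backed by a client-side,
// on-premises, server-side, third-party, or cloud-based implementation.
declare function recognizeSpeech(audio: Blob, lang: string): Promise<string>;
declare function translateText(text: string, from: string, to: string): Promise<string>;
declare function synthesizeSpeech(text: string, lang: string): Promise<Blob>;

// Interconnect the three stages: speech recognition, translation,
// and speech synthesis.
async function translateUtterance(
  audio: Blob,
  sourceLang: string,
  targetLang: string
): Promise<Blob> {
  const text = await recognizeSpeech(audio, sourceLang);
  const translated = await translateText(text, sourceLang, targetLang);
  return synthesizeSpeech(translated, targetLang);
}
```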
We can also envision post-text speech technology and machine translation components and services. Speech recognition need not output plain text; we can consider speech-to-SSML. Machine translation need not take text as input nor produce text as output; we can consider SSML-to-SSML machine translation. Components and services may offer various options with respect to their input and output data formats.
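One way to make the post-text idea concrete is at the type level. In this hypothetical sketch (none of these interfaces come from an existing spec), the recognizer emits SSML rather than plain text and the translator maps SSML to SSML, so markup such as prosody, emphasis, and break elements can survive the pipeline end to end.

```typescript
// Hypothetical interfaces for a post-text pipeline. SSML documents are
// represented here as strings for simplicity.
interface SsmlRecognizer {
  // Speech-to-SSML: returns an SSML document rather than plain text.
  recognize(audio: Blob, lang: string): Promise<string>;
}

interface SsmlTranslator {
  // SSML-to-SSML machine translation: neither input nor output is plain text.
  translate(ssml: string, from: string, to: string): Promise<string>;
}

interface SsmlSynthesizer {
  // SSML-to-speech synthesis.
  synthesize(ssml: string): Promise<Blob>;
}
```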
Connecting Components and Services by Constructing Graphs
We can consider APIs which facilitate the construction of graphs representing the flow of data between components and services. As these graphs are constructed, users could be presented with relevant notifications, requests for permissions, and options for payment. As the constructed graphs are activated, a number of protocols could be used to interconnect the components and services which, together, provide users with real-time translation.
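A hypothetical graph-construction API might look like the sketch below; all names here are illustrative, not from any existing proposal. Nodes represent components or services, edges represent the flow of data, and activation is where the user agent could surface permission requests or payment options before interconnecting the services.

```typescript
// Illustrative only: a builder for graphs whose nodes are translation
// components or services and whose edges describe the flow of data.
interface GraphNode {
  id: string;
}

class TranslationGraph {
  private edges: Array<[GraphNode, GraphNode]> = [];

  addNode(id: string): GraphNode {
    return { id };
  }

  connect(from: GraphNode, to: GraphNode): void {
    // Constructing an edge could notify the user, request permissions,
    // or present payment options for priced services.
    this.edges.push([from, to]);
  }

  async activate(): Promise<void> {
    // Activation is where the underlying protocols would be negotiated
    // to interconnect the components and services.
  }
}

// Wire microphone -> recognizer -> translator -> synthesizer -> remote peer.
const graph = new TranslationGraph();
const mic = graph.addNode("microphone");
const asr = graph.addNode("speech-recognizer");
const mt = graph.addNode("translator");
const tts = graph.addNode("speech-synthesizer");
const peer = graph.addNode("remote-webrtc-peer");
graph.connect(mic, asr);
graph.connect(asr, mt);
graph.connect(mt, tts);
graph.connect(tts, peer);
await graph.activate();
```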
Hyperlinks
WebRTC Translator Demo
Real Time Translation in WebRTC