Closed: AdamSobieski closed this issue 5 years ago.
I am trying to understand what new functionality is required. For example, in WebRTC 1.0 it is already possible to send an audio stream to a remote peer and to receive a translated audio stream and/or a transcription over the WebRTC data channel.
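For illustration, here is a minimal sketch of that WebRTC 1.0 flow. Signaling (the offer/answer exchange) is omitted, and the remote peer is assumed to be a translation service that returns translated audio as a media track and the transcription over a data channel.

```typescript
// Minimal sketch of the WebRTC 1.0 flow described above.
const pc = new RTCPeerConnection();

// Send the local microphone audio to the remote peer.
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
for (const track of stream.getTracks()) {
  pc.addTrack(track, stream);
}

// Play the translated audio stream received from the remote peer.
pc.ontrack = (event) => {
  const audio = new Audio();
  audio.srcObject = event.streams[0];
  void audio.play();
};

// Receive the transcription over a WebRTC data channel.
const channel = pc.createDataChannel("transcription");
channel.onmessage = (event) => {
  console.log("Transcription:", event.data);
};
```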
With the merge of PR https://github.com/w3c/webrtc-nv-use-cases/pull/47, closing this issue.
Introduction
Real-time translation is both an interesting and important use case for a next version of WebRTC.
Speech Recognition, Translation and Speech Synthesis
Approaches to real-time speech-to-speech machine translation include those which interconnect speech recognition, translation, and speech synthesis components and services. These components and services may be client-side, on-premises, server-side, third-party, or cloud-based, and may be free or priced.
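As a rough sketch of such a pipeline, the three stages could be interconnected as below. The stage functions here (recognizeSpeech, translateText, synthesizeSpeech) are hypothetical placeholders, not existing APIs; each could be backed by any of the deployment options above.

```typescript
// Hypothetical stage functions; each could be backed by a client-side,
// on-premises, server-side, third-party, or cloud-based implementation.
declare function recognizeSpeech(audio: Blob, lang: string): Promise<string>;
declare function translateText(text: string, from: string, to: string): Promise<string>;
declare function synthesizeSpeech(text: string, lang: string): Promise<Blob>;

// Interconnect the three stages: speech recognition, translation,
// and speech synthesis.
async function translateUtterance(
  audio: Blob,
  sourceLang: string,
  targetLang: string
): Promise<Blob> {
  const text = await recognizeSpeech(audio, sourceLang);
  const translated = await translateText(text, sourceLang, targetLang);
  return synthesizeSpeech(translated, targetLang);
}
```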
We can also envision post-text speech technology and machine translation components and services. Speech recognition need not output plain text; we can consider speech-to-SSML. Machine translation need not take text as input nor produce text as output; we can consider SSML-to-SSML machine translation. Components and services may offer various options with respect to their input and output data formats.
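One way to make the post-text idea concrete is at the type level. In this hypothetical sketch (none of these interfaces come from an existing spec), the recognizer emits SSML rather than plain text and the translator maps SSML to SSML, so markup such as prosody, emphasis, and break elements can survive the pipeline end to end.

```typescript
// Hypothetical interfaces for a post-text pipeline. SSML documents are
// represented here as strings for simplicity.
interface SsmlRecognizer {
  // Speech-to-SSML: returns an SSML document rather than plain text.
  recognize(audio: Blob, lang: string): Promise<string>;
}

interface SsmlTranslator {
  // SSML-to-SSML machine translation: neither input nor output is plain text.
  translate(ssml: string, from: string, to: string): Promise<string>;
}

interface SsmlSynthesizer {
  // SSML-to-speech synthesis.
  synthesize(ssml: string): Promise<Blob>;
}
```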
Connecting Components and Services by Constructing Graphs
We can consider APIs which facilitate the construction of graphs representing the flow of data between components and services. As these graphs are constructed, users could be presented with relevant notifications, requests for permissions, and options for payment. As the constructed graphs are activated, a number of protocols could be used to interconnect the components and services which, together, provide users with real-time translation.
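A hypothetical graph-construction API might look like the sketch below; all names here are illustrative, not from any existing proposal. Nodes represent components or services, edges represent the flow of data, and activation is where the user agent could surface permission requests or payment options before interconnecting the services.

```typescript
// Illustrative only: a builder for graphs whose nodes are translation
// components or services and whose edges describe the flow of data.
interface GraphNode {
  id: string;
}

class TranslationGraph {
  private edges: Array<[GraphNode, GraphNode]> = [];

  addNode(id: string): GraphNode {
    return { id };
  }

  connect(from: GraphNode, to: GraphNode): void {
    // Constructing an edge could notify the user, request permissions,
    // or present payment options for priced services.
    this.edges.push([from, to]);
  }

  async activate(): Promise<void> {
    // Activation is where the underlying protocols would be negotiated
    // to interconnect the components and services.
  }
}

// Wire microphone -> recognizer -> translator -> synthesizer -> remote peer.
const graph = new TranslationGraph();
const mic = graph.addNode("microphone");
const asr = graph.addNode("speech-recognizer");
const mt = graph.addNode("translator");
const tts = graph.addNode("speech-synthesizer");
const peer = graph.addNode("remote-webrtc-peer");
graph.connect(mic, asr);
graph.connect(asr, mt);
graph.connect(mt, tts);
graph.connect(tts, peer);
await graph.activate();
```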
Hyperlinks
WebRTC Translator Demo
Real Time Translation in WebRTC