airyhq / airy

💬 Open Source App Framework to build streaming apps with real-time data - 💎 Build real-time data pipelines and make real-time data universally accessible - 🤖 Join historical and real-time data in the stream to create smarter ML and AI applications. - ⚡ Standardize complex data ingestion and stream data to apps with pre-built connectors
https://airy.co/docs/core
Apache License 2.0

Twilio Voice Source & Realtime Speech-to-Text Transcription #2281

Open steffh opened 3 years ago

steffh commented 3 years ago

Is your feature request related to a problem? Please describe.

As an Airy user, I want to connect voice sources to handle incoming calls to a Twilio phone number, so that I can (i) transcribe them from speech to text in real time or (ii) gather user input from the keypad (for example, the caller pressing "1"). That input can then be passed to Conversational AI to understand its meaning and/or fetch a template with static content for a potential reply, so that I can automatically respond to the caller via the Twilio Voice API, generating voice from text.

Describe the solution you'd like

Describe alternatives you've considered

Different providers offer phone number management and voice management APIs, and several Speech-to-Text solutions are available, including open source tools such as DeepSpeech (https://github.com/mozilla/DeepSpeech).

Additional context

ghost commented 3 years ago

Twilio also does text-to-speech and speech-to-text itself, and it decides when and how to segment the voice input (that is, when a contact has finished a sentence or phrase). That may be a simpler integration to start with, but the audio streaming would definitely be a cool integration as well.
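
To make the simpler integration concrete, here is a minimal sketch of what it might look like. It assumes Twilio's TwiML `<Gather>` verb with `input="dtmf speech"`, which lets Twilio collect either a keypad digit or a transcribed phrase and post it to an `action` URL as the `Digits` or `SpeechResult` parameter, and the `<Say>` verb for text-to-speech replies. The function names and the example URL are hypothetical and not part of Airy; the sketch only builds and parses the payloads and does not include a web server or an Airy source connector.

```python
import xml.etree.ElementTree as ET


def build_gather_twiml(prompt: str, action_url: str) -> str:
    """Build TwiML that speaks a prompt and collects speech or one keypad digit.

    Hypothetical helper: action_url is whatever endpoint should receive the input.
    """
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "input": "dtmf speech",  # accept keypad presses or spoken input
            "numDigits": "1",        # stop after a single digit, e.g. "1"
            "action": action_url,    # Twilio posts the result to this URL
            "method": "POST",
        },
    )
    ET.SubElement(gather, "Say").text = prompt
    # Spoken only if the caller gives no input and <Gather> times out.
    ET.SubElement(response, "Say").text = "We did not receive any input. Goodbye."
    return ET.tostring(response, encoding="unicode")


def extract_caller_input(form: dict) -> dict:
    """Normalize the parameters posted to the action URL into one shape."""
    if form.get("Digits"):
        return {"type": "keypad", "value": form["Digits"]}
    if form.get("SpeechResult"):
        return {"type": "speech", "value": form["SpeechResult"]}
    return {"type": "none", "value": ""}


def build_reply_twiml(reply_text: str) -> str:
    """Build TwiML that reads a text reply back to the caller."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = reply_text
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    print(build_gather_twiml("Press 1 or tell us how we can help.",
                             "https://example.com/voice/input"))
    print(extract_caller_input({"SpeechResult": "I need help with my order"}))
    print(build_reply_twiml("Thanks, an agent will call you back."))
```

In this shape, the normalized input could be handed to Conversational AI or a template lookup, and the resulting text returned through `build_reply_twiml`. Real-time audio streaming would need a different mechanism and is not covered by this sketch.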