Open steffh opened 3 years ago
Twilio also offers text-to-speech and speech-to-text, and it decides when and how to segment the voice input (that is, when a contact has finished a sentence or phrase). That may be a simpler integration to start with, but audio streaming would definitely be a cool integration as well.
Is your feature request related to a problem? Please describe.
As an Airy user I want to be able to connect voice sources to handle incoming calls from a Twilio phone number, SO THAT I can (i) transcribe them from speech to text in real time or (ii) gather user input from the keypad (pressing "1"), SO THAT I can integrate with Conversational AI to understand the meaning and/or fetch a template with static content for a potential reply, SO THAT I can automatically respond back to the caller via the Twilio Voice API generating voice from text.
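To make the flow above concrete, here is a minimal sketch of the Twilio side only, assuming the simpler option where Twilio's own `<Gather>` verb collects either a keypad digit or a spoken phrase. The function names and the `action_url` value are hypothetical, and nothing here reflects an existing Airy API. The TwiML is built with the Python standard library, so no Twilio SDK is required.

```python
import xml.etree.ElementTree as ET


def build_gather_twiml(action_url: str, prompt: str) -> str:
    """Return TwiML that reads a prompt and waits for one digit or speech."""
    response = ET.Element("Response")
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "input": "dtmf speech",  # accept keypad presses and spoken input
            "numDigits": "1",        # a single key press such as "1"
            "action": action_url,    # hypothetical webhook that receives the result
            "method": "POST",
        },
    )
    # <Say> is Twilio's text-to-speech verb; nested in <Gather> it is the prompt.
    ET.SubElement(gather, "Say").text = prompt
    # Reached only if the caller gives no input before the timeout.
    ET.SubElement(response, "Say").text = "We did not receive any input. Goodbye."
    return ET.tostring(response, encoding="unicode")


def extract_caller_input(form: dict) -> tuple:
    """Map the parameters Twilio posts to the action URL to (kind, value)."""
    if form.get("Digits"):
        return ("dtmf", form["Digits"])
    if form.get("SpeechResult"):
        return ("speech", form["SpeechResult"])
    return ("none", "")
```

The `(kind, value)` pair could then be handed to a Conversational AI component, and the reply text returned to the caller in another `<Say>` response.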
Describe the solution you'd like
Support to connect phone numbers as channels
Support for real-time transcription, e.g. via Google Speech-to-Text (https://www.twilio.com/blog/live-transcribing-phone-calls-using-twilio-media-streams-and-google-speech-text)
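For the real-time transcription item, a rough sketch of the Twilio Media Streams side might look like the following. It assumes the call is forked to a WebSocket endpoint with `<Start><Stream>`, and that each `media` message carries base64-encoded audio in `media.payload`. The function names and the `wss://` URL are hypothetical, and forwarding the audio to a speech-to-text service is deliberately left out.

```python
import base64
import json
import xml.etree.ElementTree as ET
from typing import Optional


def build_stream_twiml(websocket_url: str, greeting: str) -> str:
    """Return TwiML that forks call audio to a WebSocket, then greets the caller."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    # websocket_url is a hypothetical endpoint that would receive the audio.
    ET.SubElement(start, "Stream", {"url": websocket_url})
    ET.SubElement(response, "Say").text = greeting
    return ET.tostring(response, encoding="unicode")


def audio_from_stream_message(raw: str) -> Optional[bytes]:
    """Return the decoded audio bytes of a 'media' message, else None."""
    message = json.loads(raw)
    if message.get("event") != "media":
        # 'connected', 'start' and 'stop' messages carry no audio.
        return None
    return base64.b64decode(message["media"]["payload"])
```

The returned bytes would be passed on to whichever speech-to-text backend is chosen; that part depends on the provider and is not shown here.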
Describe alternatives you've considered
There are different providers of phone number management and voice management APIs, and various speech-to-text solutions are available, including open source tools (https://github.com/mozilla/DeepSpeech).
Additional context