Open mariankh1 opened 1 month ago
Voice integration library options are being documented in this Notion document: https://www.notion.so/Text-to-Speech-TTS-Libraries-10f0a4197a26801b8178f91ac8613812.
The AI mail assistant should let users control it entirely by voice, without needing to touch their device. This is important because it makes the app more convenient, especially when multitasking or for people with physical limitations.
Option 1: Hands-free voice assistant similar to Alexa or Siri (Ideal and Recommended)
Wake Word Detection: This will detect the wake word "Maily" to start listening for commands. More details on available library options can be found here
Speech Recognition: After detecting the wake word, the app should recognize the user's command (e.g., "Fetch my latest email"). More details on available models and library options can be found here
Command Processing: The app processes the recognized command and fetches the requested data (e.g., reads unread emails and summarises their content). [We are already doing this by interacting with Mistral-Nemo-Instruct-2407 via the Hugging Face API] ✅
Conversational Response (TTS): The app uses Text-to-Speech to respond conversationally, converting the text response returned from the Hugging Face API into voice audio. More details on Text-to-Speech libraries can be found here
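The four stages above could be wired together roughly as follows. This is only a minimal sketch, not the app's actual implementation: it assumes the wake word is spotted in already-transcribed text (a real wake-word engine would work on raw audio), and `extract_command`, `VoicePipeline`, and its three hooks are hypothetical names standing in for whichever STT, Hugging Face, and TTS integrations are eventually chosen.

```python
from dataclasses import dataclass
from typing import Callable, Optional

WAKE_WORD = "maily"  # wake word named in this issue


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the text that follows the wake word, or None if it is absent."""
    words = transcript.strip().split()
    for i, word in enumerate(words):
        # Ignore case and trailing punctuation such as "Maily,"
        if word.strip(",.!?").lower() == wake_word:
            return " ".join(words[i + 1:])
    return None


@dataclass
class VoicePipeline:
    # Hypothetical stage hooks (assumptions, not real library APIs):
    transcribe: Callable[[bytes], str]      # speech recognition
    process_command: Callable[[str], str]   # e.g. the existing LLM call
    speak: Callable[[str], None]            # text-to-speech

    def handle_audio(self, audio: bytes) -> Optional[str]:
        """Run one utterance through all stages; return the spoken reply."""
        transcript = self.transcribe(audio)
        command = extract_command(transcript)
        if not command:
            return None  # no wake word, or nothing said after it
        reply = self.process_command(command)
        self.speak(reply)
        return reply
```

Keeping each stage behind a plain callable means the wake-word, STT, and TTS libraries can be swapped independently while the options in the Notion document are still being evaluated.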
Other options:
Option 2: Listening Continuously
The app continuously listens for the wake word ("Maily"). Once it detects the wake word, it processes the user's command and responds with a summary of unread emails using Text-to-Speech (TTS).
Option 3: Push-to-Talk
The user manually presses a button (on-screen or hardware) to activate the voice assistant and then issues commands. The app then processes the command and responds via TTS.
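The difference between Options 2 and 3 comes down to what moves the assistant from idle to listening. A small state machine can make that explicit. The sketch below is illustrative only: `Activation`, `State`, and `Session` are hypothetical names, and it assumes transcripts arrive one utterance at a time from some speech recognizer.

```python
from enum import Enum, auto
from typing import Optional


class Activation(Enum):
    WAKE_WORD = auto()     # Option 2: microphone always on, gated by "Maily"
    PUSH_TO_TALK = auto()  # Option 3: gated by a button press


class State(Enum):
    IDLE = auto()
    LISTENING = auto()


class Session:
    def __init__(self, mode: Activation, wake_word: str = "maily"):
        self.mode = mode
        self.wake_word = wake_word
        self.state = State.IDLE

    def on_button_press(self) -> None:
        # Only push-to-talk mode reacts to the button.
        if self.mode is Activation.PUSH_TO_TALK:
            self.state = State.LISTENING

    def on_transcript(self, text: str) -> Optional[str]:
        """Return a command to process, or None if nothing should happen."""
        if self.state is State.LISTENING:
            self.state = State.IDLE  # one command per activation
            return text.strip() or None
        tokens = [w.strip(",.!?") for w in text.lower().split()]
        if self.mode is Activation.WAKE_WORD and self.wake_word in tokens:
            self.state = State.LISTENING
        return None
```

In push-to-talk mode, speech heard while idle is simply dropped, so the microphone does not need to stay active between button presses.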
A potentially interesting library: https://github.com/gotev/android-speech
A user can communicate with the conversational AI mail assistant either via text or via voice. We need to