ITSpecialist111 / ai_automation_suggester

This custom Home Assistant integration periodically scans entities, detects new devices, and uses AI (via OpenAI's API) to suggest automations. It provides a user-friendly interface for accepting or rejecting automations, with placeholder entity mapping for privacy. The goal is to use AI to surface new automations for the smart home.
MIT License

[Feature Request] - TTS output of suggestions .wav/other #5

Closed by BestMasterChief 1 day ago

BestMasterChief commented 1 day ago

Currently, the output is limited to text only, and due to character constraints in the notifications section, we need to keep it concise. To overcome this limitation and make the suggestions more conversational, we should consider implementing a .wav or TTS (Text-to-Speech) output. Note: For TTS output, ensure that entity names are replaced with "real names" to prevent mispronunciation or garbled speech.
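To illustrate the "real names" note, here is a minimal sketch of swapping entity IDs for speakable names before text is handed to a TTS engine. It is not the integration's code: the `speakable` function, the `friendly_names` mapping, and the regular expression for entity IDs are all assumptions made for the example. In practice the mapping might be built from each entity's `friendly_name` attribute.

```python
import re

# Assumed shape of an entity ID: a lowercase domain, a dot, then an object ID.
ENTITY_ID = re.compile(r"\b[a-z_]+\.[a-z0-9_]+\b")


def speakable(text, friendly_names):
    """Replace known entity IDs in `text` with their friendly names.

    `friendly_names` is a hypothetical dict of entity ID -> spoken name.
    Anything that is not in the mapping is left untouched.
    """
    def swap(match):
        entity_id = match.group(0)
        return friendly_names.get(entity_id, entity_id)

    return ENTITY_ID.sub(swap, text)


names = {
    "light.living_room": "Living Room Light",
    "binary_sensor.hall_motion": "Hall Motion Sensor",
}
spoken = speakable(
    "Turn on light.living_room when binary_sensor.hall_motion is on.",
    names,
)
print(spoken)
```

Unknown IDs pass through unchanged, so a missing mapping degrades to the current behaviour rather than dropping text.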

ITSpecialist111 commented 1 day ago

Hi @BestMasterChief,

Thank you for your suggestion regarding TTS or .wav output for the automation suggestions. While I can see the potential benefits of making the suggestions more conversational, I believe there are a few important factors we need to carefully consider before moving forward with such a feature:

  1. Technical Complexity: Implementing Text-to-Speech (TTS) or .wav output introduces significant complexity. This involves not just converting the text, but also managing the integration with Home Assistant’s built-in TTS services, supporting various languages and accents, and handling the audio files or streams. Additionally, mapping entity names to "real names" will require further customisation to ensure proper pronunciation, which could be error-prone or require additional user input to manually configure these names for each entity.

  2. Notification Mechanism: The current persistent notification system in Home Assistant is designed for text-based feedback. Moving to an audio format like .wav or TTS would mean overhauling how these notifications are presented and received. This could limit the ease of accessing and reviewing suggestions compared to the current system, where users can read and quickly act on the text.

  3. User Preferences and Accessibility: While TTS may be helpful in some scenarios, many users may prefer to stick with text-based output, as it's easier to reference at any time. Audio notifications can also be intrusive, especially in shared spaces where not everyone may want to hear these suggestions out loud.

  4. Possible Alternatives: If the primary concern is the character limitation in notifications, we could consider breaking down longer suggestions into multiple notifications, or even sending suggestions through other channels such as emails or mobile push notifications, where there’s more space for detailed descriptions.
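As a rough illustration of the multiple-notification idea in point 4, the sketch below splits one long suggestion into numbered parts that each fit under a length limit. The `split_suggestion` function and the 200-character default are assumptions for the example; the thread does not state the actual limit. Each resulting title and message pair could then be sent as its own notification, for instance through Home Assistant's `persistent_notification.create` service.

```python
import textwrap


def split_suggestion(text, limit=200):
    """Split `text` into (title, message) pairs, each message <= `limit` chars.

    The limit is a placeholder value, not a documented Home Assistant constraint.
    textwrap.wrap breaks on whitespace, so words are kept whole where possible.
    """
    parts = textwrap.wrap(text, width=limit)
    total = len(parts)
    return [
        (f"Suggestion ({index}/{total})", part)
        for index, part in enumerate(parts, start=1)
    ]


long_text = "word " * 100
notifications = split_suggestion(long_text, limit=50)
for title, message in notifications:
    print(title, "-", message)
```

Numbering the titles keeps the parts in order when they are read back in the notifications panel.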

Instead of focusing on TTS or .wav, we could enhance the clarity and conciseness of the text-based suggestions to ensure they are both actionable and fit within the existing character constraints.