Pedal-Intelligence / saypi-userscript

An independent voice interface for Inflection AI's conversational assistant, Pi
https://www.saypi.ai/

Implement Detection and Termination of Unintended Ambient Transcription #80

Closed: rosscado closed this issue 2 months ago

rosscado commented 3 months ago

In the current implementation of Say, Pi, if a user leaves a conversation open without explicitly ending it, the extension continues to transcribe any ambient audio indefinitely. This can lead to unintended transcription of background noise or irrelevant audio, such as a TV playing while the user is asleep, which in turn causes unnecessary API usage and increased costs.

To address this issue, we need a mechanism to detect when a conversation is likely transcribing unintended ambient audio for an extended period and automatically terminate it to prevent further transcription.

Proposed Solution:

  1. Set a fixed "Ambient Transcription Timeout" duration (e.g., 1 hour). Once a conversation has been transcribing continuously for that long, treat it as potentially transcribing unintended ambient audio and flag it for automatic termination.

  2. Implement a timer in the Say, Pi extension to track the duration of continuous transcription:

    • Start the timer when the first transcription is sent to Pi in a new conversation.
    • Keep the timer running as long as transcriptions are being continuously sent.
    • If there is a significant pause in transcription (e.g., no transcriptions for 5 minutes), treat the conversation as "paused" and stop the timer.
    • If transcription resumes after a pause, restart the timer from zero.
  3. If the continuous transcription timer reaches the fixed "Ambient Transcription Timeout" duration:

    • Display a humorous prompt to the user, such as:
      Prompt: "Wow, this is a long conversation. Are you still there, or are we talking to ourselves?"
      <timer ticking down to zero>
      Button: "Yes, I'm still here!"
    • Start a countdown (e.g., 30 seconds) for the user to respond to the prompt.
    • If the user clicks the "Yes, I'm still here!" button within the given time frame, consider the conversation as active and continue transcription.
    • If the user doesn't interact with the prompt before the timer reaches zero, proceed with terminating the conversation.
  4. Terminating a conversation flagged for potential unintended ambient transcription:

    • Stop the speech recognition and transcription processes.
    • Send a closing message to Pi, indicating that the conversation has been terminated due to suspected unintended ambient transcription.
    • Close the active conversation in the Say, Pi interface.
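
The steps above could be sketched roughly as follows. This is a minimal illustration, not the code that shipped in the extension: every name here (`ContinuousTranscriptionTimer`, `PresencePrompt`, `TerminationHooks`, `terminateConversation`) is hypothetical, the durations are the example values from the proposal, and timestamps are passed in explicitly so the logic can be exercised without real timers or a UI. Resetting the timer after the user confirms is also an assumption; the proposal only says transcription continues.

```typescript
// All names below are hypothetical; none are taken from the saypi-userscript codebase.
const AMBIENT_TIMEOUT_MS = 60 * 60 * 1000; // example value from the proposal: 1 hour
const PAUSE_THRESHOLD_MS = 5 * 60 * 1000; // example value: 5 minutes without transcriptions
const PROMPT_WINDOW_MS = 30 * 1000; // example value: 30 seconds to respond

// Step 2: tracks how long transcriptions have been sent without a significant pause.
class ContinuousTranscriptionTimer {
  private startedAt: number | null = null;
  private lastSentAt: number | null = null;
  private readonly timeoutMs: number;
  private readonly pauseMs: number;

  constructor(timeoutMs: number = AMBIENT_TIMEOUT_MS, pauseMs: number = PAUSE_THRESHOLD_MS) {
    this.timeoutMs = timeoutMs;
    this.pauseMs = pauseMs;
  }

  // Call whenever a transcription is sent to Pi. Returns true once the timeout is reached.
  recordTranscription(nowMs: number): boolean {
    const resumedAfterPause =
      this.lastSentAt !== null && nowMs - this.lastSentAt >= this.pauseMs;
    let start = this.startedAt;
    if (start === null || resumedAfterPause) {
      start = nowMs; // first transcription, or restart from zero after a pause
    }
    this.startedAt = start;
    this.lastSentAt = nowMs;
    return nowMs - start >= this.timeoutMs;
  }

  // Assumption: called when the user confirms presence or the conversation ends.
  reset(): void {
    this.startedAt = null;
    this.lastSentAt = null;
  }
}

// Step 3: the "are you still there?" prompt, modelled without any UI.
type PromptOutcome = "pending" | "confirmed" | "expired";

class PresencePrompt {
  private outcome: PromptOutcome = "pending";
  private readonly deadlineMs: number;

  constructor(shownAtMs: number, windowMs: number = PROMPT_WINDOW_MS) {
    this.deadlineMs = shownAtMs + windowMs;
  }

  // Whole seconds left, for rendering the countdown.
  secondsRemaining(nowMs: number): number {
    return Math.max(0, Math.ceil((this.deadlineMs - nowMs) / 1000));
  }

  // The user clicked "Yes, I'm still here!"; ignored once the prompt has expired.
  confirm(nowMs: number): PromptOutcome {
    if (this.outcome === "pending" && nowMs < this.deadlineMs) {
      this.outcome = "confirmed";
    }
    return this.outcome;
  }

  // Call periodically (for example once per second) to detect expiry.
  tick(nowMs: number): PromptOutcome {
    if (this.outcome === "pending" && nowMs >= this.deadlineMs) {
      this.outcome = "expired";
    }
    return this.outcome;
  }
}

// Step 4: termination, expressed as three injected callbacks (hypothetical hooks).
interface TerminationHooks {
  stopRecognition(): void;
  sendClosingMessage(text: string): void;
  closeConversation(): void;
}

function terminateConversation(hooks: TerminationHooks): void {
  hooks.stopRecognition();
  hooks.sendClosingMessage("Conversation ended: suspected unintended ambient transcription.");
  hooks.closeConversation();
}
```

In this sketch the caller would create a `PresencePrompt` when `recordTranscription` returns `true`, call `tick` while rendering the countdown, and invoke `terminateConversation` only if the outcome becomes `"expired"`. Keeping the clock outside the classes is a design choice made for testability; a real implementation might instead wrap `setTimeout` or `setInterval` directly.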

Benefits:

rosscado commented 2 months ago

Closed in version 1.5.13.

https://github.com/Pedal-Intelligence/saypi-userscript/assets/16578183/0fb6e93d-6b36-474b-9af0-36e843b9d287