eidolon-ai / eidolon

The first AI Agent Server, Eidolon is a pluggable Agent SDK and enterprise-ready deployment server for agentic applications.
https://www.eidolonai.com/
Apache License 2.0
284 stars, 30 forks

[ problem ] No easy way to enable microphone for speech-to-text #793

Open · opened by flynntsang 1 month ago

flynntsang commented 1 month ago

Describe the problem: I can't find any way in Chrome to enable the microphone for speech input.

To Reproduce (steps to reproduce the behavior):

  1. Use an Eidolon webui app with "allowSpeech": true set in webui.apps.json (e.g. the chatbot sample).

  2. Click the purple microphone icon.

  3. Read the popup error: "Please allow access to the microphone in your browser." (Screenshot: Screen Shot 2024-09-23 at 1 25 33 PM)

  4. Go to chrome://settings/content/microphone

  5. See that there is no way for users to enter which sites can have access to the microphone. (Screenshot: Screen Shot 2024-09-23 at 1 25 01 PM)

Expected behavior: I expected that clicking the microphone would initiate a request to enable the microphone for the Eidolon webui app.

Environment: Chrome Version 129.0.6668.58 (Official Build) (x86_64), macOS
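For anyone debugging this, here is a minimal TypeScript sketch of how a mic button could trigger the browser's permission prompt and report why it failed. It is not the Eidolon webui's actual code: `MicEnvironment`, `preflight`, `explainMicError`, and `requestMicrophone` are hypothetical names. It relies on two standard browser behaviors: `navigator.mediaDevices.getUserMedia` is what triggers the prompt, and it is only available in secure contexts (HTTPS or `http://localhost`). Whether an insecure origin is the cause of this report has not been confirmed in this thread.

```typescript
// Minimal structural types so the sketch compiles without DOM typings.
// In a browser, pass window.isSecureContext and navigator.mediaDevices.
interface MicEnvironment {
  isSecureContext: boolean;
  mediaDevices?: {
    getUserMedia(constraints: { audio: boolean }): Promise<unknown>;
  };
}

type MicResult =
  | { ok: true; stream: unknown }
  | { ok: false; reason: string };

// Map the standard error names getUserMedia rejects with to user-facing hints.
function explainMicError(name: string): string {
  switch (name) {
    case "NotAllowedError":
      return "Microphone access was blocked. Allow it from the site-info icon in the address bar.";
    case "NotFoundError":
      return "No microphone was found on this device.";
    case "NotReadableError":
      return "The microphone is unavailable or in use by another application.";
    default:
      return "Could not start the microphone (" + name + ").";
  }
}

// Returns a reason string when the browser can never show a prompt, else null.
function preflight(env: MicEnvironment): string | null {
  if (!env.isSecureContext || !env.mediaDevices) {
    return "Microphone capture requires HTTPS or http://localhost.";
  }
  return null;
}

// Call this from the microphone button's click handler.
async function requestMicrophone(env: MicEnvironment): Promise<MicResult> {
  const blocked = preflight(env);
  if (blocked !== null) {
    return { ok: false, reason: blocked };
  }
  try {
    // This call is what makes the browser show its permission prompt.
    const stream = await env.mediaDevices!.getUserMedia({ audio: true });
    return { ok: true, stream };
  } catch (err) {
    const name = (err as { name?: string } | null)?.name ?? "UnknownError";
    return { ok: false, reason: explainMicError(name) };
  }
}
```

In a browser this would be called as `requestMicrophone({ isSecureContext: window.isSecureContext, mediaDevices: navigator.mediaDevices })`. Taking the environment as a parameter keeps the secure-context check testable outside a browser.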

flynntsang commented 1 month ago

According to my system settings, Chrome does have permission to use my microphone.

(Image: screenshot of the system microphone permission settings)

LukeLalor commented 1 month ago

Notes for implementation

The APU can now handle voice files directly, so we no longer need a separate agent for the speech-to-text translation. We should likely push the audio directly through the APU and add a flag to the simple agent to enable this. Speech input then becomes an additional "action" on the agent that the chatbot UI is aware of.
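One way the UI side of this could look, as a rough TypeScript sketch only. The `AgentAction` shape and the `acceptsAudio` field are hypothetical and are not Eidolon's actual schema. The idea is that the chatbot UI shows the microphone only when the agent advertises an action that takes audio.

```typescript
// Hypothetical shape of an action an agent advertises to the UI.
// Not Eidolon's real schema; the field names are assumptions for illustration.
interface AgentAction {
  name: string;
  acceptsAudio?: boolean;
}

// Returns the first action that takes audio input, or undefined when the
// agent does not advertise one (for example, when the flag is off).
function findSpeechAction(actions: AgentAction[]): AgentAction | undefined {
  for (const action of actions) {
    if (action.acceptsAudio === true) {
      return action;
    }
  }
  return undefined;
}

// The chatbot UI would render the microphone button only when this is true.
function shouldShowMicrophone(actions: AgentAction[]): boolean {
  return findSpeechAction(actions) !== undefined;
}
```

With this shape the UI needs no per-app "allowSpeech" setting of its own, since availability follows from what the agent exposes. Whether that is the intended design is for the maintainers to decide.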