hahahumble / speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.
https://speechgpt.app
MIT License
2.74k stars, 404 forks

Feature Request: A few suggestions to enhance speechgpt user experience #40

Closed: erbanku closed this issue 1 year ago

erbanku commented 1 year ago

Is your feature request related to a problem?

My feature request is related to several problems I am experiencing with the current version of SpeechGPT. I am frustrated when:

  1. The keyboard remains visible even after completing my input, which takes up unnecessary screen space and makes it harder to read the chat.
  2. The keyboard still shows up while I interact with the assistant using speech recognition, which is unnecessary in that scenario and can be distracting.
  3. Many average users find it confusing to set the speech recognition/synthesis language and language ID. I would prefer an easier way to configure these through environment variables, so that average users can simply rely on the default configuration.
  4. When the assistant generates a lengthy response, I have to wait for the full answer to be generated before I can listen to or read it. Streaming output for both text and TTS would make this process smoother and more enjoyable.
  5. I often want to replay the assistant's response or my own input via TTS but currently cannot do so, which is inconvenient when I need to review previous interactions.
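To make point 3 concrete, here is a minimal sketch of how default languages might be read from environment variables. The variable names `SPEECH_RECOGNITION_LANGUAGE` and `SPEECH_SYNTHESIS_LANGUAGE`, the `en-US` fallback, and the `resolveSpeechDefaults` helper are all hypothetical and are not part of SpeechGPT. The sketch also does not check whether a given speech service actually supports the chosen language.

```typescript
// Hypothetical shape for the default speech settings; not SpeechGPT's real config.
interface SpeechDefaults {
  recognitionLanguage: string;
  synthesisLanguage: string;
}

// Resolve defaults from an environment-like object.
// Taking the object as a parameter keeps the function independent of any
// particular runtime or build tool.
function resolveSpeechDefaults(
  env: Record<string, string | undefined>
): SpeechDefaults {
  const fallback = "en-US"; // assumed fallback, chosen only for illustration

  // Blank or missing values fall through to the fallback.
  const recognitionLanguage =
    env["SPEECH_RECOGNITION_LANGUAGE"]?.trim() || fallback;

  // If no synthesis language is given, reuse the recognition language.
  const synthesisLanguage =
    env["SPEECH_SYNTHESIS_LANGUAGE"]?.trim() || recognitionLanguage;

  return { recognitionLanguage, synthesisLanguage };
}
```

With this approach, a deployment that sets only the recognition language would get the same language for synthesis, and a deployment that sets nothing would still get a working default.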

Describe the solution you'd like

Additional context

No response

hahahumble commented 1 year ago
  1. This is a great suggestion.
  2. My plan is to add an option that allows users to choose whether to display the keyboard during speech recognition, as speech recognition may produce errors, and displaying the keyboard would allow users to quickly correct mistakes.
  3. Different services have different supported languages and voices, so using environment variables for configuration might be complicated.
  4. Currently, I have not found any TTS API that supports streaming. A possible solution is to split the assistant's responses into multiple sentences and send multiple requests.
  5. This feature will be supported in future updates.

Thank you very much for your suggestions.
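The sentence-splitting workaround from point 4 could be sketched roughly as follows. This is an illustration under stated assumptions, not SpeechGPT's actual implementation: `splitIntoSentences` and `speakBySentence` are hypothetical helpers, and the `synthesize` and `play` callbacks stand in for whatever TTS request and audio playback functions the application uses. The splitter is deliberately naive and will also break on decimal points and abbreviations.

```typescript
// Split text into sentence-sized chunks so each can be sent as its own TTS request.
// Naive rule: a sentence ends at one or more of . ! ? (or their CJK equivalents).
function splitIntoSentences(text: string): string[] {
  const matches = text.match(/[^.!?。！？]+[.!?。！？]+|[^.!?。！？]+$/g);
  return (matches ?? []).map((s) => s.trim()).filter((s) => s.length > 0);
}

// Speak a response one sentence at a time.
// While one sentence is playing, the request for the next one is already in
// flight, which approximates streaming with a non-streaming TTS API.
// Returns the number of sentences spoken.
async function speakBySentence<Audio>(
  text: string,
  synthesize: (sentence: string) => Promise<Audio>, // hypothetical TTS request
  play: (audio: Audio) => Promise<void> // hypothetical playback, resolves when done
): Promise<number> {
  const sentences = splitIntoSentences(text);
  if (sentences.length === 0) return 0;

  let pending = synthesize(sentences[0]);
  for (let i = 0; i < sentences.length; i++) {
    const audio = await pending;
    // Start fetching the next sentence before playing the current one.
    if (i + 1 < sentences.length) {
      pending = synthesize(sentences[i + 1]);
    }
    await play(audio);
  }
  return sentences.length;
}
```

The trade-off is more requests per response in exchange for a shorter wait before the first audio starts. Playback stays in order because each sentence is awaited before the next one is played.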

Misaka-9982-coder commented 1 year ago

Perhaps these two bots can bring some inspiration. Samantha: https://t.me/samantha_x64_bot Sherlock: https://t.me/sherlock_myshell_ai_bot

hahahumble commented 1 year ago

Suggestions 1, 2, and 5 have been resolved