AndraxDev / speak-gpt

Your personal voice assistant based on OpenAI ChatGPT.
https://play.google.com/store/apps/details?id=org.teslasoft.assistant
Apache License 2.0
267 stars, 58 forks

Microphone input doesn't work #4

Closed. CaptSpify closed this issue 1 year ago

CaptSpify commented 1 year ago

I can type into the message box and it works just fine, but when I use the microphone, nothing happens. I've confirmed that the app has permissions to the microphone, and other apps are able to use the microphone.

Let me know if any more data is needed

Thanks

I can't believe that I forgot to put my phone info on here |:P

Android version: 13
Device: Pixel 4a

AndraxDev commented 1 year ago

Can you please send me your device info: Android version, ROM, device model, app version, and any related screenshots? I tested this function, and the microphone passed the test on multiple devices. Thanks

AndraxDev commented 1 year ago

Another common cause of this problem is that Speech Services and Google Play Services are missing. Please check that these apps are installed and that their permissions are granted.

Please note that custom ROMs (like LineageOS) may not have these apps installed.

If this does not fix your problem, send me the device info described in my previous comment. Thanks
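
One way to run that check from a computer is to list the installed packages with `adb shell pm list packages` and look for the two services. The sketch below only parses that output. The package IDs (`com.google.android.gms` for Google Play Services, `com.google.android.tts` for Speech Services by Google) are the commonly used ones but should be treated as assumptions, and `missing_packages` is a hypothetical helper, not part of SpeakGPT.

```python
# Assumed package IDs; verify them on your own device before relying on this.
REQUIRED = {
    "com.google.android.gms": "Google Play Services",
    "com.google.android.tts": "Speech Services by Google",
}


def missing_packages(pm_output, required=REQUIRED):
    """Return the display names of required packages absent from pm output.

    pm_output is the text printed by `adb shell pm list packages`,
    where every line has the form `package:<package id>`.
    """
    prefix = "package:"
    installed = {
        line.strip()[len(prefix):]
        for line in pm_output.splitlines()
        if line.strip().startswith(prefix)
    }
    return [name for pkg, name in required.items() if pkg not in installed]


if __name__ == "__main__":
    sample = "package:com.google.android.gms\npackage:org.teslasoft.assistant\n"
    print(missing_packages(sample))
```

An empty list means both packages are present; on a de-Googled ROM both names would normally be reported as missing.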

AndraxDev commented 1 year ago

Also, please let me know whether you compiled the app from source or installed it from the Store.

ubergeek77 commented 1 year ago

I have this issue too, but I am on a de-Googled ROM without Google Play Services (CalyxOS). It happens both with a build I compiled from source and with the Google Play build (sideloaded).

Interestingly, I have installed an alternative voice service and set it as my default text to speech app, but SpeakGPT still doesn't pick it up. I also tried installing Google Speech Services manually, but couldn't get it to work in SpeakGPT.

I know it's a less common use case, but I'd love to have full voice support in SpeakGPT without needing Google services. My phone is de-Googled, so I can't use Google Assistant either. If SpeakGPT could support voice input without Google services, it could work as a full Google Assistant replacement for people on de-Googled phones. That's pretty cool if you ask me.

I do find Whisper to be alarmingly fast and accurate, in some cases better than Google Speech Services. Maybe it could be added to SpeakGPT at some point as a roundabout solution to this Google issue? It's OpenAI too!

Here are some constructive resources and even one Android app that has implemented Whisper offline and on-device:

https://github.com/MichaelMcCulloch/WhisperVoiceKeyboard

https://github.com/ggerganov/whisper.cpp

https://github.com/openai/whisper

CaptSpify commented 1 year ago

Wow, what a great response time! Alas, I am running CalyxOS (4.7.5), and I don't have Speech Services or Play Services installed.

App version: 2.8, installed from the store

The apps that currently work are Dicio and Vosk Demo.

Am I correct in thinking that those apps might work because they build those services in themselves?

Thanks

AndraxDev commented 1 year ago

Great, now I have pinpointed that this issue occurs on de-Googled ROMs, so I will integrate Whisper soon.

Please note that Whisper is also a paid API, so please read about the prices here: https://openai.com/pricing

Also please note that Whisper requires an Internet connection, so bot responses may be delayed.

In future versions you will be able to switch between Google and Whisper.

Thanks
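
The planned switch could look roughly like the following sketch. It is not SpeakGPT's actual code: `DeviceState`, `choose_backend`, and the backend names are hypothetical, and the sketch only illustrates falling back to Whisper when Google speech recognition is unavailable.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical snapshot of what the device can currently use."""
    has_google_speech: bool  # Google speech recognition is installed
    has_network: bool        # the Whisper API needs an Internet connection
    has_openai_key: bool     # the Whisper API is paid and needs an API key


def choose_backend(state, preferred="google"):
    """Return "google", "whisper", or None if no backend is usable."""
    available = []
    if state.has_google_speech:
        available.append("google")
    if state.has_network and state.has_openai_key:
        available.append("whisper")
    # Honour the user's preference when possible, otherwise fall back.
    if preferred in available:
        return preferred
    return available[0] if available else None


if __name__ == "__main__":
    degoogled = DeviceState(has_google_speech=False, has_network=True, has_openai_key=True)
    print(choose_backend(degoogled))
```

On a de-Googled phone with a network connection and an API key, this selects Whisper even though Google is the default preference.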

ubergeek77 commented 1 year ago

Thanks! But I am a little confused: there is an offline model, is there not? I've used Whisper to transcribe meetings on my own device before, and it worked great with no API required. I would say it was pretty fast, probably fast enough for a voice assistant if one of the lighter models is used.

Perhaps the local version is not fast enough for this use case?

AndraxDev commented 1 year ago

Whisper is a voice recognition API from OpenAI. It requires an Internet connection. This is why Google Services weighs a lot :)

ubergeek77 commented 1 year ago

Whisper is the name of a paid API OpenAI offers, but it is also the name of the free, open source, and offline model pre-trained for fast local detection. The first GitHub repo I linked has a video of it working offline on-device, and OpenAI's own GitHub repo (openai/whisper) has free Python packages listed you can use to test its capability offline and for free. The whisper.cpp I linked is a reimplementation of the same model suitable for cross-platform offline speech to text.

I think offline detection would be most useful in SpeakGPT, especially since API costs will likely add up if we consider the API for ChatGPT is also paid.
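
As a rough illustration of how those costs could add up, here is a small estimate. The default rate of 0.006 USD per minute is an assumption based on the price listed for the Whisper API around the time of this thread, so check https://openai.com/pricing for the current figure. The function name and the usage numbers are made up for the example.

```python
def monthly_whisper_cost_usd(seconds_per_request, requests_per_day,
                             days=30, usd_per_minute=0.006):
    """Estimate the monthly Whisper API cost for short voice requests.

    usd_per_minute is an assumed rate; replace it with the current price.
    """
    total_minutes = seconds_per_request * requests_per_day * days / 60
    return total_minutes * usd_per_minute


if __name__ == "__main__":
    # Hypothetical usage: fifty 10-second voice requests per day.
    estimate = monthly_whisper_cost_usd(seconds_per_request=10, requests_per_day=50)
    print(f"~${estimate:.2f} per month")
```

With these example numbers the audio totals 250 minutes per month, or about 1.50 USD at the assumed rate, on top of whatever the ChatGPT API calls cost.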

But, regardless of what you decide, thanks for all your hard work on SpeakGPT :)

AndraxDev commented 1 year ago

The requested feature has been released.


Only the online version of Whisper is available. An offline version cannot be released because the model weighs a lot and requires too much RAM.
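
For a rough sense of scale, the parameter counts listed in the openai/whisper README can be turned into an approximate weight size. The sketch assumes 2 bytes per parameter (16-bit floats) and counts raw weights only. Actual RAM use during inference is higher and depends on the runtime, so these figures are a lower bound, and `weights_megabytes` is a hypothetical helper.

```python
# Parameter counts as listed in the openai/whisper README.
WHISPER_PARAMS = {
    "tiny": 39_000_000,
    "base": 74_000_000,
    "small": 244_000_000,
    "medium": 769_000_000,
    "large": 1_550_000_000,
}


def weights_megabytes(params, bytes_per_param=2):
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return params * bytes_per_param / 1_000_000


if __name__ == "__main__":
    for name, params in WHISPER_PARAMS.items():
        print(f"{name:>6}: ~{weights_megabytes(params):.0f} MB of weights")
```

Under these assumptions the smallest model needs tens of megabytes and the largest needs several gigabytes, which is the trade-off between a bundled offline model and the online API.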

AndraxDev commented 1 year ago

From now on I will publish APKs, so you don't need to compile the app from source.

CaptSpify commented 1 year ago

Works great now with whisper, thanks!