Adri6336 / gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
GNU General Public License v3.0
300 stars, 50 forks

Audio transcription super slow right now #7

Closed (Gottano closed this issue 1 year ago)

Gottano commented 1 year ago

Hi Adrian

I noticed it was taking ages for the transcription to happen, so I've just downloaded your most recent version, but it is still taking a very long time (up to a minute or longer for a short phrase). It often also just gets "stuck" in listening mode (green window) for a very long time.

What could be the cause?

Many thanks

Adri6336 commented 1 year ago

Heyo!

The transcription service is handled by Google, so transcription speed depends on their responsiveness and the speed of your connection to them.

The stuck listening has to do with the way the speech recognizer works. It seems to keep recording for as long as it thinks the user is speaking, but it doesn't have a way to tell whether what it's hearing is actually speech. Consequently, I've noticed that if there's a lot of ambient noise it can hang for a bit. By default, my bot listens to the environment for around a second to help filter out background frequencies, but this may not be enough time depending on the environment. I'll look into giving users the ability to set the ambient noise filtering time if they'd like, but outside of this the recording method is unfortunately out of my control.
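For anyone who wants to experiment with this, here is a minimal sketch of the settings involved. It assumes the recognizer follows the interface of the Python SpeechRecognition package (`adjust_for_ambient_noise` and `listen` with `timeout` / `phrase_time_limit`); the thread only says that Google handles the transcription, so that is an assumption. `ListenSettings` and `listen_once` are hypothetical names made up for illustration, not part of the bot.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ListenSettings:
    # Hypothetical settings object; names and defaults are illustrative only.
    ambient_seconds: float = 1.0                 # how long to sample background noise
    start_timeout: Optional[float] = 5.0         # give up if no speech starts in time
    max_phrase_seconds: Optional[float] = 15.0   # hard cap on a single recording


def listen_once(recognizer: Any, source: Any, settings: ListenSettings) -> Any:
    """Calibrate for background noise, then record one bounded phrase."""
    # A longer calibration window can help in noisy rooms.
    recognizer.adjust_for_ambient_noise(source, duration=settings.ambient_seconds)
    # Bounding both the wait and the phrase length keeps the recorder from
    # hanging when ambient noise is mistaken for ongoing speech.
    return recognizer.listen(
        source,
        timeout=settings.start_timeout,
        phrase_time_limit=settings.max_phrase_seconds,
    )
```

With the real package you would pass in an `sr.Recognizer()` and an open `sr.Microphone()` source, then hand the returned audio to `recognize_google`. Note that `listen` raises `WaitTimeoutError` when `timeout` expires before any speech starts, so a caller would probably want to catch that and retry.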

Sorry for the inconvenience yo

Gottano commented 1 year ago

Hiya Adrian

Understood.

Yes, it seems to be a Google-related issue, as I've just tested another voice / GPT bot (much simpler than yours) that also uses Google STT, and it also hangs.

Would it be possible to add a third-party STT / transcription service to it? OpenAI, Amazon, Microsoft: they all offer this.

Adri6336 commented 1 year ago

OpenAI has that Whisper service. I'll def look into adding it!
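
One way a second transcription service could slot in is as a fallback behind the existing one. The sketch below is only an illustration of that idea, not how the bot does or will do it: `transcribe_with_fallback` and the `Transcriber` type are hypothetical names, and each backend is just a callable that takes raw audio bytes and returns text.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical backend type: takes raw audio bytes, returns the transcript.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes,
    backends: Dict[str, Transcriber],
    order: Iterable[str],
) -> Tuple[str, str]:
    """Try each named backend in order; return (name, text) from the first success."""
    errors = []
    for name in order:
        try:
            return name, backends[name](audio)
        except Exception as exc:  # network errors, quota errors, unknown name, etc.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all transcription backends failed: " + "; ".join(errors))
```

In practice the callables would wrap the real services, for example one around Google's recognizer and one around Whisper, so a slow or failing Google request would not leave the user without a transcript.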