C-Nedelcu / talk-to-chatgpt

Talk to ChatGPT AI using your voice and listen to its answers through a voice
GNU Affero General Public License v3.0
1.97k stars, 331 forks

'Stop' word is not working. #9

Closed: Yzrsah closed this issue 1 year ago

Yzrsah commented 1 year ago

I want to stop TTS playback by saying 'Stop', but it's not working. I tried several times, making sure to say 'Stop' loudly and clearly, but the speech does not stop.

C-Nedelcu commented 1 year ago

What language do you use? Does the voice recognition API recognize the word "stop"?

Yzrsah commented 1 year ago

English. The voice recognition API does recognize "stop", and I'm able to use it conversationally without issue. However, it does not recognize the word until after the console log says "I'm listening", so I can't use it to interrupt the TTS.

C-Nedelcu commented 1 year ago

OK, I understand. That's because the voice recognition is not active while the bot speaks.

Voice recognition is not active while the bot speaks because, if it were, the computer would listen to itself. I have done this accidentally a few times while developing. It creates an infinite loop: the AI voice is recognized and the bot sends messages to itself, repeating itself over and over.
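
The behaviour described above can be sketched roughly as follows. This is only an illustration, not the extension's actual code: `Recognizer`, `Speaker`, and `speakWithRecognitionPaused` are hypothetical names, standing in for the browser's speech recognition and speech synthesis objects.

```typescript
// Hypothetical minimal interfaces (assumptions for this sketch); the real
// extension would use the browser's speech recognition and synthesis objects.
interface Recognizer {
  start(): void;
  stop(): void;
}

interface Speaker {
  // Calls onEnd once the utterance has finished playing.
  speak(text: string, onEnd: () => void): void;
}

// Stops recognition before the bot speaks and restarts it only after the
// utterance ends, so the recognizer never hears the bot's own voice.
function speakWithRecognitionPaused(
  recognizer: Recognizer,
  speaker: Speaker,
  text: string
): void {
  recognizer.stop();
  speaker.speak(text, () => {
    recognizer.start();
  });
}
```

With this ordering, anything said while the bot is speaking, including "stop", is never seen by the recognizer, which matches what is reported in this issue.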

Yzrsah commented 1 year ago

Would it be possible to have an option that keeps voice recognition active but only listens for the 'Stop' and 'Pause' words? A setting could help, since some people may be using headphones, where the issue would not occur. Also, how would you use 'Pause' if it's only active when the bot is not speaking?

C-Nedelcu commented 1 year ago

I don't think what you are asking is possible. At least, I don't see how it would work. The speech recognition API has two states: active or inactive. You can't make it "active, but just for some words". That's just not how it works.

The 'Stop' and 'Pause' words are for the speech recognition. If you want to pause or stop the bot's voice, you need to manually click one of the icons in the top right corner.

Yzrsah commented 1 year ago

I don't mean that the speech recognition API itself would need to be active only for some words. I mean you would just filter the recognized text for the Stop and Pause phrases after speech recognition finishes. To avoid feedback loops or false triggers, you could set your own 'Stop Phrase' that is rarer than just the word "Stop"; e.g., the phrase "Stop Speech" is not something the bot will commonly say. It's a pretty important feature to work out, because sometimes the majority of the time spent in a conversation is taken up by waiting for incorrect responses to finish.
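
The filtering idea proposed above might look something like this. It is a minimal sketch under stated assumptions: `normalize` and `containsStopPhrase` are hypothetical helpers, the transcript is assumed to be a plain string produced by the recognizer, and the normalization only handles unaccented English text.

```typescript
// Lowercases the text, replaces punctuation with spaces, and collapses
// whitespace. Assumption: English-only input (ASCII letters and digits).
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Returns true when the transcript contains the stop phrase as whole words,
// so "unstoppable speech" does not match "stop speech".
function containsStopPhrase(transcript: string, stopPhrase: string): boolean {
  const haystack = " " + normalize(transcript) + " ";
  const needle = normalize(stopPhrase);
  return needle.length > 0 && haystack.indexOf(" " + needle + " ") !== -1;
}
```

For example, `containsStopPhrase("Please stop speech now.", "Stop Speech")` returns `true`. A filter like this does not by itself prevent the bot's own voice from triggering the phrase if the bot happens to say it.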

C-Nedelcu commented 1 year ago

I understand, and that is why I added a Skip button and even more options, such as turning off the bot's voice altogether. I am not going to go in that direction, because it's very possible that the bot could say these words by itself, and that would be considered a problematic bug. I am afraid that if you want to skip the current message, you will have to click the Skip button. I have made up my mind on this issue. I apologize if this doesn't go your way, but this is a design decision.