daanzu / deepspeech-websocket-server

Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments
Mozilla Public License 2.0

Feature request: ability to easily pause listening #6

Closed by nmstoker 4 years ago

nmstoker commented 4 years ago

I'll try exploring this myself, but I was wondering whether there is a straightforward way to pause listening.

Would calling stop_stream on the PyAudio stream and then later starting it again work?

Alternatively if there's progress on #2 then I could possibly integrate it that way, having the process start and stop when required.

nmstoker commented 4 years ago

Quick update: this proved fairly easy to do :slightly_smiling_face:

Am including details in case others wish to attempt something similar. I'm working with a slightly earlier version than the latest code here but I think the principle should be the same.

Within the Audio class in client.py, create two methods to stop and start the stream. Importantly, the stop method (for pausing) does not close the stream, which leaves it ready to be restarted.

def pause(self):
    """Temporarily stop the stream listening."""
    self.stream.stop_stream()

def restart(self):
    """Restart the stream listening (when previously paused)."""
    self.stream.start_stream()

Then, in the section where the recognised text is output (i.e. where you want to perform your actions without the mic continuing to listen), call pause() on the vad_audio object at the start of whatever check you're doing, and call restart() at the end of the check.
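For anyone wanting to see the pieces together, here is a minimal sketch of the idea. It is not the repo's actual Audio class: the constructor taking a ready-made stream, the paused() context manager, and the open_microphone_stream() helper are all assumptions added for illustration. The only PyAudio stream calls it relies on are stop_stream(), start_stream() and is_stopped().

```python
from contextlib import contextmanager


class Audio:
    """Hypothetical, stripped-down stand-in for the client's Audio class.

    `stream` is any object exposing PyAudio's Stream methods
    stop_stream(), start_stream() and is_stopped().
    """

    def __init__(self, stream):
        self.stream = stream

    def pause(self):
        """Temporarily stop the stream listening, without closing it."""
        if not self.stream.is_stopped():
            self.stream.stop_stream()

    def restart(self):
        """Restart the stream listening (when previously paused)."""
        if self.stream.is_stopped():
            self.stream.start_stream()

    @contextmanager
    def paused(self):
        """Pause for the duration of a `with` block, then always restart."""
        self.pause()
        try:
            yield self
        finally:
            # Runs even if the body raises, so the mic is never left off.
            self.restart()


def open_microphone_stream(rate=16000, frames_per_buffer=320):
    """Open a mono 16-bit input stream (assumed settings; needs a mic)."""
    import pyaudio  # imported lazily so the class above works without it

    pa = pyaudio.PyAudio()
    return pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                   input=True, frames_per_buffer=frames_per_buffer)
```

With something like vad_audio = Audio(open_microphone_stream()), the TTS reply could then be wrapped in "with vad_audio.paused():" so the stream is restarted even if speaking the reply fails. The context manager is only a convenience; calling pause() and restart() by hand, as described above, works the same way.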

This is handy for my project as I've got it outputting spoken replies using TTS and if you don't pause it then the replies get recognised and it goes into a loop!

Thanks for this v. handy repo @daanzu

daanzu commented 4 years ago

@nmstoker I'm glad it was easy, and thanks for posting the solution!

pvtoan commented 5 months ago

Hi,

Could you please write an example of the Audio class that includes both the pause and restart functions?