msqr1 / Vosklet

A speech recognizer that can run in the browser, inspired by vosk-browser
MIT License

Methods for Pause and Continue recognition #15

Closed xiz2020 closed 2 months ago

xiz2020 commented 2 months ago

Do you have methods to pause and resume recognition?

For example:

this.recognizer.addEventListener("result", ev => {
    console.log('Is Speaking:', window.speechSynthesis.speaking)
    const data = JSON.parse(ev.detail)
    // While window.speechSynthesis.speaking is true, results should not be processed.
    // The problem: even when window.speechSynthesis.speaking == false, I still get results in data.text.
    if (data && data.text && data.text.length && !window.speechSynthesis.speaking) {
        this.ring('out')
    }
})
msqr1 commented 2 months ago

@xiz2020 Suspend the audio context that is connected to your transferer, i.e. call AudioContext.suspend() to pause. To resume, call AudioContext.resume().
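
The suggestion above can be sketched as a pair of small helpers. This is only an illustration, not part of Vosklet: the names pauseRecognition and resumeRecognition are hypothetical, and the sketch assumes audioContext is the AudioContext whose graph feeds the transferer. Both suspend() and resume() return a Promise, so the helpers await them and check the context state first.

```javascript
// Hypothetical helpers (not part of Vosklet): pause and resume recognition
// by suspending and resuming the AudioContext that feeds the transferer.

// Suspends the context if it is currently running; resolves to the new state.
async function pauseRecognition(audioContext) {
  if (audioContext.state === "running") {
    // suspend() returns a Promise; wait for it before reading the state.
    await audioContext.suspend();
  }
  return audioContext.state;
}

// Resumes the context if it is currently suspended; resolves to the new state.
async function resumeRecognition(audioContext) {
  if (audioContext.state === "suspended") {
    // resume() also returns a Promise.
    await audioContext.resume();
  }
  return audioContext.state;
}
```

A caller would typically await these helpers, or attach a catch() handler, so that a rejected Promise does not go unnoticed.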

xiz2020 commented 2 months ago

Here is a demo:

this.utterance.onstart = (e) => {                   
        console.log('Start Speak:', this.audioContext)
        if (this.audioContext) {
            this.audioContext.suspend()
        }
}

this.utterance.onend = (e) => {
        console.log('End Speak:', this.audioContext)
        if (this.audioContext) {    
            console.log('Resume Reco');
            this.audioContext.resume()
        }
}

For some reason I am unable to receive recognized text after the this.audioContext.resume() call.

msqr1 commented 2 months ago

It seems that you're suspending it when starting speak, and resuming it at the end. Maybe I lost some context, but I think it should be the other way around?

xiz2020 commented 2 months ago

I'm using the SpeechSynthesisUtterance interface of the Web Speech API

When TTS starts (sound from the speaker, this.utterance.onstart), I need to stop the recognition process (this.audioContext.suspend()), and this seems to be working.

When TTS ends (this.utterance.onend), I need to resume recognition of human voice input. And here is the problem: this.audioContext.resume() doesn't seem to work, because I am unable to receive recognized text anymore.

Any other ideas to make it possible?
Thanks.

msqr1 commented 2 months ago

It could be that the calls need to be awaited: resume() and suspend() are asynchronous (both return a Promise), so they may need awaiting or chaining with then(). Suspending the audio context is usually the best way to stop recognizing, because blocking the audio stream avoids any processing overhead. Otherwise, another way would be to remove the event listener on the transferer and rebind it later.
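
The unbind/rebind alternative could look roughly like the sketch below. It is a generic illustration, not Vosklet API: createToggleableListener is a hypothetical name, and the sketch assumes the object carrying the listener (the transferer, or the recognizer with its "result" event) is a standard EventTarget. The handler is kept as a single reference, since removeEventListener only detaches the exact function that was added.

```javascript
// Hypothetical helper (not part of Vosklet): wraps a listener so it can be
// detached and re-attached without losing the handler reference.
function createToggleableListener(target, type, handler) {
  let attached = false;
  return {
    // Attach the handler if it is not attached yet.
    resume() {
      if (!attached) {
        target.addEventListener(type, handler);
        attached = true;
      }
    },
    // Detach the handler; events dispatched while paused are ignored.
    pause() {
      if (attached) {
        target.removeEventListener(type, handler);
        attached = false;
      }
    },
  };
}
```

Unlike suspending the audio context, this only stops the handler from running; depending on where the listener sits, audio may still be captured and processed while paused.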

msqr1 commented 2 months ago

@xiz2020 Could you tell me which way solved it for you?