mallorbc / whisper_mic

Project that allows one to use a microphone with OpenAI whisper.
MIT License

Process hanging in infinite loop when input audio is not loud enough #67

Closed Pop-Vlad closed 2 weeks ago

Pop-Vlad commented 5 months ago

Running the following code causes the process to hang when the speaker is not loud enough:

from whisper_mic import WhisperMic

mic = WhisperMic()
result = mic.record(duration=2)
print(result)

The issue seems to come from the function __transcribe, where self.result_queue is filled only when is_audio_loud_enough is true:

    if is_audio_loud_enough:
        # faster_whisper returns an iterable object rather than a string
        if self.faster:
            segments, info = self.audio_model.transcribe(audio_data)
            predicted_text = ''
            for segment in segments:
                predicted_text += segment.text
        else:
            if self.english:
                result = self.audio_model.transcribe(audio_data,language='english',suppress_tokens="")
            else:
                result = self.audio_model.transcribe(audio_data,suppress_tokens="")
            predicted_text = result["text"]

        if not self.verbose:
            if predicted_text not in self.banned_results:
                self.result_queue.put_nowait(predicted_text)
        else:
            if predicted_text not in self.banned_results:
                self.result_queue.put_nowait(result)

As a result, the functions listen and record get stuck in the following loop, because self.result_queue is always empty:

    while True:
        if not self.result_queue.empty():
            return self.result_queue.get()
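
One way to keep a consumer loop like this from spinning forever is to bound the wait. The sketch below is not the library's code: `wait_for_result` and its `timeout` parameter are hypothetical names, and a plain `queue.Queue` stands in for `self.result_queue`. It relies only on the standard-library behaviour that `Queue.get(timeout=...)` raises `queue.Empty` when nothing arrives in time.

```python
import queue


def wait_for_result(result_queue, timeout=5.0):
    """Return the next queued result, or None if nothing arrives in time.

    Hypothetical helper: result_queue stands in for WhisperMic.result_queue.
    """
    try:
        # Blocks for at most `timeout` seconds instead of busy-waiting.
        return result_queue.get(timeout=timeout)
    except queue.Empty:
        # Nothing was enqueued, for example because the audio was too quiet.
        return None
```

A caller can then treat `None` as "nothing was transcribed" and decide whether to record again or give up.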

FontaineRiant commented 3 months ago

Here's what you can do in the meantime:

from whisper_mic import WhisperMic

class CustomMic(WhisperMic):
    # The handlers are private (name-mangled) methods of WhisperMic,
    # hence the _WhisperMic__ prefix.

    def listen(self, timeout=None, phrase_time_limit=None):
        self.logger.info("Listening...")
        # Keep listening until a transcription lands in the queue.
        while self.result_queue.empty():
            self._WhisperMic__listen_handler(timeout, phrase_time_limit)
        return self.result_queue.get()

    def record(self, duration=None, offset=None):
        self.logger.info("Listening...")
        # Keep recording until a transcription lands in the queue.
        while self.result_queue.empty():
            self._WhisperMic__record_handler(duration, offset)
        return self.result_queue.get()

Then use that custom class instead of WhisperMic.
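
Note that the workaround retries until something is transcribed, so it still never returns if the speaker stays quiet. A bounded variant of the same idea might look like the sketch below. `StubMic`, `record_once`, and `record_with_retries` are hypothetical names used so the example runs without a microphone; with the real class, the call inside the loop would be the name-mangled handler used in the workaround, assuming it has finished transcribing by the time it returns.

```python
import queue


class StubMic:
    """Stand-in for WhisperMic: only the Nth recording produces text."""

    def __init__(self, loud_on_attempt):
        self.result_queue = queue.Queue()
        self.attempts = 0
        self.loud_on_attempt = loud_on_attempt

    def record_once(self):
        # Mimics a handler that enqueues text only for loud-enough audio.
        self.attempts += 1
        if self.attempts >= self.loud_on_attempt:
            self.result_queue.put_nowait("hello")


def record_with_retries(mic, max_attempts=3):
    """Re-record up to max_attempts times; return None if all were silent."""
    for _ in range(max_attempts):
        mic.record_once()
        if not mic.result_queue.empty():
            return mic.result_queue.get()
    return None
```

Returning `None` after a fixed number of attempts lets the caller decide what to do about silence instead of blocking indefinitely.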

mallorbc commented 2 weeks ago

#83 should have fixed this. Please reopen if not.