mallorbc / whisper_mic

Project that allows one to use a microphone with OpenAI whisper.
MIT License

Many incomplete segments, what is it even returning? predicted_text referenced before assignment. #71

Closed: SuppliedOrange closed this issue 3 months ago

SuppliedOrange commented 4 months ago

Error:

  File "C:\Users\Dhruv\Desktop\py\AudioProcessorWhisper\whisper_mic_test.py", line 5, in <module>
    result = mic.listen()
  File "C:\Users\Dhruv\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper_mic\whisper_mic.py", line 215, in listen
    self.__listen_handler(timeout, phrase_time_limit)
  File "C:\Users\Dhruv\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper_mic\whisper_mic.py", line 132, in __listen_handler
    self.__transcribe(data=audio_data)
  File "C:\Users\Dhruv\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper_mic\whisper_mic.py", line 184, in __transcribe
    if predicted_text not in self.banned_results:
UnboundLocalError: local variable 'predicted_text' referenced before assignment
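
For context, the traceback comes from the usual Python pattern where a local variable is only assigned on some branches; when none of those branches runs, the later read raises UnboundLocalError. A minimal illustration of the same failure mode (not the library's code, just a sketch):

def transcribe_sketch(english: bool) -> None:
    if not english:
        predicted_text = "hello"  # only assigned on this branch
    # When english is True the name was never bound, so this read
    # raises UnboundLocalError, mirroring the traceback above.
    if predicted_text not in ("", None):
        print(predicted_text)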

Code:

from whisper_mic import WhisperMic

mic = WhisperMic( mic_index=2, save_file=True, english=True )

result = mic.listen()
print(result)

I solved it with this:

    def __transcribe(self,data=None, realtime: bool = False) -> None:
        if data is None:
            audio_data = self.__get_all_audio()
        else:
            audio_data = data
        audio_data,is_audio_loud_enough = self.__preprocess(audio_data)

        if is_audio_loud_enough:
            # faster_whisper returns an iterable object rather than a string
            predicted_text = '' # I MOVED THIS HERE, ON TOP <-------

            if self.faster:
                segments, info = self.audio_model.transcribe(audio_data)
                for segment in segments:
                    predicted_text += segment.text
            else:
                if self.english:
                    result = self.audio_model.transcribe(audio_data, language='english', suppress_tokens="")  # This is what I need, but it doesn't return?
                else:
                    result = self.audio_model.transcribe(audio_data, suppress_tokens="")
                # Dedented so predicted_text is set for both branches, not only the non-English one
                predicted_text = result["text"]

            if not self.verbose:
                if predicted_text not in self.banned_results:
                    self.result_queue.put_nowait(predicted_text)
            else:
                if predicted_text not in self.banned_results:
                    self.result_queue.put_nowait(result)

            print('predicted_text ' + predicted_text)

            if self.save_file:
                # os.remove(audio_data)
                self.file.write(predicted_text)

I decided to modify the code so that it actually assigns predicted_text and returns something.

mallorbc commented 4 months ago

The code is already like this though? https://github.com/mallorbc/whisper_mic/blob/d813c0f16b455b1495d22b83dc0787dae65d8c5d/whisper_mic/whisper_mic.py#L169-L193

Perhaps I neglected to push a new release with this? I will look into it when I get some time.

mallorbc commented 4 months ago

Regarding how to get the text back: the result is written to a queue, and you get it by pulling from that queue.
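
A minimal sketch of that flow, based only on the __transcribe source pasted above (the result_queue attribute name comes from that paste, and listen() is assumed to do the queue pull internally):

from whisper_mic import WhisperMic

mic = WhisperMic(mic_index=2, english=True)

# listen() blocks until __transcribe() puts a result on the queue,
# then returns that text.
text = mic.listen()
print(text)

# Roughly equivalent manual pull from the underlying queue
# (attribute name taken from the pasted source above):
# text = mic.result_queue.get()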

SuppliedOrange commented 4 months ago

I installed this from pip, so yeah probably. Sorry if my post came off as a little rude, I just re-read it. I hope it gets fixed!

mallorbc commented 4 months ago

> I installed this from pip, so yeah probably. Sorry if my post came off as a little rude, I just re-read it. I hope it gets fixed!

I did not find it rude. Thanks for bringing it to my attention! People like you help make Open Source better!

mallorbc commented 3 months ago

This should be fixed. Please let me know if it is not.