KoljaB / RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
MIT License

STT: UnpicklingError: invalid load key, '\x00' #97

she7ata7 closed this issue 3 months ago

she7ata7 commented 3 months ago

When I speak quietly or not clearly enough, I get this exception.

The code:

class STT_Model(object):
    def __new__(cls):
        if not hasattr(cls, 'instance'):
            recorder_config = {
                'use_microphone': False,
                'spinner': False,
                'model': 'large-v2',
                'language': MODEL_LANGUAGE,
                'silero_sensitivity': 0.4,
                'webrtc_sensitivity': 2,
                #'post_speech_silence_duration': 0.4,
                'min_length_of_recording': 0,
                'min_gap_between_recordings': 0,
                'enable_realtime_transcription': True,
                'realtime_processing_pause': 0.2,
                'realtime_model_type': 'medium',
            }
            cls.instance = super(STT_Model, cls).__new__(cls)
            cls.instance.recorder = AudioToTextRecorder(**recorder_config)
        return cls.instance


# Fragment: method of the class that runs the recorder in a worker thread.
def recorder_thread(self):
    self.recorder_ready.set()

    while True:
        stt_sentence = self.recorder.text()
        self.recorder.stop()
        print(f"\r STT_sentence: {stt_sentence}")

The exception:

  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/troido/.local/lib/python3.10/site-packages/RealtimeSTT/audio_recorder.py", line 751, in _transcription_worker
    audio, language = conn.recv()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
_pickle.UnpicklingError: invalid load key, '\x00'.
Exception in thread Thread-59 (recorder_thread):
Exception in thread Thread-8 (recorder_thread):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/troido/Desktop/ai_assistant/sip/audio_media_port.py", line 72, in recorder_thread
    stt_sentence = self.recorder.text()
  File "/home/troido/.local/lib/python3.10/site-packages/RealtimeSTT/audio_recorder.py", line 1048, in text
    return self.transcribe()
  File "/home/troido/.local/lib/python3.10/site-packages/RealtimeSTT/audio_recorder.py", line 959, in transcribe
    self.parent_transcription_pipe.send((self.audio, self.language))
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 405, in _send_bytes
    self._send(buf)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/troido/Desktop/ai_assistant/sip/audio_media_port.py", line 72, in recorder_thread
    stt_sentence = self.recorder.text()
  File "/home/troido/.local/lib/python3.10/site-packages/RealtimeSTT/audio_recorder.py", line 1048, in text
    return self.transcribe()
  File "/home/troido/.local/lib/python3.10/site-packages/RealtimeSTT/audio_recorder.py", line 960, in transcribe
    status, result = self.parent_transcription_pipe.recv()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

she7ata7 commented 3 months ago

I create a single STT instance (a singleton) and reuse it many times, so I need a way to shut it down and restart it safely.

To shut down, I call self.recorder.stop().
When I need to listen again, I call self.recorder.start().

But I get these errors:

STT exception occurred: Ran out of input
STT exception occurred: invalid load key, '\x00'.

KoljaB commented 3 months ago

self.recorder.stop() and self.recorder.start() are for triggering manual recording. You can just reuse the recorder after calling recorder.text(). There is shutdown(self) to destroy the recorder completely, but I would not recommend creating and destroying a recorder multiple times during a session. There is no need for that, and it consumes resources, because on the next creation the recorder has to initialize and reload the model.
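
A minimal sketch of that reuse pattern, assuming only that the recorder exposes text() and shutdown() as described above. The names transcribe_session, max_sentences and FakeRecorder are hypothetical and exist only so the example runs without a microphone or a model; in real use the AudioToTextRecorder instance would be passed in instead.

```python
def transcribe_session(recorder, max_sentences):
    """Reuse one recorder for several sentences, then shut it down once.

    `recorder` is assumed to expose text() and shutdown(); this helper is
    a hypothetical illustration, not part of RealtimeSTT.
    """
    sentences = []
    try:
        for _ in range(max_sentences):
            # The same recorder is simply asked for the next sentence;
            # no stop()/start() calls between iterations.
            sentences.append(recorder.text())
    finally:
        # Destroy the recorder once, at the end of the session.
        recorder.shutdown()
    return sentences


class FakeRecorder:
    """Stand-in recorder that returns canned sentences (assumption for the demo)."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self.shutdown_calls = 0

    def text(self):
        return next(self._lines)

    def shutdown(self):
        self.shutdown_calls += 1


fake = FakeRecorder(["hello", "world"])
result = transcribe_session(fake, 2)
```

The try/finally keeps shutdown() to a single call per session even if text() raises, which matches the advice not to create and destroy the recorder repeatedly.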

she7ata7 commented 3 months ago

I stopped using them, but I still get these errors when I try to reuse the same STT instance:

STT exception occurred: Ran out of input
STT Restart again
STT exception occurred: invalid load key, '\x00'.

recorder_thread = threading.Thread(target=self.recorder_thread)
recorder_thread.start()
self.recorder_ready.wait()

def recorder_thread(self):
    self.recorder_ready.set()

    while self.recorder.is_running:
        try:
            stt_sentence = self.recorder.text()
            print(f"\r STT_sentence: {stt_sentence}")

            # .....

        except Exception as e:
            print(f"STT exception occurred: {e}")
            continue

she7ata7 commented 3 months ago

Is there a way to clear the pipeline of audio chunks that are still waiting to be processed?

KoljaB commented 3 months ago

Is there a way to clear the pipeline of audio chunks that are still waiting to be processed?

No, I didn't feel there was a need for that. Just add a method that calls self.audio_queue.get in a loop until the queue is empty.
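
A rough sketch of such a drain loop, assuming the queue offers get_nowait() and raises queue.Empty when nothing is left (queue.Queue and multiprocessing.Queue both behave this way). drain_queue is a hypothetical helper, not something RealtimeSTT provides, and the demo uses a plain queue.Queue in place of the recorder's audio_queue.

```python
import queue


def drain_queue(q):
    """Discard everything currently in q and return how many items were dropped."""
    dropped = 0
    while True:
        try:
            # Non-blocking get: raises queue.Empty once the queue has no items.
            q.get_nowait()
        except queue.Empty:
            return dropped
        dropped += 1


# Demo with a stand-in queue holding three fake audio chunks.
pending = queue.Queue()
for chunk in (b"\x00\x01", b"\x02\x03", b"\x04\x05"):
    pending.put(chunk)

dropped_count = drain_queue(pending)
```

With a multiprocessing queue, items that another process has put but that are still in transit may not be visible yet, so a single pass is best-effort rather than a guarantee that nothing arrives afterwards.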

For your other problem: please send me a short, complete, working .py file that reproduces the error to my mail kolja.beigel@web.de (or post the full reproducing code here). When I loop self.recorder.text(), this does not happen here, so I guess it's something around that.

she7ata7 commented 3 months ago

Thanks so much, I have solved the issue (it was a threading problem in my code).
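
The thread does not say what the actual fix was. The tracebacks do show two threads (Thread-59 and Thread-8) inside recorder.text() on the same recorder at once, and the Python documentation warns that data in a pipe may become corrupted if two threads read from or write to the same end at the same time. Assuming that was the cause, one possible guard is to serialize calls with a lock. SerializedRecorder and CountingRecorder below are hypothetical names for this sketch, not part of RealtimeSTT.

```python
import threading
import time


class SerializedRecorder:
    """Wrap a recorder so only one thread at a time can call text().

    Hypothetical helper; `recorder` only needs a text() method.
    """

    def __init__(self, recorder):
        self._recorder = recorder
        self._lock = threading.Lock()

    def text(self):
        # Hold the lock for the whole call so requests never overlap.
        with self._lock:
            return self._recorder.text()


class CountingRecorder:
    """Stand-in recorder that tracks how many text() calls overlap."""

    def __init__(self):
        self.active = 0
        self.max_active = 0
        self._guard = threading.Lock()

    def text(self):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # simulate transcription work
        with self._guard:
            self.active -= 1
        return "ok"


inner = CountingRecorder()
shared = SerializedRecorder(inner)
results = []

workers = [
    threading.Thread(target=lambda: results.append(shared.text()))
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Starting only one recorder thread per recorder instance avoids the problem without a lock; the wrapper is only useful when several threads genuinely need to share one recorder.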