Closed hanxirui closed 1 week ago
Damn. Researched it and yes, macOS does not support the queue.qsize() method from the multiprocessing module.
Need to find a workaround for this. Sorry for this issue.
Updated audio_recorder.py to a new version which hopefully fixes this (not available with pip install yet). Would be great to hear feedback on whether that works.
Fix now available also with pip install (untested though, unfortunately I have no Mac):
pip install --upgrade realtimestt==0.1.7
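(Editor's note: on macOS, qsize() on a plain multiprocessing.Queue raises NotImplementedError because the underlying sem_getvalue() is not implemented there. A minimal portable sketch of one possible workaround, not the library's actual code, is to track the size in a shared counter alongside the queue:)

```python
import multiprocessing


class CountedQueue:
    """Queue wrapper that tracks its size in a shared counter,
    since multiprocessing.Queue.qsize() raises NotImplementedError
    on macOS. The count is approximate under heavy concurrency,
    which is fine for a latency check like the one in the warning above."""

    def __init__(self):
        self._queue = multiprocessing.Queue()
        self._size = multiprocessing.Value("i", 0)  # shared int counter

    def put(self, item):
        with self._size.get_lock():
            self._size.value += 1
        self._queue.put(item)

    def get(self):
        item = self._queue.get()
        with self._size.get_lock():
            self._size.value -= 1
        return item

    def qsize(self):
        return self._size.value


if __name__ == "__main__":
    q = CountedQueue()
    q.put("chunk")
    q.put("chunk")
    print(q.qsize())  # 2 on all platforms, including macOS
```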
Any idea about this error?
Say something...
RealTimeSTT: root - WARNING - Audio queue size exceeds latency limit. Current size: 84. Discarding old audio chunks.
zsh: segmentation fault PYTHONPATH=. python tests/simple_test.py
Process Process-2:
Traceback (most recent call last):
File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/Users/xiaopel/Github/Startup/RealtimeSTT/RealtimeSTT/audio_recorder.py", line 369, in _transcription_worker
audio, language = conn.recv()
File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 255, in recv
buf = self._recv_bytes()
File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
buf = self._recv(4)
File "/Users/xiaopel/opt/anaconda3/envs/torch2/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
raise EOFError
EOFError
I'm sorry, I can't really tell what's going wrong here. After a bit of research it seems Mac does have some issues with Python's multiprocessing. Maybe it's worth a try with a newer Python version; Python 3.9 is already two years old.
Thanks @KoljaB. I got around this by using a transcribe function directly, instead of sending data to transcribe via a Pipe. From the design, it seems there is no need to start the transcription_worker process. Anything I missed?
If you run recorder.text() in a loop, the transcription of the last sentence pulls so many resources that voice activity detection is not reliable while the transcription runs. This is a problem if the transcription needs some time (a long last sentence) and the next sentence is very short (depends on VAD): then the short next sentence would not be detected. So basically it is a fix for a quite specialized problem. I did not realize that multiprocessing would introduce as many new problems as it did, especially on non-Windows platforms.
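(Editor's note: the pattern described above, a separate transcription process fed over a Pipe so the main process stays free for VAD, can be sketched as follows. The worker and connection names are simplified stand-ins, not RealtimeSTT's actual implementation; note how closing the parent's end of the pipe produces exactly the EOFError seen in the traceback earlier in this thread:)

```python
import multiprocessing


def transcription_worker(conn):
    """Stand-in for the heavy faster-whisper call: receives
    (audio, language) tuples over the pipe and sends back text.
    A closed parent end raises EOFError in conn.recv(), which is
    the clean shutdown signal (and the traceback above, when it
    happens unexpectedly)."""
    while True:
        try:
            audio, language = conn.recv()
        except EOFError:
            break  # parent closed the pipe: shut down
        conn.send(f"transcribed {len(audio)} samples ({language})")


if __name__ == "__main__":
    parent_conn, child_conn = multiprocessing.Pipe()
    proc = multiprocessing.Process(
        target=transcription_worker, args=(child_conn,)
    )
    proc.start()

    # The main process stays responsive for VAD while the worker
    # does the expensive transcription.
    parent_conn.send(([0.0] * 16000, "en"))
    print(parent_conn.recv())

    parent_conn.close()  # triggers EOFError in the worker
    proc.join()
```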
@eelxpeng what changes did you make in the audio_recorder.py, when replacing transcription_worker with transcribe, can you post the diff?
Maybe this checkpoint helps, that was before introducing multiprocessing.
Thanks but that causes the https://github.com/KoljaB/RealtimeSTT/issues/3 issue with stream closed, also torch with faster-whisper is already an issue: https://github.com/SYSTRAN/faster-whisper/issues/137
Still the same issue - tried on python 3.11 and 3.12. Will take a look later.
Great work, @KoljaB! I found the fix after breaking my head for a while on macOS: replace the multiprocessing Queue with Manager.Queue. It works perfectly.
I don't want the wake words and other aspects, so I had to strip out certain parts of the code. Still, it serves my purpose.
Another thing I noticed was the device index. It works without passing one, and in my case the mic was on device 1. It took me a while to list the channels and identify the right value.
from multiprocessing import Manager

manager = Manager()
queue = manager.Queue()

# ... use the queue ...

if queue.qsize() > 0:  # check for elements
    print("Queue has elements.")
Thanks a lot for this hint. I recently switched RealtimeSTT (and RealtimeTTS) from multiprocessing to torch.multiprocessing. Is the problem still present with v1.9.0? (I hope to be lucky and the switch to torch.multiprocessing does the same for macOS.) For the device_index I probably need to add an option to list the devices.
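(Editor's note: torch.multiprocessing is designed as a drop-in replacement for the stdlib module with the same API, plus shared-memory tensor passing. A minimal sketch of the swap, with a stdlib fallback so it runs without torch installed; since it wraps the stdlib implementation, it inherits the same qsize() limitation on macOS, which is what the next comment reports:)

```python
try:
    # torch.multiprocessing mirrors the stdlib multiprocessing API
    import torch.multiprocessing as mp
except ImportError:
    # fallback: same interface, minus shared-memory tensor support
    import multiprocessing as mp


def worker(q):
    q.put("hello from worker")


if __name__ == "__main__":
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    print(q.get())
    p.join()
    # Note: q.qsize() still raises NotImplementedError on macOS,
    # because torch.multiprocessing reuses the stdlib Queue.
```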
Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.
is this using pytorch MPS acceleration?
> Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.
I'll make a fix for this.
> is this using pytorch MPS acceleration?
RealtimeSTT depends on the faster-whisper library, which in turn uses CTranslate2. This issue discussion from the faster-whisper GitHub repo says there's no built-in support for AMD, MPS, etc. acceleration, but it may be possible to enable these backends by compiling CTranslate2 from source with the desired backend before installing faster-whisper.
So, if I got this right, this would mean that for MPS acceleration you would first compile CTranslate2 with the necessary backend support (MPS enabled), then proceed with the installation of RealtimeSTT, which installs faster-whisper; this should not override the manually compiled version of CTranslate2.
> Unfortunately the switch to torch.multiprocessing reintroduced the qsize issue on macOS.
Should be fixed with v0.1.12 now.
Yay it's working! Thank you 🫶🏼
One addition to my earlier comment: I had to settle on Python 3.11 for faster-whisper and the other dependencies to work.
Thank you, the Manager Queue works for me as well.