Open SevaSk opened 1 year ago
I would like to try working on this on my Mac :)
I hope to start running some tests later today.
For macOS, you can change the import in AudioRecorder.py, AudioTranscriber.py and custom_speech_recognition/__init__.py:
import pyaudio
#import pyaudiowpatch as pyaudio
to disable the Windows-only library. Also change requirements.txt to install pyaudio instead of pyaudiowpatch, and install these with brew:
brew install portaudio
brew install python-tk@3.10
But I still get an error:
[INFO] Completed ambient noise adjustment for Default Mic.
Traceback (most recent call last):
  File "/Users/user/git_repo/ecoute/main.py", line 119, in <module>
    main()
  File "/Users/user/git_repo/ecoute/main.py", line 76, in main
    speaker_audio_recorder = AudioRecorder.DefaultSpeakerRecorder()
  File "/Users/user/git_repo/ecoute/AudioRecorder.py", line 38, in __init__
    with pyaudio.PyAudio() as p:
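The manual import swap described above could also be made conditional, so the same files work on both Windows and macOS. This is a minimal sketch, assuming the only difference between platforms is which package provides the `pyaudio` name. The helper names `audio_backend_name` and `load_audio_backend` are hypothetical and not part of the project.

```python
import importlib
import sys


def audio_backend_name(platform: str = sys.platform) -> str:
    """Return the package to import as the PyAudio backend on this platform."""
    # pyaudiowpatch is the Windows-only library; everywhere else use plain pyaudio.
    return "pyaudiowpatch" if platform.startswith("win") else "pyaudio"


def load_audio_backend(platform: str = sys.platform):
    """Import and return the backend module (raises ImportError if it is missing)."""
    return importlib.import_module(audio_backend_name(platform))
```

In AudioRecorder.py this could replace the hard-coded import with `pyaudio = load_audio_backend()`. It only addresses the import; the speaker loopback itself still needs a macOS equivalent.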
Likewise, after installing on the Mac I kept only the pyaudio library, because pyaudiowpatch was not installed, and the error remained:
[INFO] Adjusting for ambient noise from Default Mic. Please make some noise from the Default Mic...
[INFO] Completed ambient noise adjustment for Default Mic.
Traceback (most recent call last):
File "/Users/alisarussen/Desktop/script/main.py", line 119, in
Help optimize
@slawa-c I tried that workaround but I'm still getting the same error when installing the requirements.
I'd love to see this running on macOS 🙏
I got it working on my MacBook, but it only recognizes sound from the microphone, not from the speakers. It also works fine if you output sound to an external speaker, but then the application does not distinguish between the audio streams. I'm surprised it doesn't automatically detect the Whisper API 🤨
Here's a list of changes I made in addition to the tips above:
AudioRecorder.py
class DefaultSpeakerRecorder(BaseRecorder):
    def __init__(self):
        # Placeholder for macOS support
        pass
AudioTranscriber.py
class AudioTranscriber:
    def __init__(self, mic_source, model):
        # ... (surrounding code omitted) the "Speaker" entry now reads from mic_source:
        "Speaker": {
            "sample_rate": mic_source.SAMPLE_RATE,
            "sample_width": mic_source.SAMPLE_WIDTH,
            "channels": mic_source.channels,
            "last_sample": bytes(),
            "last_spoken": None,
            "new_phrase": True,
            "process_data_func": self.process_speaker_data
        }
main.py
transcriber = AudioTranscriber(user_audio_recorder.source, model)
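The workaround above amounts to filling the "Speaker" entry with the microphone's parameters so the transcriber still starts without a real loopback source. Here is a minimal sketch of that idea, using only the fields shown in the snippet. The helper `make_source_entry` and the stand-in class `FakeSource` are hypothetical and not part of the project.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class FakeSource:
    """Stand-in for a recorder source; attribute names follow the snippet above."""
    SAMPLE_RATE: int
    SAMPLE_WIDTH: int
    channels: int


def make_source_entry(source: Any, process_data_func: Callable) -> Dict[str, Any]:
    """Build one per-source state dict with the fields used in the snippet."""
    return {
        "sample_rate": source.SAMPLE_RATE,
        "sample_width": source.SAMPLE_WIDTH,
        "channels": source.channels,
        "last_sample": bytes(),
        "last_spoken": None,
        "new_phrase": True,
        "process_data_func": process_data_func,
    }


mic = FakeSource(SAMPLE_RATE=16000, SAMPLE_WIDTH=2, channels=1)
# Without loopback support, the "Speaker" entry simply mirrors the microphone.
speaker_entry = make_source_entry(mic, process_data_func=lambda data: None)
```

Because both entries then describe the same stream, the application cannot tell speaker audio from microphone audio, which matches the behaviour reported above.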
Is there a way I can configure this on my Mac, which currently runs macOS 12.6.6? Also, configuring it would require Xcode (which supports 13 and above); are there any alternatives one could suggest?
I had similar code and a similar result. So here comes a bigger question: what is the Mac alternative to Windows WASAPI speaker loopback support?
I did some quick research; unfortunately, only Windows has decent SDK/API-level support for speaker audio loopback.
For Mac, I've read several recommendations (on Stack Overflow) that installing a virtual sound card or mixer lets us loop back audio from the output device. There are similar solutions on Linux too.
I wonder if anyone could find a more elegant solution, like a PyAudio patch or enhancement for this on Linux and Mac.
Thanks!
p.s. my code: https://github.com/oldsongsz/ecoute
To everybody with successful tests here: how do you deal with PyAudioWPatch?
I have a PR for getting it to work on mac
I don't know if it would be helpful, but I wonder if something like Soundflower or its 'cousin' Loopback would help!
Especially with seeing that it seems like audio can be 'heard' from a mic, but not the speakers, maybe turning the speakers into a mic would be a good thing to explore?
I launched it on my M1 Mac and just get a blank panel:
DEPRECATION WARNING: The system version of Tk is deprecated and may be removed in a future release. Please don't rely on it. Set TK_SILENCE_DEPRECATION=1 to suppress this warning.
[INFO] Adjusting for ambient noise from Default Mic. Please make some noise from the Default Mic...
[INFO] Completed ambient noise adjustment for Default Mic.
[INFO] Adjusting for ambient noise from Default Speaker. Please make or play some noise from the Default Speaker...
[INFO] Completed ambient noise adjustment for Default Speaker.
READY
You may need to install Python 3.10, then run brew install python-tk@3.10 and pip install tk.
Hi Laisky, thanks a lot for your quick response. I did everything you suggested but still have the same issue. Python: 3.10.3, python-tk@3.10, tk-0.1.0
Also, I'm getting this error from pip install -r requirements.txt on Python 3.11.3:
Building wheel for pyaudio (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for pyaudio (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build/lib.macosx-12-x86_64-cpython-311
creating build/lib.macosx-12-x86_64-cpython-311/pyaudio
copying src/pyaudio/__init__.py -> build/lib.macosx-12-x86_64-cpython-311/pyaudio
running build_ext
building 'pyaudio._portaudio' extension
creating build/temp.macosx-12-x86_64-cpython-311
creating build/temp.macosx-12-x86_64-cpython-311/src
creating build/temp.macosx-12-x86_64-cpython-311/src/pyaudio
clang -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX12.sdk -DMACOS=1 -I/usr/local/include -I/usr/include -I/opt/homebrew/include -I/Users/alireza/github/ecoute/env/include -I/usr/local/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c src/pyaudio/device_api.c -o build/temp.macosx-12-x86_64-cpython-311/src/pyaudio/device_api.o
src/pyaudio/device_api.c:9:10: fatal error: 'portaudio.h' file not found
#include "portaudio.h"
^~~~~~~~~~~~~
1 error generated.
error: command '/usr/bin/clang' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for pyaudio
Building wheel for future (setup.py) ... done
Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492022 sha256=d52c95c1be769f66c49f623e6fabc3c95421baaa0de107479330dc3c1617de61
Stored in directory: /Users/alireza/Library/Caches/pip/wheels/da/19/ca/9d8c44cd311a955509d7e13da3f0bea42400c469ef825b580b
Successfully built openai-whisper Wave future
Failed to build pyaudio
ERROR: Could not build wheels for pyaudio, which is required to install pyproject.toml-based projects
Running brew install portaudio fixes the issue.
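The `'portaudio.h' file not found` error above means the compiler could not find the PortAudio headers while building the pyaudio wheel. Below is a small sketch for checking this before running pip, assuming the headers would live in one of the include directories that appear in the clang command in the log. The function name `find_portaudio_header` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional

# Include directories taken from the clang command in the log above.
DEFAULT_INCLUDE_DIRS = ("/usr/local/include", "/opt/homebrew/include", "/usr/include")


def find_portaudio_header(
    include_dirs: Iterable[str] = DEFAULT_INCLUDE_DIRS,
) -> Optional[Path]:
    """Return the first existing portaudio.h under include_dirs, or None."""
    for directory in include_dirs:
        candidate = Path(directory) / "portaudio.h"
        if candidate.is_file():
            return candidate
    return None


if find_portaudio_header() is None:
    print("portaudio.h not found; try: brew install portaudio")
```

If the header is present but the build still fails, the header may be in a directory the compiler is not searching, which this sketch does not cover.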
Anyone who wants to try it on Mac can check this: https://github.com/clevertang/ecoute
I used https://existential.audio/blackhole/ for collecting audio and selecting which channels I want to record
So far, the clevertang/ecoute repo seems to be making the most progress. One thing I've picked up that you could maybe add to the documentation:
You need to include an entry in the keys.py file giving the device name. I found my device name by running "system_profiler SPAudioDataType" in a terminal. My keys.py file now has the entry: DEVICE_NAME="MacBook Pro Microphone"
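To illustrate how a configured name such as `DEVICE_NAME` could be matched against the available audio devices, here is a minimal sketch. It assumes device entries shaped like the dictionaries PyAudio returns from `get_device_info_by_index` (with `name`, `index` and `maxInputChannels` keys). The function `find_input_device` and the sample device list are hypothetical, and how the repo actually uses `DEVICE_NAME` is not shown in this thread.

```python
from typing import Dict, List, Optional


def find_input_device(devices: List[Dict], name: str) -> Optional[Dict]:
    """Return the first device with input channels whose name contains `name`."""
    wanted = name.lower()
    for info in devices:
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and wanted in info.get("name", "").lower():
            return info
    return None


# Hypothetical device list; with PyAudio installed it could be built as:
#   p = pyaudio.PyAudio()
#   devices = [p.get_device_info_by_index(i) for i in range(p.get_device_count())]
devices = [
    {"index": 0, "name": "MacBook Pro Microphone", "maxInputChannels": 1},
    {"index": 1, "name": "MacBook Pro Speakers", "maxInputChannels": 0},
    {"index": 2, "name": "BlackHole 2ch", "maxInputChannels": 2},
]

mic = find_input_device(devices, "MacBook Pro Microphone")
loopback = find_input_device(devices, "BlackHole")
```

A virtual device such as BlackHole appears in the list as an input device, which is why it can stand in for the speaker loopback that is only available natively on Windows.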
And hopefully this helps someone else on a Mac one day. If you get to the point where the interface starts but you see a blank window, you're close, and it's probably because your version of Python is outdated. I eventually settled on 3.11.1. To get there:
pip freeze > requirements-lock.txt
pyenv virtualenv-delete ecoute
pyenv install 3.11.1
pyenv virtualenv 3.11.1 ecoute
pyenv activate ecoute
pip install -r requirements-lock.txt
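Before rebuilding the environment, it may help to check which Tk the interpreter is linked against, since the deprecation warning in the earlier log refers to the system Tk. This is a minimal sketch, assuming that a Tk older than 8.6 is the likely cause of the blank window (an assumption, not something confirmed in this thread). `tk_looks_outdated` and `report` are hypothetical helpers.

```python
import sys


def tk_looks_outdated(tk_version: float, minimum: float = 8.6) -> bool:
    """Return True if the given Tk version is below the assumed minimum."""
    return tk_version < minimum


def report() -> str:
    """Describe the running Python and, if available, its Tk version."""
    python = ".".join(str(part) for part in sys.version_info[:3])
    try:
        import tkinter  # may be missing if Python was built without Tk
    except ImportError:
        return f"Python {python}, tkinter not available"
    status = "outdated" if tk_looks_outdated(tkinter.TkVersion) else "ok"
    return f"Python {python}, Tk {tkinter.TkVersion} ({status})"


print(report())
```

Running this inside the virtualenv shows whether switching Python versions actually changed the Tk that the interface uses.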
This code does not run on macOS because it relies on the PyAudioWPatch library to record sound from the speakers. That library uses the Windows Audio Session API (WASAPI), which allows output devices that support it to be opened in loopback mode.
To make the code work on macOS, we would need a comparable feature that can record sound from speakers. The primary modifications would be required in the DefaultSpeakerRecorder class, where we currently use WASAPI to create the audio stream for recording. A potential solution could be leveraging tools like BlackHole to achieve similar functionality on macOS.