Closed Traxmaxx closed 2 weeks ago
The TTS Engine installation has been improved by incorporating MacOS-specific instructions and adjusting configurations for cross-platform compatibility. Changes include updates to .gitignore for file exclusions, modifications in README.md for specific components, and a significant transition in glados/tts.py from CUDA to non-CUDA operations.
File | Change Summary |
---|---|
.gitignore | Excluded files updated: *.gguf, glados_config.yml. |
README.md | Added MacOS installation instructions; revised compilation steps for llama.cpp and whisper.cpp; adjusted USE_CUDA setting guidance. |
glados/tts.py | Switched USE_CUDA to False; modified library loading for MacOS compatibility. |
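The USE_CUDA switch described in the table could also be derived from the platform instead of being hard-coded. The following is only a minimal sketch and not the code in this PR: the helper name `default_use_cuda` is hypothetical, and it assumes that any non-macOS host should default to CUDA, which will not hold for machines without an NVIDIA GPU.

```python
import platform
from typing import Optional


def default_use_cuda(system_name: Optional[str] = None) -> bool:
    """Hypothetical helper: pick a default for USE_CUDA.

    macOS reports itself as "Darwin" and has no CUDA support, so the
    default there is False. Every other platform defaults to True,
    which is an assumption made for illustration only.
    """
    name = system_name if system_name is not None else platform.system()
    return name != "Darwin"


# Default for the machine this runs on.
USE_CUDA = default_use_cuda()
```

Passing the system name explicitly, for example `default_use_cuda("Darwin")`, makes the choice easy to check without a Mac at hand.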
🐰🎉
In the meadow of code, under the silicon sky,
A rabbit hopped by, with a twinkle in its eye.
"A tweak here, a fix there," it cheerfully said,
As it adjusted the settings and compiled the thread.
Now MacOS can sing, with a voice clear and bright—
Thanks to the rabbit, who coded all night! 🌙
🎉🐰
@coderabbitai review
@dnhkng how strict do you want me to be with the CodeRabbit comments about the README in this PR?
Yeah... CodeRabbit can get a bit silly sometimes, and it can be wrong. Just ignore what doesn't seem useful :+1:
How fast are the inference speeds? I did a test on my M2 MacBook, and it was unusably slow with Llama3 8B. I also ran into a lot of issues where the generated voice was detected as speech.
Hey 👋
I run llama.cpp with metrics and it reports 17.4825t/s on an M2 Air with 16GB RAM with the Meta-Llama-3-8B-Instruct-IQ3_XS.gguf model.
Since I usually run the LLM on a server and not locally, this was not a concern for me at the moment. Better Macs should give better performance, and smaller models should work better on slower machines (Phi 3 mini is twice as fast for me, for example).
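To put the reported throughput in perspective, here is a rough sketch that converts tokens per second into generation time for a reply. The 100-token reply length is an assumed figure for illustration, and the doubled rate is only a literal reading of "twice as fast", not a measurement.

```python
def seconds_for_reply(reply_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a reply at a steady rate, ignoring prompt processing."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return reply_tokens / tokens_per_second


REPORTED_RATE = 17.4825  # t/s reported by llama.cpp on the M2 Air above
REPLY_TOKENS = 100       # assumed reply length, not from the thread

llama3_seconds = seconds_for_reply(REPLY_TOKENS, REPORTED_RATE)
# "Twice as fast" read literally as double the token rate (assumption).
phi3_seconds = seconds_for_reply(REPLY_TOKENS, 2 * REPORTED_RATE)

print(f"Llama 3 8B: {llama3_seconds:.2f} s, at double the rate: {phi3_seconds:.2f} s")
```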
I also had the generated voice detection issue and needed to lower the mic volume by a lot in MIDI settings (also typing triggers the voice detection 🙄 😅)
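Besides lowering the mic volume, one possible software-side workaround is to ignore microphone input while the generated voice is playing. This is a sketch of that idea only and not something the project is known to do: the class `PlaybackGate`, its method names, and the 0.5 s hold-off are all hypothetical.

```python
class PlaybackGate:
    """Hypothetical gate that mutes mic input while TTS audio is playing."""

    def __init__(self, hold_off_s: float = 0.5) -> None:
        # Extra time after playback ends, to let speaker echo die down.
        self.hold_off_s = hold_off_s
        self._mute_until = 0.0

    def playback_started(self, now: float, duration_s: float) -> None:
        """Record that TTS audio of the given length starts at time `now`."""
        self._mute_until = max(self._mute_until, now + duration_s + self.hold_off_s)

    def accepts_mic_audio(self, now: float) -> bool:
        """Return True once playback and the hold-off period are over."""
        return now >= self._mute_until


gate = PlaybackGate(hold_off_s=0.5)
gate.playback_started(now=10.0, duration_s=2.0)
print(gate.accepts_mic_audio(11.0))  # during playback
print(gate.accepts_mic_audio(13.0))  # after playback and hold-off
```

The trade-off is that the user cannot interrupt while audio is playing, so this only suits setups where interruption is not needed.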
I usually am plugged into a Native Instruments Komplete Audio 2 with external Mic and Speakers, but since the refactoring, I receive this error after startup:
2024-05-06 19:34:54.801 | SUCCESS | __main__:__init__:134 - TTS text: All neural network modules are now loaded. No network access detected. How very annoying. System Operational.
2024-05-06 19:34:54.911 | SUCCESS | __main__:start_listen_event_loop:183 - Audio Modules Operational
2024-05-06 19:34:54.911 | SUCCESS | __main__:start_listen_event_loop:184 - Listening...
||PaMacCore (AUHAL)|| Error on line 2523: err='-50', msg=Unknown Error
Still investigating... btw, you can reach me in your Discord under the same name, if that's preferred!
Is this still compatible with the new espeak-ng binary changes? Maybe this resolves the mac issues?
I will have a look later today. Thanks for the heads-up!
@dnhkng works with your latest changes. There is just the small README change needed to run WHISPER_COREML=1 WHISPER_METAL_EMBED_LIBRARY=ON make libwhisper.so -j, otherwise it crashes with common-metal.h not found.
I also needed to install a CoreML model of ggml-medium-32-2.en.bin. Will create a PR with updated README in a bit.
Closing because required MacOS fixes got implemented upstream.
Heya,
thanks for the recent improvements! I made some adjustments to make it run on MacOS. GLaDOS starts, talks to me, and speech recognition also works.
I still run into #16 every now and then though 🤔