Same experience on a MacBook Air M3.
These steps worked for me on an M1 Pro (16 GB):
Open Terminal on your Mac.
Create a new directory for the project:
mkdir ~/moshi_mmx
cd ~/moshi_mmx
Create a Python virtual environment:
python3.12 -m venv .venv
Activate the virtual environment:
source .venv/bin/activate
Clone the Moshi repository:
git clone https://github.com/kyutai-labs/moshi.git
Navigate to the moshi_mlx directory:
cd moshi/moshi_mlx
Install the package in editable mode:
pip install -e .
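Before wiring up the AppleScript, it's worth sanity-checking the install by launching the server directly from Terminal; this is the same command the script below runs, shown here with the 8-bit flags as an example:

cd ~/moshi_mmx
source .venv/bin/activate
python -m moshi_mlx.local_web -q 8 --hf-repo kyutai/moshika-mlx-q8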
Open "Script Editor" on your Mac (Applications > Utilities > Script Editor).
Copy and paste the following AppleScript code:
on run
    set modelChoice to button returned of (display dialog "Choose a model:" buttons {"8-bit Model", "4-bit Model", "BF16 Model"} default button 1)
    set modelParams to ""
    if modelChoice is "8-bit Model" then
        set modelParams to "-q 8 --hf-repo kyutai/moshika-mlx-q8"
    else if modelChoice is "4-bit Model" then
        set modelParams to "-q 4 --hf-repo kyutai/moshika-mlx-q4"
    else
        set modelParams to "--hf-repo kyutai/moshiko-mlx-bf16"
    end if
    tell application "Terminal"
        activate
        set currentTab to do script "cd ~/moshi_mmx && source .venv/bin/activate && python -m moshi_mlx.local_web " & modelParams
    end tell
    -- Wait for the server to start (adjust the delay if needed)
    delay 5
    -- Open Safari and navigate to the local web interface
    tell application "Safari"
        activate
        open location "http://localhost:8998"
    end tell
end run
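One note on the delay 5: it only gives the server time to bind its port. On the very first launch the weights also have to download from the chosen Hugging Face repo, which can take considerably longer, so you may need to reload the Safari tab or raise the delay.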
Save the script, then run it. The web interface will be available at http://localhost:8998 in your web browser. To stop the application, close the Terminal window that was opened by the script.
Enjoy using Moshi MLX!
Quick mention: bf16 doesn't work on my 16 GB RAM machine... I guess it's just not enough RAM...
Right, moshi-mlx on a 16GB machine would be a bit tight, as the bf16 weights alone are 15.4GB. Hopefully the q8 version has almost the same quality.
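As a rough back-of-envelope estimate (not an official figure): bf16 stores 2 bytes per parameter, so 15.4GB of weights corresponds to about 7.7B parameters. The q8 variant at roughly 1 byte per parameter should then need around 7.7GB, and q4 around 3.9GB, plus some overhead for quantization scales and the runtime's activations and cache.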
Moshi feels unhinged and I love it, haha.
Have you gotten an API to work with it? Say, hosting a conversational AI server locally on an M1/M2 Mac? Talking to the model is fun, but I'd love to build a more serious use case. Thanks for the help!
Actually, I'm facing a problem: after about two minutes I reach this point and the server/inference web UI crashes, with this in the console...
..... ror in encoder thread narrow invalid args start + len > dim_len: [4096, 32], dim: 0, start: 4096, len:2
(the same error line repeated about a dozen times)
[Info] connection closed
[Info] done with connection
This last issue is likely due to reaching the maximum conversation time; see #51. The narrow error reports start: 4096 on a dimension of length 4096, which suggests the model's fixed 4096-step context buffer is full, so the encoder thread has nowhere left to write new frames.
Closing as it's hopefully all good now; feel free to re-open or create a new issue if you still encounter problems.
Topic
The MLX implementation
Question
It shows the microphone bars jumping up and down, and there are no command-line errors, just no audio out.
[Info] listening to http://localhost:8998
[Info] opening browser at http://localhost:8998
[Info] accepted connection
[Info] connection closed
[Info] done with connection
[Info] accepted connection
[Info] connection closed