Open OohBen opened 1 month ago
I think #41 and #48 will fix your issue. If you want a hotfix, you can manually apply the change suggested in #41 to your local repo.
If you're interested, I have an MLX fork that runs on my M2 MacBook out of the box at ~48 tok/s. I have implemented a server, SSE, and a frontend for changing system prompts etc. It is still WIP, but it is caught up to the main branch (frog branch incoming).
@samefarrar When mlx_download_weights.py:20 catches an HTTPError exception with a 403 Forbidden, maybe remove the weights/Llama-3.2-1B-Instruct/ directory; otherwise it will not download the weights on the next run, after you have requested and been granted access.
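A minimal sketch of that cleanup, not the actual code in mlx_download_weights.py. The helper name `download_or_clean` and the `download` callable are hypothetical, and `urllib.error.HTTPError` is only a stand-in: the real script may raise a different HTTPError class, in which case the `except` clause and the status-code check would need adjusting.

```python
import shutil
from pathlib import Path
from urllib.error import HTTPError  # stand-in; the real script may use another HTTPError


def download_or_clean(download, weights_dir: Path) -> bool:
    """Call download(weights_dir); on HTTP 403, delete the partial directory.

    `download` is a hypothetical callable that fetches the weights into
    `weights_dir`. Returns True on success and False after a 403 cleanup.
    """
    try:
        download(weights_dir)
        return True
    except HTTPError as err:
        if err.code != 403:
            raise  # only handle the "access not granted yet" case
        # Remove the leftover directory so a later run does not treat it
        # as a finished download and skip fetching the weights.
        shutil.rmtree(weights_dir, ignore_errors=True)
        return False
```

With something like this in place, rerunning the download after access has been granted would start from a clean directory instead of finding an empty or partial weights/Llama-3.2-1B-Instruct/.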
@samefarrar Can you add the if before the elif that was left out of the commit? It lives around here:
The patch probably looks like: https://github.com/samefarrar/entropix_mlx/pull/17
When trying to use this on a Mac (M3 Max) I get this error:
➜ entropix git:(main) ✗ PYTHONPATH=. poetry run python entropix/torch_main.py
Using device: mps
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Which number is larger, 9.9 or 9.11?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
To
Traceback (most recent call last):
File "/Users/xxx/shitbox/entropix/entropix/torch_main.py", line 130, in <module>
tyro.cli(main)
File "/Users/xxx/Library/Caches/pypoetry/virtualenvs/entropix-OBaG1pBS-py3.12/lib/python3.12/site-packages/tyro/_cli.py", line 229, in cli
return run_with_args_from_cli()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/shitbox/entropix/entropix/torch_main.py", line 127, in main
generate(xfmr_weights, model_params, raw_tokens1)
File "/Users/xxx/shitbox/entropix/entropix/torch_main.py", line 119, in generate
logits, kvcache, scores, stats = xfmr(xfmr_weights, model_params, next_token, cur_pos, freqs_cis[cur_pos:cur_pos+1], kvcache)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/shitbox/entropix/entropix/torch_model.py", line 75, in xfmr
h_attn, kvcache, scores = attention(norm_x, xfmr_weights.layer_weights[i], model_params, cur_pos, i, freqs_cis, kvcache, attn_mask=attn_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xxx/shitbox/entropix/entropix/torch_model.py", line 50, in attention
scores = torch.matmul(xq, keys)
^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Placeholder storage has not been allocated on MPS device!