Open FFAMax opened 1 day ago
Are you using tinygrad?
Yes. That's a Linux machine, so TinygradDynamicShardInferenceEngine is picked up.
You can't run Llama 3.2 with tinygrad yet; currently it only supports Llama 3.1 and Llama 3.
What about `exo --inference-engine mlx`?
I get errors with both on a Linux server.
Does MLX have specific system requirements (e.g. a GPU installed)?
MLX is Apple-only, and Apple silicon at that, not even Intel Macs. MLX supports every model, quantized in 4-bit. On Linux, the only models currently supported are Llama 3 and 3.1 (8B and 70B) in fp32.
It's trying to load and never completes.
Final