Open sammcj opened 2 months ago
I can't find a way to get compression working with MLX / Apple Silicon.
When AirLLM uses bitsandbytes it tries to load the CUDA build rather than the installed CPU/MLX version.
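For reference, this is roughly what I'm running (the model name is just an example; any model hits the same path once compression is enabled), and the failure happens on load because enabling compression pulls in bitsandbytes, which then looks for CUDA:

```python
# Minimal repro sketch on an M-series Mac.
from airllm import AutoModel

# Enabling compression ('4bit' or '8bit') is what triggers the
# bitsandbytes import, which in turn tries to load CUDA libraries.
model = AutoModel.from_pretrained(
    "meta-llama/Meta-Llama-3-70B-Instruct",  # example model only
    compression="4bit",
)
```

Without the `compression` argument the MLX path loads fine, so the issue seems isolated to the bitsandbytes dependency.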
Looking into bitsandbytes, there is an ongoing rewrite effort to make it work on Apple Silicon (https://github.com/bitsandbytes-foundation/bitsandbytes/issues/252 & https://github.com/huggingface/transformers/pull/31098).
Is there a compression method other than bitsandbytes available?