Closed s-smits closed 2 months ago
See title. A q4 version would be great as well. https://huggingface.co/xtuner/llava-phi-3-mini-hf
Thanks a lot for adding it here :)
Quantisation already works. Once the model script is ready, you can just run:
python -m mlx_vlm.convert \
--hf-path xtuner/llava-phi-3-mini-hf \
-q \
--upload-repo mlx-community/llava-phi-3-mini-hf-4bit
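Once the conversion finishes (and uploads, given --upload-repo), you should be able to test it straight from the CLI. A minimal sketch, assuming the generate entry point and its flags match the current release; the image path and prompt are placeholders:

python -m mlx_vlm.generate \
--model mlx-community/llava-phi-3-mini-hf-4bit \
--image path/to/image.jpg \
--prompt "Describe this image." \
--max-tokens 100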
Are you interested in adding a PR for this model?
I can try; I'm quite new to the MLX space, so this will take a while.
https://github.com/Blaizzy/mlx-vlm/pull/12 that was fast! 🚀
You can use the pre-quantized models already on the Hub:
https://huggingface.co/mlx-community/llava-phi-3-mini-4bit
https://huggingface.co/mlx-community/llava-llama-3-8b-v1_1-4bit
Just install the latest version:
pip install -U mlx-vlm
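If you'd rather call it from Python than the CLI, something like the following should work with the pre-quantized checkpoints above. This is a rough sketch based on the high-level API; the exact load/generate signatures can shift between releases, and the prompt may need the model's chat template applied first:

from mlx_vlm import load, generate

# Download the 4-bit weights and processor from the Hub and load them.
model, processor = load("mlx-community/llava-phi-3-mini-4bit")

# Single image + prompt; argument order assumed from the README examples.
output = generate(model, processor, "path/to/image.jpg", "Describe this image.", max_tokens=100)
print(output)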
Great, thank you!
Most welcome!