Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Guidance Needed: Quantizing the llava model #230

Closed: lg123666 closed this issue 7 months ago

lg123666 commented 7 months ago

I'm looking to quantize the llava model starting from the fp16 GGUF. After compiling llamafile, when I try to quantize llava with:

```
app/bin/llava-quantize llava-v1.5-7B-GGUF/llava-v1.5-7b-mmproj-f16.gguf llava-v1.5-7B-GGUF/llava-v1.5-7b-mmproj-q4_0_test.gguf 7
```

the following error occurs:

```
llamafile/metal.c:271: assert(FLAG_gpu != LLAMAFILE_GPU_ERROR) failed
(cosmoaddr2line app/bin/llava-quantize 45003c 53533d 4500b5 499e19 438e23 43c0e0 401983 401e23 401604)
```
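The hex values in parentheses are return addresses from the crash, and the message itself names the command that symbolizes them. A minimal sketch, assuming the cosmoaddr2line tool from the build toolchain is on your PATH:

```
cosmoaddr2line app/bin/llava-quantize 45003c 53533d 4500b5 499e19 438e23 43c0e0 401983 401e23 401604
```

Each address should resolve to a source file and line in the llava-quantize binary, giving a readable backtrace for the failed assertion in llamafile/metal.c.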

Could someone provide guidance or steps on how to achieve this?

francisco-lafe commented 7 months ago

I suggest grabbing the LLaVA model from HF: https://huggingface.co/liuhaotian/llava-v1.5-7b/tree/main and following these guides:

https://www.secondstate.io/articles/convert-pytorch-to-gguf/
https://github.com/ggerganov/llama.cpp/discussions/2948

I recall running these two commands for another project; the first converts the Hugging Face model to GGUF, and the second quantizes it:

```
python llama.cpp/convert.py .\Lince-Mistral\ --outfile lince-mistral.gguf
.\llamacppbinaries\quantize.exe .\lince-mistral.gguf 15
```
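Adapting the same two-step workflow to llava-v1.5-7b might look like the sketch below. This is not a verified recipe: the output filenames are illustrative, the q4_0 type argument assumes the quantize tool's usual syntax, and llava's multimodal projector (the mmproj file from the original report) may require the separate llava conversion scripts shipped under llama.cpp's examples/llava directory rather than plain convert.py.

```
python llama.cpp/convert.py ./llava-v1.5-7b --outfile llava-v1.5-7b-f16.gguf
./llama.cpp/quantize llava-v1.5-7b-f16.gguf llava-v1.5-7b-q4_0.gguf q4_0
```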

jart commented 7 months ago

Thanks for reporting this. I'm in the process of fixing this now with the latest upstream sync. Once this issue is closed, you'll be able to build llava-quantize by running make on this repo at head. I'll be publishing a new release shortly afterward.
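Once the fix lands, the rebuild jart describes would look roughly like this; a sketch assuming a fresh checkout, with the built binary's location possibly differing from the app/bin/ path in the original report depending on how you install it:

```
git clone https://github.com/Mozilla-Ocho/llamafile
cd llamafile
make -j8
# then retry the original llava-quantize command with the rebuilt binary
```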