bilibili / Index-1.9B

A SOTA lightweight multilingual LLM
Apache License 2.0

How to quantize to gguf using llama.cpp correctly #29

Open snowyu opened 3 months ago

snowyu commented 3 months ago

@asirgogogo I tried convert_hf_to_gguf.py but got the error "ERROR:hf-to-gguf:Model IndexForCausalLM is not supported". The old examples/convert_legacy_llama.py can convert to gguf, but the resulting gguf only outputs meaningless repeated characters.

saber258 commented 3 weeks ago

llama.cpp's conversion script (convert.py) needs changes for this model's structure. So you either have to add those changes yourself or wait for the Index-1.9B researchers to open-source a revised convert.py.
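
If you want to try patching it yourself, below is a minimal sketch of what registering the architecture in convert_hf_to_gguf.py might look like. It assumes Index-1.9B is close enough to LLaMA to reuse the existing LlamaModel tensor mapping; the class name IndexModel and the override shown are illustrative, not an official patch from the llama.cpp or Index-1.9B maintainers.

```python
# Sketch of a change inside llama.cpp's convert_hf_to_gguf.py (not a standalone
# script): Model, LlamaModel and gguf are already defined/imported in that file.

@Model.register("IndexForCausalLM")        # architecture name from the error / config.json
class IndexModel(LlamaModel):              # assumption: Index-1.9B is LLaMA-like
    model_arch = gguf.MODEL_ARCH.LLAMA     # reuse the existing LLaMA tensor layout

    def set_gguf_parameters(self):
        super().set_gguf_parameters()
        # If the converted model emits repeated garbage (as with
        # convert_legacy_llama.py above), Index-1.9B probably deviates from
        # vanilla LLaMA somewhere (e.g. in how the output head is normalized);
        # that difference would need to be handled here or in modify_tensors().
```

If registering the architecture this way still produces gibberish, the remaining gap is likely in the model implementation on the llama.cpp C++ side rather than in the conversion script alone.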