Closed: dengzheng-cloud closed this issue 5 months ago
Will this be caused by general.architecture==llama?
If it reports llama instead of gemma, then something went wrong somewhere.
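To see what the converted file actually claims, the general.architecture key can be read straight from the GGUF metadata. A minimal sketch, assuming the gguf Python package from llama.cpp's gguf-py is installed; the file name is a placeholder:

```python
# Sketch: print the general.architecture key of a converted GGUF file.
# "model.gguf" is a placeholder path; value decoding follows gguf-py's ReaderField layout.
from gguf import GGUFReader

reader = GGUFReader("model.gguf")
field = reader.get_field("general.architecture")
if field is not None:
    # For a string key/value pair, data[0] indexes the part that holds the value bytes.
    arch = field.parts[field.data[0]].tobytes().decode("utf-8")
    print("general.architecture =", arch)  # expected "gemma" here, not "llama"
```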
Yes, I have checked config.json and the model type is correct. During conversion it doesn't print any architecture info, but when I quantize the converted model it reports, as in the screenshot, that the architecture is llama. How can I set the architecture during conversion?
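For context, the HF-to-GGUF converter in llama.cpp generally picks the target architecture from the model's config.json (the architectures / model_type entries) rather than from a command-line option, so that file is the first thing to check. A minimal sketch with a placeholder model directory:

```python
# Sketch: print the architecture names the Hugging Face config advertises.
# "path/to/model" is a placeholder for the local model directory passed to the converter.
import json
from pathlib import Path

config = json.loads(Path("path/to/model/config.json").read_text())
print("model_type:   ", config.get("model_type"))      # e.g. "gemma"
print("architectures:", config.get("architectures"))   # e.g. ["GemmaForCausalLM"]
```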
Already checked; it turned out to have nothing to do with llama.cpp, so I'm closing this issue. Thanks for @ggerganov's help, it's very kind of you.
Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'lmsys/vicuna-7b-v1.5'
quant_path = 'vicuna-7b-v1.5-awq'
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize (export_compatible=True keeps scaled fp16 weights for later conversion)
model.quantize(tokenizer, quant_config=quant_config, export_compatible=True)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
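The usual next step is to convert the exported folder to GGUF with llama.cpp, which is where the general.architecture value gets written. A minimal sketch, assuming a local llama.cpp checkout; the converter script name varies between checkouts (convert_hf_to_gguf.py in recent ones), so treat the paths as placeholders:

```python
# Sketch: convert the AWQ-scaled HF model to GGUF using a llama.cpp checkout.
# Script name and paths are assumptions; adjust to your local layout.
import subprocess

subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", "vicuna-7b-v1.5-awq",
     "--outfile", "vicuna-7b-v1.5-awq-f16.gguf"],
    check=True,
)
```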