Open kzleong opened 5 months ago
What is the architecture of your custom model? Do you know if it is already supported by AWQ?
I'm trying to quantize a LLaVA 1.5 13B model fine-tuned with LoRA, and I've already converted the bin files to safetensors.
The earlier error seems to be solved; it was an issue in my special_tokens_map.json where, for some reason, the special keyword was already declared, so I just removed it.
However, I now get the following error from `model.quantize(tokenizer, quant_config=quant_config)`:
```
Traceback (most recent call last):
  File "/mainfs/lyceum/kzl1m20/LLaVA/quant.py", line 13, in <module>
    model.quantize(tokenizer, quant_config=quant_config)
  File "/lyceum/kzl1m20/miniconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'LlavaForConditionalGeneration' object has no attribute 'quantize'
```
Any ideas why? I was able to get quantized weights for this model using llm-awq, but I want a quantized model produced with AutoAWQ, as others have done.
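For what it's worth, this AttributeError is characteristic of calling `quantize()` on a bare transformers model rather than on the wrapper object that `AutoAWQForCausalLM.from_pretrained` returns: the lookup falls through to `torch.nn.Module.__getattr__`, which raises. A minimal sketch of that lookup behaviour, using hypothetical stand-in classes (not the real AutoAWQ/transformers classes, and no GPU or weights needed):

```python
class InnerModel:
    """Stand-in for a bare transformers model (e.g. LlavaForConditionalGeneration)."""
    # torch.nn.Module.__getattr__ raises a similarly worded AttributeError
    # when an attribute is not found; we mimic that wording here.
    def __getattr__(self, name):
        raise AttributeError(
            "'{}' object has no attribute '{}'".format(type(self).__name__, name)
        )

class AWQWrapper:
    """Stand-in for the object AutoAWQForCausalLM.from_pretrained returns."""
    def __init__(self):
        self.model = InnerModel()  # the wrapped transformers model

    def quantize(self, *args, **kwargs):
        return "quantized"

wrapper = AWQWrapper()
print(wrapper.quantize())         # works: quantize is defined on the wrapper

try:
    wrapper.model.quantize()      # fails: the bare model has no quantize
except AttributeError as exc:
    print(exc)                    # mirrors the error in the traceback above
```

The takeaway is that the object you call `quantize()` on must be the AWQ wrapper itself, not the underlying model it holds.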
Hey, did you figure out this error? I'm hitting the same one:
```
AttributeError: 'LlamaForCausalLM' object has no attribute 'quantize'
```
My code is below:
```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
import torch

model_path = '/home/evgpu/LLM_quantize/AutoCoder_S_6.7B'
quant_path = '/home/evgpu/LLM_quantize/AutoCoder_S_6.7B-AWQ-4B'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoAWQForCausalLM.from_pretrained(model_path, device_map='cuda').to(device)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```
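One quick way to narrow this down is to check, before calling `quantize()`, what kind of object you are actually holding. A small diagnostic sketch using only builtins (`loaded` is a hypothetical stand-in for whatever your `from_pretrained` call returned):

```python
# Hypothetical diagnostic: confirm the loaded object actually exposes
# quantize() before calling it.
loaded = object()  # replace with the object your from_pretrained call returned

cls_name = type(loaded).__name__
has_quantize = callable(getattr(loaded, "quantize", None))
print(f"{cls_name}: has quantize = {has_quantize}")
# If this prints a transformers class name (e.g. LlamaForCausalLM) with
# has quantize = False, quantize() is being called on the wrong object.
```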
Hi @casper-hansen, I keep getting this error when trying to quantize my custom LLaVA model:
The script (quant.py) I'm using is below:
I'm using the following packages: autoawq==0.2.4, transformers==4.38.2, tokenizers==0.15.2
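Mismatches between the versions you think you installed and what the active environment actually has are a common source of errors like this. A stdlib-only snippet to print the installed versions (package names taken from the list above; adjust as needed):

```python
# Print installed versions of the packages mentioned above, using only the
# standard library. Packages that are missing are reported rather than raising.
import importlib.metadata as md

for pkg in ("autoawq", "transformers", "tokenizers"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "is not installed in this environment")
```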