OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

How to enable llama3-8b int4 AWQ models #90

Open FlexLaughing opened 2 weeks ago

FlexLaughing commented 2 weeks ago

Hi, I have an AutoAWQ-quantized model (--wbits=4 --groupsize=128) and ran the following command to evaluate perplexity on a GPU:

--model /home/ubuntu/qllm_v0.2.0_Llama3-8B-Chinese-Chat_q4 --epochs 0 --eval_ppl --wbits 4 --abits 16 --lwc --net llama-7b

It fails when loading the checkpoint: the QuantLinear defined at https://github.com/OpenGVLab/OmniQuant/blob/main/quantize/int_linear.py#L26 does not appear to support the packed qweight tensors that AutoAWQ produces. Could you check the argument handling? Thanks!
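One possible workaround, sketched below, is to unpack the AWQ checkpoint back into dense fp16 weights before handing the model to OmniQuant's fake-quant QuantLinear, which expects a plain `weight` tensor rather than a packed `qweight`. This is a minimal sketch, not OmniQuant's own code: it assumes AutoAWQ's GEMM layout (int32 `qweight` of shape [in_features, out_features // 8], int32 `qzeros` of shape [in_features // group_size, out_features // 8], fp16 `scales` of shape [in_features // group_size, out_features]) and AutoAWQ's interleaved pack order [0, 2, 4, 6, 1, 3, 5, 7]; verify both against the AutoAWQ version that produced the checkpoint. The function names are hypothetical.

```python
import torch

# AutoAWQ interleaves the eight 4-bit values inside each int32 in this
# order (assumption based on AutoAWQ's GEMM packing; verify for your version).
AWQ_PACK_ORDER = [0, 2, 4, 6, 1, 3, 5, 7]

def unpack_int32(packed: torch.Tensor) -> torch.Tensor:
    """Split each int32 into eight 4-bit values along the last dimension."""
    shifts = torch.arange(0, 32, 4, device=packed.device)
    unpacked = (packed.unsqueeze(-1) >> shifts) & 0xF    # [..., cols, 8]
    unpacked = unpacked[..., AWQ_PACK_ORDER]             # undo the interleave
    return unpacked.reshape(*packed.shape[:-1], -1)      # [..., cols * 8]

def awq_dequantize(qweight, qzeros, scales, group_size=128):
    """Reconstruct a dense fp16 weight matrix [out_features, in_features]."""
    iweight = unpack_int32(qweight)                      # [in, out]
    izeros = unpack_int32(qzeros)                        # [in // g, out]
    # Broadcast per-group zeros/scales over the group_size input rows.
    izeros = izeros.repeat_interleave(group_size, dim=0)  # [in, out]
    scales = scales.repeat_interleave(group_size, dim=0)  # [in, out]
    weight = (iweight - izeros).to(scales.dtype) * scales
    return weight.t().contiguous()                        # nn.Linear layout
```

With the weights dequantized this way into ordinary nn.Linear modules, the --wbits 4 --abits 16 evaluation path should then re-quantize them with OmniQuant's own simulated QuantLinear, so the PPL measured reflects 4-bit weights even though the input checkpoint was unpacked first.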