Open rtadewald opened 4 months ago
Two possible solutions:
Hello, @AllentDan, thanks for your answer.
Unfortunately, none of the proposed solutions worked for me. I'm still getting the AssertionError here:
File "/home/rtadewald/miniconda3/envs/lmdeploy/lib/python3.10/site-packages/lmdeploy/lite/quantization/awq.py", line 118, in smooth_ln_fcs
    assert torch.isnan(p).sum() == 0
AssertionError
Sorry for the late reply. I can run AWQ normally with:
lmdeploy lite auto_awq llava-v1.6-34b --work-dir llava-v1.6-34b-awq --calib-seqlen 512 --calib-dataset pileval
The NaN value came from act_scale; I suppose it might be related to a calibration prompt that produces NaN values inside the layer.
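If act_scale is the culprit, one way to narrow it down before smooth_ln_fcs asserts is to inspect the collected scale tensor for NaN channels. A minimal sketch, assuming act_scale is a per-channel tensor like the one lmdeploy's calibration gathers (the helper name `find_nan_channels` is hypothetical, not part of lmdeploy):

```python
import torch

def find_nan_channels(act_scale: torch.Tensor) -> list:
    """Return indices of channels whose activation scale is NaN.

    `act_scale` stands in for the per-channel statistic collected during
    calibration; this is only a debugging sketch for inspecting it before
    the `torch.isnan(p).sum() == 0` assertion in smooth_ln_fcs fires.
    """
    return torch.isnan(act_scale).nonzero(as_tuple=True)[0].tolist()

# Example: a scale vector where channel 2 went NaN during calibration.
scales = torch.tensor([0.5, 1.2, float("nan"), 0.9])
print(find_nan_channels(scales))  # -> [2]
```

If only a few channels are NaN, that would support the theory that a specific calibration prompt blows up inside one layer; swapping the calibration dataset or sequence length (as in the working command above) may then avoid it.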
Checklist
Describe the bug
Hello guys. I'm having trouble quantizing the llava-v1.6-34b model, following this tutorial.
I've successfully quantized smaller models (liuhaotian/llava-v1.6-vicuna-7b and liuhaotian/llava-v1.6-vicuna-13b), but when I try the 34B model, I get the following error:
Reproduction
export HF_MODEL=liuhaotian/llava-v1.6-34b
export WORK_DIR=quantized_vlms/llava-v1.6-34b-4bit
lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR
Environment
Error traceback