When I use OmniQuant to quantize OPT-30B to W6A6, an error occurs at this line in omniquant.py:
scale = (act.pow(args.alpha)/weight.pow(1-args.alpha)).clamp(min=1e-5)
"
RuntimeError: The size of tensor a (7168) must match the size of tensor b (5120) at non-singleton dimension 0
"
I checked the shapes: act is [7168], but weight is [5120].
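A small sketch that may help narrow this down, assuming the two tensors are per-channel statistics that should share the model's hidden size. The table lists the hidden sizes of the published OPT checkpoints; `check_channels` and `describe_dim` are hypothetical helpers, not part of OmniQuant. By this table, 7168 matches OPT-30B and 5120 matches OPT-13B, so one possibility (a guess, not confirmed) is that the activation statistics and the loaded weights come from different model sizes.

```python
# Hidden sizes (d_model) of the published OPT checkpoints.
OPT_HIDDEN_SIZES = {
    768: "opt-125m", 1024: "opt-350m", 2048: "opt-1.3b", 2560: "opt-2.7b",
    4096: "opt-6.7b", 5120: "opt-13b", 7168: "opt-30b", 9216: "opt-66b",
}


def describe_dim(dim):
    """Return the OPT variant whose hidden size equals dim, or 'unknown'."""
    return OPT_HIDDEN_SIZES.get(dim, "unknown")


def check_channels(act_shape, weight_shape):
    """Hypothetical helper: compare the channel counts of two shapes.

    Only the last dimension is compared, since that is the dimension the
    element-wise division in the failing line has to agree on.
    """
    act_dim, weight_dim = act_shape[-1], weight_shape[-1]
    if act_dim == weight_dim:
        return "ok"
    return (
        f"mismatch: act has {act_dim} channels ({describe_dim(act_dim)}), "
        f"weight has {weight_dim} channels ({describe_dim(weight_dim)})"
    )


# Shapes taken from the error message above.
print(check_channels((7168,), (5120,)))
```

If the guess holds, checking `hidden_size` in the config.json under the --model path against the value given to --net would show which side is off.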
The command I ran:
"
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /home/Projects/model_zoo/facebook/opt-30b \
--epochs 20 --output_dir ./log/opt-30b-w6a6 \
--wbits 6 --abits 6 --lwc --let --alpha 0.75 --eval_ppl \
--net opt-30b
"