Closed see2023 closed 6 months ago
While debugging awq/models/qwen2.py I saw that there is no input_feat["self_attn.q_proj"], so I replaced it with input_feat["self_attn.q_proj.base_layer"] and ran again; it still fails:
```
/usr/local/lib/python3.10/dist-packages/awq/quantize/scale.py in apply_scale(module, scales_list, input_feat_dict)
     66
     67     else:
---> 68         raise NotImplementedError(f"prev_op {type(prev_op)} not supported yet!")
     69
     70     # apply the scaling to input feat if given; prepare it for clipping

NotImplementedError: prev_op <class 'peft.tuners.lora.layer.Linear'> not supported yet!
```
Hi, to quantize a LoRA fine-tuned model, you need to merge the adapters into the base model first. Please refer to the peft documentation for guidance on that.
For fine-tuning I followed https://qwen.readthedocs.io/zh-cn/latest/training/SFT/example.html, using jsonl-format data.
For quantization I followed https://qwen.readthedocs.io/zh-cn/latest/quantization/awq.html
During quantization, loading the model with AutoAWQForCausalLM complained that config.json was missing, so I saved the config obtained from AutoModelForCausalLM:
Then I ran it again:
Execution then fails at model.quantize(), inside /usr/local/lib/python3.10/dist-packages/awq/models/qwen2.py:
KeyError: 'self_attn.q_proj'
What is causing this, and what is the correct workflow for quantizing a fine-tuned model? Thanks!