mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Need help with the auto_scale.scale_fc_fc function #27

Open · stary-d opened this issue 1 year ago

stary-d commented 1 year ago

Hello, I would like to apply AWQ to a GPTBigCodeForCausalLM model, which has an unusual attention module, as shown in this screenshot: [image]

I added the necessary implementation, and eventually I hit this error: [image]

It was triggered here: [image]

It seems that fc2's scales are applied to both fc1 and fc2, and because my_fc1 and my_fc2 have different shapes, the whole program breaks at this point.

If I understand correctly, the code divides the weights of the previous layer by fc2's scales instead of dividing fc2's input x by the scales, right?

How can I fix this error? Or could you explain why the scales must be applied to both fc1 and fc2?
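For reference, here is my understanding of the identity that scale_fc_fc relies on, as a minimal self-contained sketch (not the repo's exact code): scaling fc2's input x by s is mathematically equivalent to baking s into fc2's input-channel weights, and to keep the network output unchanged, the producer layer fc1's output channels (and bias) must then be divided by the same s.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fold_scales(fc1: nn.Linear, fc2: nn.Linear, scales: torch.Tensor):
    # Dividing fc1's output channels by `scales` while multiplying
    # fc2's input channels by the same `scales` leaves fc2(fc1(x))
    # numerically unchanged, so the scaling is free at inference time.
    # Assumes fc1.out_features == fc2.in_features == scales.numel().
    fc1.weight.div_(scales.view(-1, 1))
    if fc1.bias is not None:
        fc1.bias.div_(scales.view(-1))
    fc2.weight.mul_(scales.view(1, -1))

# Quick numerical check that the rewrite is lossless.
torch.manual_seed(0)
fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 8)
x = torch.randn(4, 8)
ref = fc2(fc1(x))
fold_scales(fc1, fc2, torch.rand(16) + 0.5)
assert torch.allclose(ref, fc2(fc1(x)), atol=1e-5)
```

That equivalence is exactly why the scale has to touch both layers: applying it to fc2 alone would change the model's output.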

Thanks & Regards

Could I make a change like this? [image]
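Concretely, something like the sketch below. This is only one way I can think of to make the shapes line up for multi-query attention; the function name, the head_dim handling, and the mean reduction are all my own assumptions, not anything from the repo. Since GPTBigCode shares one V head across all query heads, the idea is to reduce fc2's per-channel scales over the head axis so the reduced scales match fc1's V slice, and then broadcast them back when scaling fc2 so the rewrite stays lossless.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def scale_fc_fc_mqa(fc1: nn.Linear, fc2: nn.Linear, scales: torch.Tensor,
                    head_dim: int):
    # Hypothetical variant for multi-query attention. fc1 is the fused
    # QKV projection whose last `head_dim` rows are the single shared
    # V head; fc2 is the output projection with
    # in_features = n_heads * head_dim. Because the one V head is
    # broadcast to every query head, fc2's per-channel scales are
    # reduced over the head axis (mean here; max is another option)
    # so they match fc1's V slice.
    n_heads = fc2.in_features // head_dim
    # [n_heads, head_dim] -> [head_dim]: one scale per shared V channel
    v_scales = scales.view(n_heads, head_dim).mean(dim=0)
    fc1.weight[-head_dim:].div_(v_scales.view(-1, 1))
    if fc1.bias is not None:
        fc1.bias[-head_dim:].div_(v_scales.view(-1))
    # fc2 must see the same reduced scales, broadcast back across
    # heads, for the transformation to remain exact.
    fc2.weight.mul_(v_scales.repeat(n_heads).view(1, -1))
```

Whether the mean is the right reduction for quantization quality is a separate question; it only guarantees the shapes match and the rewrite stays numerically exact.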

fanlw0816 commented 1 year ago

I encountered the same error. How can I apply AWQ to GPTBigCodeForCausalLM? Could you give me some advice? @kentang-mit @tonylins