How to quantize the out_proj and fc2 module in OPT model family

mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

https://arxiv.org/abs/2211.10438

MIT License

How to quantize the out_proj and fc2 module in OPT model family #93

Open yanchenmochen opened 3 months ago

yanchenmochen commented 3 months ago

I am planning an experiment to quantize the OPT model family with the SmoothQuant algorithm. Because there is an activation function between fc1 and fc2, how should fc2 be handled? Also, why does the code in the repository not quantize the out_proj module?
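
To make the fc2 part of the question concrete, here is a minimal NumPy sketch of one possible approach. It is not the repository's code. It assumes the activation between fc1 and fc2 is ReLU; because ReLU commutes with positive per-channel scaling, the smoothing scale for fc2's input can, under that assumption, be folded back into fc1's weight rows and bias. The function name `smooth_fc1_fc2` and the parameters `act_absmax` and `alpha` are hypothetical, and the scale formula follows the per-channel rule from the SmoothQuant paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def smooth_fc1_fc2(w1, b1, w2, act_absmax, alpha=0.5):
    """Hypothetical helper: fold fc2's smoothing scales into fc1.

    Weights use the nn.Linear layout (out_features, in_features):
      w1: (hidden, d_in), b1: (hidden,), w2: (d_out, hidden).
    act_absmax: (hidden,) per-channel max |fc2 input| from calibration data.
    Only valid when the activation satisfies act(x / s) == act(x) / s
    for s > 0, which holds for ReLU but not for GELU.
    """
    # Per-input-channel weight range of fc2.
    w2_absmax = np.clip(np.abs(w2).max(axis=0), 1e-5, None)
    # SmoothQuant-style scale: s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).
    s = np.clip(act_absmax**alpha / w2_absmax**(1.0 - alpha), 1e-5, None)
    w1_s = w1 / s[:, None]   # divide fc1 output channels by s
    b1_s = b1 / s            # the bias must be scaled the same way
    w2_s = w2 * s[None, :]   # multiply fc2 input channels by s
    return w1_s, b1_s, w2_s

# Small random check that the smoothed pair computes the same function.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))
w1 = rng.normal(size=(16, 8))
b1 = rng.normal(size=16)
w2 = rng.normal(size=(8, 16))

h = relu(x @ w1.T + b1)                  # fc2 input before smoothing
y_ref = h @ w2.T

w1_s, b1_s, w2_s = smooth_fc1_fc2(w1, b1, w2, np.abs(h).max(axis=0))
y_smooth = relu(x @ w1_s.T + b1_s) @ w2_s.T
```

In this sketch `y_ref` and `y_smooth` agree up to floating-point error, so the smoothing itself does not change the layer's output; whether it actually helps quantization accuracy for fc2 would still need to be measured.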