I am planning an experiment to quantize the OPT model family using the SmoothQuant algorithm. However, there is an activation function between fc1 and fc2, so how should fc2 be handled?
Also, why does the code in the repository not quantize the out_proj module?