Hi, thanks for sharing the SmoothQuant paper.
I have a few questions about the parameter alpha, and I would appreciate it if you could provide more details.
How do you define outliers? Are they identified per channel, or computed over the whole activation tensor?
How do you measure the ratio of outliers? For example, the paper mentions 30% outliers; how is such a ratio obtained?
How do you choose alpha based on that ratio?
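To make the questions concrete, here is a minimal sketch of my current understanding. The smoothing factor follows the formula in the paper, s_j = max(|X_j|)^alpha / max(|W_j|)^(1 - alpha), computed per input channel j. The outlier rule, however, is only my guess and does not come from the paper: `outlier_channel_ratio` and its `factor` threshold are hypothetical names, and I am assuming an "outlier channel" is one whose absolute maximum is much larger than the median channel's.

```python
from statistics import median


def channel_abs_max(rows):
    """Per-channel (column-wise) max |x| for activations shaped [tokens][channels]."""
    return [max(abs(v) for v in col) for col in zip(*rows)]


def weight_abs_max(weight):
    """Per-input-channel max |w| for a weight shaped [in_channels][out_channels]."""
    return [max(abs(v) for v in row) for row in weight]


def outlier_channel_ratio(act_max, factor=10.0):
    """HYPOTHETICAL rule (not from the paper): a channel counts as an outlier
    if its abs-max exceeds `factor` times the median abs-max over channels."""
    threshold = factor * median(act_max)
    return sum(m > threshold for m in act_max) / len(act_max)


def smoothing_scales(act_max, w_max, alpha=0.5, eps=1e-5):
    """Per-channel s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).
    `eps` is an assumed guard against zero-valued channels."""
    return [
        max(a, eps) ** alpha / max(w, eps) ** (1.0 - alpha)
        for a, w in zip(act_max, w_max)
    ]


# Toy data: 2 tokens x 4 channels; channel 1 carries the large values.
X = [[0.1, -20.0, 0.3, 0.2],
     [-0.2, 35.0, 0.1, -0.4]]
# 4 input channels x 2 output channels.
W = [[0.5, -0.25],
     [0.25, 0.1],
     [-1.0, 0.5],
     [0.3, 0.4]]

act_max = channel_abs_max(X)
w_max = weight_abs_max(W)
ratio = outlier_channel_ratio(act_max)
scales = smoothing_scales(act_max, w_max, alpha=0.5)
```

Is this per-channel view what you mean by outliers, and is the ratio something like `ratio` above, or is it measured differently?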
Thank you.