Thank you for providing this work.
How can KSE be used with nn.Linear? In your paper, you state that "It is generic, and can be used to compress fully-connected layers by treating them as 1 × 1 convolutional layers."
However, in your implementation, it seems that KSE is only applied to conv_2d. Is there a recommended way to apply it to fully-connected layers?
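To make the question concrete, here is a minimal sketch of what I understand by "treating them as 1 × 1 convolutional layers". It only shows the standard equivalence between an nn.Linear layer and a 1 × 1 nn.Conv2d in PyTorch. The helper name `linear_to_conv1x1` is my own and is not part of this repository, and I am only assuming that the resulting Conv2d could then be passed through the existing conv-only KSE code path.

```python
import torch
import torch.nn as nn


def linear_to_conv1x1(fc: nn.Linear) -> nn.Conv2d:
    """Hypothetical helper (not from the KSE repo): copy an nn.Linear
    into an equivalent 1x1 nn.Conv2d."""
    conv = nn.Conv2d(
        fc.in_features, fc.out_features, kernel_size=1, bias=fc.bias is not None
    )
    with torch.no_grad():
        # Linear weight is (out, in); Conv2d weight is (out, in, 1, 1).
        conv.weight.copy_(fc.weight.view(fc.out_features, fc.in_features, 1, 1))
        if fc.bias is not None:
            conv.bias.copy_(fc.bias)
    return conv


torch.manual_seed(0)
fc = nn.Linear(16, 4)
conv = linear_to_conv1x1(fc)

x = torch.randn(2, 16)
y_fc = fc(x)
# Treat each feature as a channel on a 1x1 spatial map, then flatten back.
y_conv = conv(x.view(2, 16, 1, 1)).flatten(1)
```

With this conversion, `y_fc` and `y_conv` should agree up to floating-point error. What I am unsure about is whether this is the conversion you intended, since every 2D kernel in such a layer is a single scalar.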