Closed Pengxiao-Wang closed 2 years ago
Thanks for your reply.
The paper says that "ReLU can spoil sum-shift-invariance", and proposes the nonlinear activation g(y) = y^m, m > 1 to solve that problem.
Wouldn't it be better to use g(y) = y^m, m > 1 instead of ReLU?
While it is true that polynomial activations can restore shift invariance, they are generally not good alternatives to ReLU. In particular, networks with polynomial activations do not satisfy the universal approximation theorem (https://www.sciencedirect.com/science/article/abs/pii/S0893608005801315).
Because of this, we do not propose using polynomials to resolve the problem. Instead, we introduce a new downsampling method (called APS) that restores shift invariance in CNN classifiers even in the presence of ReLU.
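To illustrate the idea (this is a minimal 1-D sketch of adaptive polyphase sampling, not the repo's actual implementation): instead of always keeping the even-indexed samples when downsampling by a stride, APS selects the polyphase component with the largest norm, so a shifted input selects the same set of samples.

```python
import numpy as np

def aps_downsample(x, stride=2):
    """Sketch of adaptive polyphase sampling (APS) in 1-D:
    split x into its polyphase components and keep the one
    with the largest l2 norm, rather than always x[0::stride]."""
    components = [x[i::stride] for i in range(stride)]
    norms = [np.linalg.norm(c) for c in components]
    return components[int(np.argmax(norms))]

x = np.array([0., 3., 0., 5., 0., 1.])
shifted = np.roll(x, 1)  # circularly shift the input by one sample

# Plain strided downsampling (x[0::2]) would give [0, 0, 0] for x
# but [1, 3, 5] for the shifted input. APS keeps the same samples:
print(aps_downsample(x))        # [3. 5. 1.]
print(aps_downsample(shifted))  # [1. 3. 5.]
```

Both outputs contain the same samples (up to a circular shift), which is what makes the downsampling step compatible with shift invariance.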
Yes, the models in this repo use ReLU activations.