achaman2 / truly_shift_invariant_cnns


Still using ReLU in the code? #6

Closed Pengxiao-Wang closed 2 years ago

achaman2 commented 2 years ago

Yes, the models in this repo use ReLU activations.

Pengxiao-Wang commented 2 years ago

Thanks for your reply.

The paper says that "ReLU can spoil sum-shift-invariance", and the non-linear activation g(y) = y^m, m > 1, is proposed to solve that problem.

Isn't it better to use g(y) = y^m, m > 1, rather than ReLU?
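For concreteness, here is a rough sketch of what I mean by a drop-in polynomial activation (the name `PolyAct` and the default m = 2 are just for illustration, not something from this repo):

```python
import torch.nn as nn

class PolyAct(nn.Module):
    """Polynomial activation g(y) = y^m with m > 1 (illustrative sketch)."""
    def __init__(self, m: int = 2):
        super().__init__()
        self.m = m

    def forward(self, y):
        # Elementwise power, applied in place of ReLU.
        return y ** self.m
```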

achaman2 commented 2 years ago

While it is true that polynomial activations can restore shift invariance, they are often not good alternatives to ReLU. For example, networks with polynomial activations are not universal approximators (https://www.sciencedirect.com/science/article/abs/pii/S0893608005801315).

Because of this, we do not propose using polynomials to resolve the problem. Instead, we introduce a new downsampling method, called APS, that restores shift invariance in CNN classifiers even in the presence of ReLU.
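To give a rough idea, here is a simplified sketch of the kind of norm-based polyphase selection APS performs during stride-2 downsampling. This is only an illustration under the assumption of even spatial dimensions, not the actual implementation in this repo:

```python
import torch
import torch.nn as nn

class APSDownsample2d(nn.Module):
    """Sketch of adaptive polyphase sampling with stride 2.

    Instead of always keeping the pixels on the even grid, pick the
    polyphase component (one of the four stride-2 grids) with the
    largest l2 norm, so the selected grid tracks shifts of the input.
    Assumes the input has even height and width.
    """
    def forward(self, x):
        # The four polyphase components of a stride-2 grid: (B, C, H/2, W/2) each.
        comps = [x[:, :, i::2, j::2] for i in range(2) for j in range(2)]
        # l2 norm of each component, per sample in the batch: (B, 4).
        norms = torch.stack([c.flatten(1).norm(dim=1) for c in comps], dim=1)
        idx = norms.argmax(dim=1)  # index of the max-norm component per sample
        # Gather the selected component for each sample: (B, C, H/2, W/2).
        return torch.stack([comps[k][b] for b, k in enumerate(idx.tolist())])
```

With a circular shift of the input, the max-norm grid moves along with the shift, so the downsampled output stays the same up to a shift; plain `x[:, :, ::2, ::2]` subsampling does not have this property.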