bioinf-jku / SNNs

Tutorials and implementations for "Self-normalizing networks"
GNU General Public License v3.0

Effect of bias in linear layers #16

Closed ptrcarta closed 3 years ago

ptrcarta commented 3 years ago

I've been experimenting with SELUs and found that they improve training computation time compared to batch normalization; thank you for your work.

I just have a question about the effect of bias in linear layers. As I understand it, every unit's activation should have zero mean in order to stay in the self-normalizing regime, but a bias term shifts precisely that mean. In my experiments, however, I didn't see much of an effect from either removing or adding biases. I see that the tutorial notebook uses biases, and I wonder whether you've considered this issue.
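
For reference, this is roughly the kind of comparison I ran (a minimal sketch in PyTorch with placeholder layer sizes, not my actual training code); the only thing I varied was the bias flag of the linear layers:

```python
import torch.nn as nn

def make_selu_mlp(in_dim, hidden_dim, out_dim, use_bias):
    # identical architecture in both runs; only the bias flag differs
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim, bias=use_bias), nn.SELU(),
        nn.Linear(hidden_dim, hidden_dim, bias=use_bias), nn.SELU(),
        nn.Linear(hidden_dim, out_dim, bias=use_bias),
    )

net_with_bias = make_selu_mlp(784, 256, 10, use_bias=True)
net_without_bias = make_selu_mlp(784, 256, 10, use_bias=False)
```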

gklambauer commented 3 years ago

Dear ptrcarta, thanks, good point! We have experimented a lot with SNNs with and without bias units. In wide networks they hardly play a role. My hypothesis is that this is due to the following: a) SELUs counter the bias shift well and keep activations close to zero mean, which is good for learning, and b) in wide layers, any unit can learn to take on the role of a bias unit. However, at the output layer, bias units can help, especially if you have unbalanced data. Hope this helps!