lukasruff / Deep-SVDD-PyTorch

A PyTorch implementation of the Deep SVDD anomaly detection method
MIT License

Why are the batchnorm mean and var not updated during training? #31

Open RichardChangCA opened 2 years ago

RichardChangCA commented 2 years ago

Hello, thanks for your source code.

Could I ask why the batchnorm mean and var are not updated during training? As far as I understand, affine=False means that the batch normalization parameters are not updated.

Thanks

Callidior commented 2 years ago

affine=False means that the normalization step of batch norm is not followed by a learnable linear scaling and offset of the form a * x + b. The affine argument only relates to these learnable parameters a and b, not to the mean and variance used for the normalization, which are still estimated from the data during training.
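To make the distinction concrete, here is a minimal sketch using PyTorch's nn.BatchNorm1d. The layer size and the toy batch are assumptions chosen for illustration and are not taken from the repository. It checks that a layer built with affine=False has no learnable parameters, while its running mean is still updated by a forward pass in training mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Batch norm without the learnable scale a and offset b.
# The feature size 4 is an arbitrary choice for this sketch.
bn = nn.BatchNorm1d(4, affine=False)

# With affine=False, weight and bias are None, so nothing is learned by gradient descent.
num_learnable = len(list(bn.parameters()))

# The running statistics are buffers, not parameters. They start at mean 0, variance 1.
mean_before = bn.running_mean.clone()

bn.train()
x = torch.randn(32, 4) * 3.0 + 5.0  # toy batch with a mean far from 0
_ = bn(x)                           # forward pass in training mode updates the buffers

stats_changed = not torch.allclose(mean_before, bn.running_mean)

print("learnable parameters:", num_learnable)
print("running_mean changed:", stats_changed)
```

So the normalization statistics keep tracking the data; only the trainable scale and offset are missing.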

For Deep SVDD, it is crucial to disable these affine transformations, because the offset b acts as a bias term. As stated in Section 3.3 of the paper:

Put differently, Proposition 2 implies that networks with bias terms can easily learn any constant function, which is independent of the input x ∈ X. It follows that bias terms should not be used in neural networks with Deep SVDD since the network can learn the constant function mapping directly to the hypersphere center, leading to hypersphere collapse.

Intuitively, if your network contains a bias term, the last layer could just learn to set all weights to zero and the bias to the center c, mapping everything to the center without even taking the input data into account.
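A small dependency-free sketch of this collapse is shown below. The center c, the inputs, and the helper names linear and sq_dist are all hypothetical and chosen only for illustration. The loss is the mean squared distance of the outputs to c.

```python
def linear(x, weights, bias):
    """Affine layer: y[j] = sum_i weights[j][i] * x[i] + bias[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


c = [0.5, -1.0]  # hypothetical hypersphere center
inputs = [[1.0, 2.0, 3.0], [-4.0, 0.0, 7.5], [100.0, -100.0, 0.0]]

# Collapsed solution: all weights zero, bias equal to c.
zero_w = [[0.0] * 3 for _ in c]
collapsed = [linear(x, zero_w, c) for x in inputs]
collapsed_loss = sum(sq_dist(y, c) for y in collapsed) / len(inputs)

# Without a bias, zero weights map every input to the origin, not to c.
no_bias = [linear(x, zero_w, [0.0, 0.0]) for x in inputs]
no_bias_loss = sum(sq_dist(y, c) for y in no_bias) / len(inputs)

print("with bias:   ", collapsed, "loss =", collapsed_loss)
print("without bias:", no_bias, "loss =", no_bias_loss)
```

With the bias, every input lands exactly on c and the loss is zero, even though the network ignores its input entirely. Without the bias, this particular trivial solution only reaches the origin.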