idiap / fullgrad-saliency

Full-gradient saliency maps

What if the affine of batch normalization is False? #9

Closed larenzhang closed 3 years ago

larenzhang commented 3 years ago

In my application, the 'affine' flag of batch normalization is False, which means that the bias and weight of batch normalization are None. However, I get the following error: [screenshot of the error traceback]

I guess this is caused by incorrect statistics for the bias and the bias gradient. Even if I add the implicit bias -m/s of the batch norm to the bias list, there are no gradients for this term. Is there a way to fix this error?
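To make the -m/s term concrete, here is a small sketch (the helper name implicit_bn_bias is just for illustration, not from the codebase): because running_mean and running_var are buffers rather than Parameters, this bias has no gradient of its own.

```python
import torch
import torch.nn as nn

# Hypothetical helper: the effective bias that a BatchNorm2d layer with
# affine=False still adds in eval mode, i.e. -m/s.
def implicit_bn_bias(bn: nn.BatchNorm2d) -> torch.Tensor:
    sigma = torch.sqrt(bn.running_var + bn.eps)
    return -bn.running_mean / sigma  # built from buffers, so no .grad attached

bn = nn.BatchNorm2d(8, affine=False).eval()
print(bn.weight, bn.bias)       # both None when affine=False
print(implicit_bn_bias(bn))     # the -m/s term I would like to add to the bias list
```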

Looking forward to hearing from you.

suraj-srinivas commented 3 years ago

Hi! The codebase also handles the case where batchnorm bias and weights are None. Which model are you using? For the standard models I tried, I do not get this issue. Does the model only use ReLU activations?
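Regarding the activation question, a quick way to check what the model actually contains is something like this throwaway sketch (module_summary is not part of this repo):

```python
from collections import Counter
import torch.nn as nn
import torchvision.models as models

# Count the module types in a model, to see which normalization and
# activation layers are present.
def module_summary(model: nn.Module) -> Counter:
    return Counter(type(m).__name__ for m in model.modules())

print(module_summary(models.resnet18()))
# e.g. Counter({'Conv2d': 20, 'BatchNorm2d': 20, 'ReLU': 9, ...}) for ResNet-18
```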

However, I have also occasionally come across this issue with some unusual models; I think it is caused by errors introduced by the finite precision of the gradients. Essentially, batch norm ensures that all activations, and hence gradients, are properly scaled, which reduces finite-precision error. However, this is only a hypothesis and I haven't verified it yet.

suraj-srinivas commented 3 years ago

I'm closing this issue because of inactivity. Feel free to reopen if you want to continue the discussion. It would also help if you could provide a minimal example so I can reproduce this.
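For reference, a minimal reproduction could look something like this (a sketch that assumes the FullGrad usage shown in the repository README: FullGrad(model), checkCompleteness(), saliency(); the TinyNet model is made up purely to exercise the affine=False case):

```python
import torch
import torch.nn as nn
from saliency.fullgrad import FullGrad

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1),
            nn.BatchNorm2d(8, affine=False),   # weight and bias are None here
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, 10)

    def forward(self, x):
        x = self.features(x)
        return self.fc(x.flatten(1))

model = TinyNet().eval()
fullgrad = FullGrad(model)
fullgrad.checkCompleteness()                           # presumably where the error surfaces
saliency = fullgrad.saliency(torch.randn(1, 3, 224, 224))
```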