Hi,
There is an assert statement that prevents us from backpropagating through a batch normalization layer when the model is in eval mode. Is there a reason for this? In PyTorch, there is no such restriction, and keeping the statistics of the batchnorm layer fixed while computing gradients should not be an issue, in my opinion.
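For reference, here is a minimal PyTorch sketch of the behaviour I mean (assuming a plain `nn.BatchNorm1d` layer): in eval mode the running statistics are frozen, yet `backward()` still runs and produces gradients for the input and the affine parameters.

```python
import torch
import torch.nn as nn

# Minimal check: gradients flow through BatchNorm in eval mode in PyTorch.
bn = nn.BatchNorm1d(4)
bn.eval()  # running statistics are frozen and used for normalization

x = torch.randn(8, 4, requires_grad=True)
y = bn(x)          # normalizes with the fixed running mean/var
loss = y.sum()
loss.backward()    # no error; gradients reach the input and the affine params

print(x.grad.shape)          # torch.Size([8, 4])
print(bn.weight.grad.shape)  # torch.Size([4])
```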