Open LuChengTHU opened 5 years ago
The moving average interacts strongly with batch normalization, because it simply ignores the learned weights from the BN layers when copying the model. Can you try deactivating the moving average in the code and running it again? Also note that our theoretical results are local in nature, so we cannot guarantee global convergence for every possible architecture you come up with :-). However, from your image it appears that the algorithm is doing something sensible.
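To illustrate the kind of interaction described in this reply, here is a minimal, framework-free sketch. It assumes (this is not confirmed anywhere in the thread) that the averaging step only touches trainable parameters and skips the BN running statistics. The dictionary layout, the helper `update_average`, its `include_buffers` flag, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
# Toy stand-in for a network with one batch-norm layer (no framework needed).
# "params" are what an optimizer would train; "buffers" hold BN running stats.
def make_model():
    return {
        "params": {"conv.weight": 0.0, "bn.weight": 1.0, "bn.bias": 0.0},
        "buffers": {"bn.running_mean": 0.0, "bn.running_var": 1.0},
    }


def update_average(avg, src, beta=0.999, include_buffers=False):
    """Exponential moving average of src into avg (hypothetical helper)."""
    for name, value in src["params"].items():
        avg["params"][name] = beta * avg["params"][name] + (1.0 - beta) * value
    if include_buffers:
        # Running statistics are copied directly instead of being averaged.
        for name, value in src["buffers"].items():
            avg["buffers"][name] = value


# Pretend training has moved both the weights and the BN running statistics.
trained = make_model()
trained["params"].update({"conv.weight": 2.0, "bn.weight": 1.5, "bn.bias": 0.3})
trained["buffers"].update({"bn.running_mean": 4.0, "bn.running_var": 9.0})

avg_params_only = make_model()
avg_with_buffers = make_model()
for _ in range(10000):
    update_average(avg_params_only, trained)
    update_average(avg_with_buffers, trained, include_buffers=True)

# The parameters-only average keeps the initial running statistics,
# so its BN layer would normalize with stale mean and variance.
print(avg_params_only["buffers"])
print(avg_with_buffers["buffers"])
```

Under these assumptions, the averaged copy ends up with trained weights but untrained normalization statistics, which is one plausible way an averaged model with BN can behave badly. Deactivating the moving average, as suggested above, is the simplest way to check whether this is what is happening.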
Hello, I just added BN to the resblock and ran 'python train.py configs/celebA-HQ', and now training does not converge at all (see the picture below).
I noticed the moving average in your training code, but I think its influence on BN is quite small. I cannot understand why training fails to converge once BN is added. Looking forward to your reply!