Closed zw615 closed 5 years ago
Hi, the code provided currently does not support batch norm. You can add batch norm support by either (1) using batch norm always in eval mode (`track_running_stats=False`), or (2) adding code to track the buffers and include them in the autograd graph.
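A minimal sketch of option (1), assuming a standard PyTorch model; the `ConvBlock` module below is illustrative and not part of this repo, the key point is constructing `nn.BatchNorm2d` with `track_running_stats=False` so there are no running-statistics buffers to keep fixed across gradient steps:

```python
import torch.nn as nn

class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # track_running_stats=False: batch statistics are used in both
        # train and eval mode, and no running_mean/running_var buffers
        # are created, so nothing needs to be frozen or tracked.
        self.bn = nn.BatchNorm2d(out_ch, track_running_stats=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))
```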
Hello! I've noticed your warning.
May I ask how you keep buffers fixed during gradient steps (e.g., the running mean and running variance in batch norm)? This code only includes LeNet and AlexNet, so it isn't a problem here, but I wonder whether you have run experiments on networks with batch norm.
Thanks a lot!