ssnl / dataset-distillation

Open-source code for the paper "Dataset Distillation"
https://ssnl.github.io/dataset_distillation
MIT License

How do you keep buffer fixed during gradient steps #20

Closed zw615 closed 5 years ago

zw615 commented 5 years ago

Hello! I've noticed your warning:

logging.warn(('{} contains buffer {}. The buffer will be treated as '
                        'a constant and assumed not to change during gradient '
                        'steps. If this assumption is violated (e.g., '
                        'BatchNorm*d\'s running_mean/var), the computation will '
                        'be incorrect.').format(m.__class__.__name__, n))

May I ask how you keep the buffers fixed during gradient steps (e.g., the running mean and running variance in batch norm)? In this code there are only LeNet and AlexNet, so this isn't a problem, but I wonder whether you have experimented with networks that use batch norm.

Thanks a lot!

ssnl commented 5 years ago

Hi, the code provided currently does not support batch norm. You can add batch norm support by either (1) always using batch norm in eval mode (track_running_stats=False) or (2) adding code to track the buffers and include them in the autograd graph.
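
A minimal sketch of option (1), assuming a standard PyTorch model; the helper name make_bn_constant is my own and is not part of this repo. It swaps every BatchNorm*d layer for one built with track_running_stats=False, so the module carries no running_mean/running_var buffers and normalization always uses the current batch statistics:

    import torch.nn as nn

    def make_bn_constant(module: nn.Module) -> nn.Module:
        """Recursively replace BatchNorm*d layers with versions that do not
        track running statistics, so no buffers change during gradient steps."""
        for name, child in module.named_children():
            if isinstance(child, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
                bn_cls = type(child)
                setattr(module, name, bn_cls(
                    child.num_features,
                    eps=child.eps,
                    momentum=child.momentum,
                    affine=child.affine,
                    track_running_stats=False,  # no running_mean / running_var buffers
                ))  # note: affine weight/bias are re-initialized here; copy them if needed
            else:
                make_bn_constant(child)
        return module

With track_running_stats=False the layer normalizes with batch statistics in both train and eval mode, so the warning above no longer applies. Option (2) would instead require differentiating through the buffer updates themselves.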