optim.zero_grad() - Githubissues

datawhalechina / thorough-pytorch

An introductory PyTorch tutorial. Read it online at: https://datawhalechina.github.io/thorough-pytorch/

optim.zero_grad() #8

Closed YanjiNing closed 2 years ago

YanjiNing commented 2 years ago

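Besides clearing the accumulated gradients, optim.zero_grad() has another use: if it is called only once across several batches, the gradients of those batches add up, which is equivalent to increasing batch_size. In other words, calling it once every n batches makes the effective batch size n times larger.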
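For reference, a minimal sketch of this accumulation pattern. The model, data shapes, and the names `n`, `full_grad`, and `accum_grad` are illustrative assumptions, not code from the tutorial. One detail worth noting: backward() sums gradients into .grad, so with a mean-reduced loss each micro-batch loss is divided by n here to match the gradient of one large batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: a single linear layer and random data.
model = nn.Linear(4, 1)
optim = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()  # default reduction="mean"

x = torch.randn(8, 4)
y = torch.randn(8, 1)
n = 4  # number of micro-batches accumulated per optimizer step

# Reference: gradient from one backward pass over the full batch of 8.
optim.zero_grad()
loss_fn(model(x), y).backward()
full_grad = model.weight.grad.clone()

# Accumulation: call zero_grad() once, then run n backward passes
# over micro-batches of 2 samples each.
optim.zero_grad()
for xb, yb in zip(x.chunk(n), y.chunk(n)):
    # Divide by n so the summed gradients equal the full-batch mean gradient.
    (loss_fn(model(xb), yb) / n).backward()
accum_grad = model.weight.grad.clone()

# Compare the two gradients up to floating-point tolerance.
grads_match = torch.allclose(full_grad, accum_grad, atol=1e-5)
print(grads_match)

optim.step()  # a single parameter update for all n micro-batches
```

Without the division by n, the accumulated gradient would be n times the full-batch gradient, which acts like a larger learning rate rather than only a larger batch.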

LiJiaqi96 commented 2 years ago

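Yes. It may be better to think of it this way: each call to zero_grad() clears the gradients once, and if you do not call it, the gradients accumulate across batches. That said, describing optim.zero_grad() as a way to increase the batch size is slightly inaccurate. In more complex machine learning tasks, a larger batch size can have other effects that accumulation does not reproduce, for example on BatchNormalization, whose statistics are computed per batch.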
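A small sketch of the BatchNormalization caveat, under assumed toy settings (the helper `make_model`, the layer sizes, and the random data are all hypothetical). In training mode nn.BatchNorm1d normalizes with the statistics of whatever batch it receives, so two micro-batches of 4 samples are normalized differently from one batch of 8, and the accumulated gradient no longer matches the full-batch gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 4)
y = torch.randn(8, 1)
loss_fn = nn.MSELoss()
n = 2  # two micro-batches of 4 samples each


def make_model():
    # Hypothetical helper: re-seed so both models start from identical weights.
    torch.manual_seed(1)
    return nn.Sequential(nn.Linear(4, 3), nn.BatchNorm1d(3), nn.Linear(3, 1))


# Full batch: BatchNorm1d uses the mean and variance of all 8 samples.
full = make_model()  # modules are in training mode by default
loss_fn(full(x), y).backward()
full_grad = full[0].weight.grad.clone()

# Accumulation: BatchNorm1d sees only 4 samples at a time.
accum = make_model()
for xb, yb in zip(x.chunk(n), y.chunk(n)):
    (loss_fn(accum(xb), yb) / n).backward()
accum_grad = accum[0].weight.grad.clone()

# Compare the gradients of the first linear layer.
bn_grads_match = torch.allclose(full_grad, accum_grad, atol=1e-5)
print(bn_grads_match)
```

So gradient accumulation reproduces a larger batch for the loss gradient of layers that treat samples independently, but not for layers whose forward pass depends on batch-level statistics.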