datawhalechina / thorough-pytorch

An introductory PyTorch tutorial. Read online at: https://datawhalechina.github.io/thorough-pytorch/

optim.zero_grad() #8

Closed · YanjiNing closed this issue 2 years ago

YanjiNing commented 2 years ago

Besides clearing accumulated gradients, optim.zero_grad() has a second use: since .backward() adds new gradients onto the existing .grad buffers, calling zero_grad() only once every n batches accumulates gradients over those n mini-batches. This is equivalent to training with a batch size n times larger, i.e., you can effectively multiply the batch size by n.
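A minimal sketch of this gradient-accumulation pattern, assuming a toy model, synthetic data, and an illustrative accum_steps of 4 (none of these names come from the thread):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy model and data, just to make the loop runnable.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
loader = DataLoader(TensorDataset(torch.randn(64, 10),
                                  torch.randint(0, 2, (64,))),
                    batch_size=8)

accum_steps = 4  # update once every 4 mini-batches -> "virtual" batch of 32

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = criterion(model(x), y)
    # Scale the loss so the summed gradients equal the mean over the
    # larger virtual batch rather than the sum of 4 mini-batch means.
    (loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()  # clear gradients only after the accumulated update
```

Note the loss is divided by accum_steps so that the accumulated gradient matches the average over the virtual batch; without this scaling the effective learning rate would grow by a factor of accum_steps.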

LiJiaqi96 commented 2 years ago

Yes. One way to think about it: each call to zero_grad() resets the gradients, so skipping the call accumulates gradients across multiple batches. However, saying that optim.zero_grad() "increases the batch size" is slightly inaccurate. For more complex machine learning tasks, a genuinely larger batch has effects that gradient accumulation does not reproduce, for example on BatchNormalization, whose statistics are computed over the actual mini-batch.
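A small sketch of this caveat, assuming an illustrative BatchNorm1d layer and random input (names are hypothetical): in training mode BatchNorm normalizes with the statistics of the current mini-batch, so two accumulated half-batches are not normalized the same way as one full batch.

```python
import torch
from torch import nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)  # freshly created modules are in training mode
x = torch.randn(8, 4)

full = bn(x)                                # normalized with stats of all 8 samples
halves = torch.cat([bn(x[:4]), bn(x[4:])])  # each half normalized with its own 4-sample stats
print(torch.allclose(full, halves))         # False: micro-batches see different statistics
```

So gradient accumulation reproduces the gradient of a larger batch for the loss itself, but batch-dependent layers like BatchNorm still behave as if the batch size were the small per-step one.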