Closed wppply closed 6 years ago
In the README.md, you claim that the iter_size option increases the effective batch size to batchsize * iter_size.
However, I did not see any code for this other than dividing the loss by iter_size.
Could you give more details? Thanks
Gradients are accumulated for iter_size iterations and then applied, so the weights are updated only once every iter_size iterations. The division you are referring to averages the accumulated gradients.
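To make the equivalence concrete, here is a minimal pure-Python sketch (hypothetical toy loss and function names, not the repository's actual code) showing that accumulating iter_size per-batch gradients, each divided by iter_size, and applying them once gives the same update as one step on a batch of size batchsize * iter_size:

```python
def grad(w, batch):
    # Toy gradient of the loss 0.5 * (w - x)^2 averaged over a batch.
    return sum(w - x for x in batch) / len(batch)

def sgd_accumulated(w, batches, lr, iter_size):
    # Accumulate gradients for iter_size iterations, then apply once.
    acc = 0.0
    for i, batch in enumerate(batches, 1):
        acc += grad(w, batch) / iter_size   # division averages the gradients
        if i % iter_size == 0:
            w -= lr * acc                   # one update per iter_size iterations
            acc = 0.0
    return w

def sgd_large_batch(w, batches, lr, iter_size):
    # Equivalent update with one batch of size batchsize * iter_size.
    for i in range(0, len(batches), iter_size):
        big = [x for b in batches[i:i + iter_size] for x in b]
        w -= lr * grad(w, big)
    return w

batches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
w_acc = sgd_accumulated(0.0, batches, lr=0.1, iter_size=2)
w_big = sgd_large_batch(0.0, batches, lr=0.1, iter_size=2)
# Both paths end at the same weight.
```

This is why dividing the loss by iter_size is the only extra arithmetic needed: the accumulation itself happens implicitly by not zeroing and not applying the gradients until iter_size iterations have passed.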