@junjieAI Yes, your understanding is right. The `iter_size` parameter is used to accumulate gradients over several forward/backward passes, so you can obtain a large effective batch size with fewer GPUs.
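For reference, here is a minimal sketch of the accumulation idea. This is not Caffe's actual code; the toy linear model, names, and shapes are made up purely to illustrate what `iter_size` does: run `iter_size` forward/backward passes, sum the gradients, then apply a single weight update.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 1))   # toy model: one linear layer
lr = 0.1
iter_size = 4                 # accumulation steps
batch_size = 8                # images per forward pass

grad_sum = np.zeros_like(W)
for _ in range(iter_size):
    X = rng.normal(size=(batch_size, 4))   # one mini-batch
    y = rng.normal(size=(batch_size, 1))
    err = X @ W - y                        # forward pass
    grad = X.T @ err / batch_size          # gradient of 0.5 * mean squared error
    grad_sum += grad                       # accumulate, no update yet

# One update, normalized by iter_size: the same gradient you would get
# from a single pass over iter_size * batch_size = 32 images.
W -= lr * grad_sum / iter_size
```

Only the gradients are accumulated, so GPU memory usage stays at the cost of one `batch_size`-sized pass.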
@sfzhang15 OK, thanks. Another question: I've tried two different settings with one GPU (a 1080Ti). First I used iter_size=4 with batch_size=8, and training runs fine; with iter_size=1, batch_size can't exceed 12. So with one GPU, if we keep batch_size=8 and set iter_size to 4, 2, or 1, is the actual batch size still 8 rather than iter_size*8? (I ask because with iter_size=1, batch_size can't exceed 12.)
@junjieAI
- If iter_size=1 and batch_size=8 with one GPU, the total batch size is 8.
- If iter_size=1 and batch_size=8 with four GPUs, the total batch size is 32.
- If iter_size=4 and batch_size=8 with one GPU, the total batch size is 32.
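The arithmetic above reduces to one formula. The sketch below restates it as a hypothetical helper (`effective_batch` is not a Caffe function, just a name for the product), with the three cases as checks:

```python
def effective_batch(iter_size: int, batch_size: int, num_gpus: int) -> int:
    """Images contributing to a single weight update."""
    return iter_size * batch_size * num_gpus

# The three cases above:
assert effective_batch(1, 8, 1) == 8
assert effective_batch(1, 8, 4) == 32
assert effective_batch(4, 8, 1) == 32
```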
@sfzhang15 Hi, while experimenting with your ideas, the iter_size parameter has been confusing me. I know the definition of iter_size, but what exactly is it used for? Here is my own understanding: if we have one GPU and set iter_size=4, the mini-batch for this GPU is 4*8=32? If we have two GPUs and iter_size=2, the mini-batch for each GPU is 2*8=16? And since you have four GPUs and iter_size=1, the mini-batch for each of the four GPUs is 1*8=8? It would be much appreciated if you could correct me, thanks!