holzbock opened this issue 3 years ago
I also have a question about the number of iterations. I understand that the number of eval steps should be calculated dynamically with the formula below:

eval_steps (iterations per epoch) = total number of training images / batch size

For 4000 images and a batch size of 64, eval_steps = 4000 / 64 ≈ 63 iterations per epoch.

However, I do not reach the reported accuracy when I set eval_steps to 63 and train for 1000 epochs.

How should the unlabeled batch size coefficient (mu) and eval_steps be set properly?
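For reference, a minimal sketch of the calculation above (the variable names are mine, not from the repo):

```python
import math

# Values from the question above: 4000 labeled images, batch size 64.
num_train_images = 4000
batch_size = 64

# One epoch = one pass over the labeled data; round up so no images are dropped.
eval_steps = math.ceil(num_train_images / batch_size)  # 63
epochs = 1000

total_iterations = eval_steps * epochs  # 63,000 optimizer steps overall
print(f"eval_steps per epoch: {eval_steps}, total iterations: {total_iterations}")
```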
If the model is trained on multiple GPUs, the total batch size grows (batch_size_total = batch_size * num_gpus), but the number of eval_steps per epoch stays the same. As a result, the overall number of iterations in training increases by a factor of the number of GPUs. In the original TensorFlow implementation, the overall number of iterations is independent of the number of GPUs because the batch is divided across them. I'm not 100% sure about this, but if it's right, either the number of eval_steps per epoch should be reduced or the batch should be divided across the GPUs, so that the overall number of iterations stays constant when using multiple GPUs.
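If that reading is correct, one possible fix is to divide by the effective (total) batch size when computing the steps per epoch. A sketch under that assumption (`torch.cuda.device_count()` is a real PyTorch call; the remaining names are illustrative, not from the repo):

```python
import math

import torch

# Hypothetical per-GPU settings; names are illustrative.
batch_size_per_gpu = 64
num_gpus = max(torch.cuda.device_count(), 1)  # fall back to 1 on CPU-only machines
num_train_images = 4000

# With DataParallel-style training the effective batch grows with the GPU count,
# so divide it back out when computing steps per epoch. This keeps the total
# number of iterations independent of the number of GPUs.
batch_size_total = batch_size_per_gpu * num_gpus
eval_steps = math.ceil(num_train_images / batch_size_total)

print(f"GPUs: {num_gpus}, effective batch: {batch_size_total}, "
      f"eval_steps per epoch: {eval_steps}")
```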