yyliu01 / PS-MT

[CVPR'22] Perturbed and Strict Mean Teachers for Semi-supervised Semantic Segmentation
https://arxiv.org/pdf/2111.12903.pdf
MIT License
186 stars 17 forks

Questions about performance by batch size #24

Closed DeepHM closed 1 year ago

DeepHM commented 1 year ago

Hello. First of all, thank you for sharing your wonderful research!

I have some questions. I am comparing your work with CPS, which is a top-3 benchmark on semi-supervised semantic segmentation (Pascal VOC dataset). According to your paper and code, training uses an effective batch size of gpus(4) * batch_size(8) = 32. However, according to the CPS paper and code, each batch contains 8 labeled and 8 unlabeled samples. Therefore, to ensure a fair comparison, I kept all other options unchanged and trained your code with a ResNet-50 backbone, a batch size of 8 (i.e., 2 GPUs * 4 per GPU), and 80 epochs for all labeled ratios. The CPS implementation was likewise trained for 80 epochs for all labeled ratios.

Below are the results of my re-implementation (CPS vs. PS-MT).

CPS:

|                   | 1/8   | 1/4      | 1/2   |
|-------------------|-------|----------|-------|
| Score at epoch 80 | 73.74 | 72.86    | 75.77 |
| Best score        | 74.07 | 74.39    | 75.80 |
| Best epoch        | 58    | 28 or 80 | 80    |

PS-MT:

|                   | 1/8    | 1/4    | 1/2    |
|-------------------|--------|--------|--------|
| Score at epoch 80 | 74.09  | 75.406 | 75.651 |
| Best score        | 74.937 | 75.557 | 75.786 |
| Best epoch        | 67     | 66     | 78     |

These results differ from those reported in your paper. I also suspect that batch size matters considerably more in semi-supervised learning than it does in supervised learning.

I have a few questions:

  1. Is there any problem with my re-implementation described above?
  2. What is your opinion on the importance of batch size in semi-supervised learning (specifically semi-supervised semantic segmentation)? I would also appreciate any pointers to relevant references.
yyliu01 commented 1 year ago

Hi @DeepHM ,

First of all, CPS also uses multiple GPUs for training; the batch size of 8 is per GPU. They use 4 GPUs for VOC training and 8 GPUs for Cityscapes training. Please see their provided training logs.
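For clarity, here is a minimal sketch (not taken from either repo) of how a per-GPU batch size translates into the effective batch size under PyTorch DistributedDataParallel:

```python
# Minimal sketch, assuming a standard PyTorch DDP setup (not from the CPS or
# PS-MT code): each process computes gradients over its own per-GPU mini-batch
# and DDP averages them, so one optimiser step effectively covers
# per_gpu_batch_size * world_size samples.
import torch.distributed as dist

def effective_batch_size(per_gpu_batch_size: int) -> int:
    world_size = dist.get_world_size() if dist.is_initialized() else 1
    return per_gpu_batch_size * world_size

# CPS on VOC: 8 per GPU on 4 GPUs -> 32; PS-MT on VOC: 8 per GPU on 4 GPUs -> 32.
```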

If you don't have enough hardware resources (as your question suggests) and therefore have to decrease the batch size, did you also fine-tune (i.e., decrease) the learning rate? And given that a larger batch size converges faster, did you increase the number of training epochs when you decreased the batch size?
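For example, a common heuristic is the linear scaling rule, i.e. scaling the learning rate in proportion to the effective batch size; a minimal sketch (a general rule of thumb, not something our released code enforces):

```python
# Linear scaling rule: scale the learning rate in proportion to the change in
# the effective batch size. This is a common heuristic, not part of PS-MT itself.
def scaled_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    return base_lr * new_batch_size / base_batch_size

# e.g. if the learning rate was tuned for an effective batch size of 32,
# training with a batch size of 8 would use roughly base_lr / 4.
```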

I believe halving the batch size will not affect the final performance too much, but I can't guarantee the results if you train with a batch size of only 8...

Cheers, Yuyuan

yyliu01 commented 1 year ago

I'm closing the issue. Please feel free to reopen it if you can't achieve the reported performance based on our setting.