facebookresearch / unbiased-teacher

PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection
https://arxiv.org/abs/2102.09480
MIT License

Loss_box_reg starts at zero and increases during training #37

Closed ogamache closed 2 years ago

ogamache commented 2 years ago

Hello, really nice work here!

I am trying to train your network, but I get weird behavior related to Loss_box_reg and Loss_box_reg_pseudo. The losses start around zero and then increase, instead of starting at around 0.4 as in your article. Here is an example of the results I obtain:

[Screenshot from 2021-08-07 12-39-30: training loss curves]

The problem appears at the beginning of the BURN_IN stage and again when the unsupervised learning starts.

Since I only have access to 1 GPU, I am using a smaller batch size (label: 4 images, unlabel: 4 images). I tried reducing the learning rate and also reducing UNSUP_LOSS_WEIGHT from 4 to 2, but those modifications didn't help.
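For reference, here is roughly how I apply those overrides for a single GPU, using Detectron2's config API. This is just a sketch: the config path and the exact key names (SOLVER.IMG_PER_BATCH_LABEL, SOLVER.IMG_PER_BATCH_UNLABEL, SEMISUPNET.UNSUP_LOSS_WEIGHT) are what I understood from the README and train_net.py, so please correct me if any of them are wrong:

```python
# Sketch of single-GPU config overrides (key names and config path assumed
# from the repo's configs and train_net.py; adjust if they differ).
from detectron2.config import get_cfg
from ubteacher import add_ubteacher_config  # registers the SEMISUPNET keys

cfg = get_cfg()
add_ubteacher_config(cfg)
cfg.merge_from_file("configs/coco_supervision/faster_rcnn_R_50_FPN_sup1_run1.yaml")

cfg.merge_from_list([
    "SOLVER.IMG_PER_BATCH_LABEL", 4,      # labeled images per iteration
    "SOLVER.IMG_PER_BATCH_UNLABEL", 4,    # unlabeled images per iteration
    "SOLVER.BASE_LR", 0.0025,             # scaled down roughly with the batch size
    "SEMISUPNET.UNSUP_LOSS_WEIGHT", 2.0,  # reduced from the default 4
])
```

I then pass this cfg to the trainer the same way train_net.py does.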

Do you have an idea why this weird behavior happens? Thanks a lot!

ycliu93 commented 2 years ago

There are two major reasons why this trend occurs.

  1. The regression loss is only computed on foreground boxes, so there are very few pseudo-boxes at the beginning of the mutual learning stage. As the number of pseudo-boxes grows, the regression loss may increase as well.

  2. We actually do not apply the unsupervised regression loss during training, since we found it degrades performance when applied (pseudo-boxes selected by the classification score do not necessarily have accurate box locations for the student's learning). See the sketch below this list.

I believe this is not related to the number of GPUs you use. One thing I am curious about: do you get better Box AP results even though the regression loss on unsupervised data increases?
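To make point 2 concrete, here is a simplified sketch of how the loss terms can be combined so that the pseudo-label regression losses contribute nothing. This is only an illustration of the idea, not a verbatim copy of the trainer code; the key names such as loss_box_reg_pseudo follow the log output shown above:

```python
# Simplified sketch: combine supervised and pseudo-label losses, zeroing the
# regression terms computed on pseudo-boxes.
def combine_losses(record_dict, unsup_loss_weight):
    loss_dict = {}
    for key, value in record_dict.items():
        if not key.startswith("loss"):
            continue  # skip non-loss metrics
        if key.endswith("_pseudo"):
            if "box_reg" in key or "rpn_loc" in key:
                # Do not backprop regression losses from pseudo-boxes: their
                # locations are selected by classification score only.
                loss_dict[key] = value * 0
            else:
                loss_dict[key] = value * unsup_loss_weight
        else:
            # Supervised losses keep weight 1.
            loss_dict[key] = value
    return loss_dict
```

In this sketch the total loss is simply `sum(loss_dict.values())`, so a rising regression value on pseudo-boxes does not by itself mean the model is being pushed in the wrong direction.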

Lydiagugugaga commented 2 years ago

@ycliu93 Thanks for your great work! I have the same problem: 1 GPU, with a smaller batch size (label: 6, unlabel: 6).
In the end I get worse Box AP results: AP = 10 (COCO-standard, 1% supervision). Loss_box_reg and Loss_box_reg_pseudo do not decline, and the total loss stays around 0.4 to 0.5 the whole time.