facebookresearch / unbiased-teacher

PyTorch code for ICLR 2021 paper Unbiased Teacher for Semi-Supervised Object Detection
https://arxiv.org/abs/2102.09480
MIT License
410 stars, 82 forks

Issue About EMA process in code #31

Closed Yuuuuuuuuuuuuuuuuuummy closed 2 years ago

Yuuuuuuuuuuuuuuuuuummy commented 3 years ago

Hi,

Nice work and code! I have a small question about the teacher model in your code. I noticed that only the student model is wrapped in DDP. So if I train on 4 GPUs with DDP, there is only one teacher model in memory, but there are 4 processes (one per GPU) that each use EMA to update it. In other words, is the teacher model updated 4 times in each iteration?

I couldn't find any reduce or broadcast operation in the code.

Or maybe Detectron2 has some synchronization operation that I don't know about.

Hope to get your reply! Thanks again for your excellent work!

ycliu93 commented 2 years ago

Yes, we only apply DDP to the student model, since the teacher is used only for inference. A model that is used only in inference mode cannot be wrapped in DDP, so we chose not to apply it to the teacher model.

Thanks
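
For readers with the same question, here is a minimal sketch of why an EMA teacher can stay consistent without an explicit reduce or broadcast. It assumes that each DDP process holds its own copy of the teacher (rather than one shared copy), that all ranks start from identical weights, and that DDP averages gradients so every rank's student ends each step with the same parameters. The names `ema_update` and `simulate`, the toy weights, the learning rate, and the keep rate are all hypothetical and are not taken from the repository; the simulation is plain Python, not PyTorch.

```python
# Toy simulation (not the repository's code): each "rank" stands for one DDP
# process that owns its own student copy and its own teacher copy.

def ema_update(teacher, student, keep_rate=0.9996):
    """Return keep_rate * teacher + (1 - keep_rate) * student, per parameter."""
    return {
        name: keep_rate * teacher[name] + (1.0 - keep_rate) * student[name]
        for name in teacher
    }


def simulate(num_ranks=4, num_iters=3, keep_rate=0.9996, lr=0.1):
    # Assumption: every rank starts from the same student and teacher weights.
    students = [{"w": 1.0, "b": 0.0} for _ in range(num_ranks)]
    teachers = [dict(s) for s in students]

    for _ in range(num_iters):
        # Each rank sees different data, so its local gradient differs.
        local_grads = [
            {"w": 0.1 * (rank + 1), "b": 0.01 * (rank + 1)}
            for rank in range(num_ranks)
        ]
        # Stand-in for DDP's gradient all-reduce: every rank gets the average.
        avg_grad = {
            name: sum(g[name] for g in local_grads) / num_ranks
            for name in ("w", "b")
        }
        # Every rank applies the same optimizer step to its student copy.
        for student in students:
            for name in student:
                student[name] -= lr * avg_grad[name]
        # Every rank updates its own teacher exactly once per iteration.
        teachers = [
            ema_update(t, s, keep_rate) for t, s in zip(teachers, students)
        ]
    return students, teachers


if __name__ == "__main__":
    students, teachers = simulate()
    # Under the stated assumptions, all teacher copies are identical.
    print(all(t == teachers[0] for t in teachers))
```

In this sketch each teacher copy is updated once per iteration by its own process, and because the inputs to the update are identical on every rank, the copies do not drift apart. If the real setup breaks one of the assumptions (for example, unsynchronized buffers or different initial weights per rank), the copies could diverge and an explicit synchronization would be needed.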