Nice work and code! I have a small question about the teacher model in your code. I noticed that only the student model is wrapped in DDP. So if I train with 4 GPUs under DDP, there is only one teacher model in memory, but there are 4 processes (one per GPU), each applying the EMA update to the teacher. In other words, could the teacher model be updated 4 times per iteration?
I couldn't find any reduce or broadcast operation in the code.
Or maybe Detectron2 has some synchronization operation that I don't know about.
Hope to get your reply! Thanks again for your excellent work!
Yes, we only apply DDP to the student model, since the teacher is only used for inference. A model that is used purely for inference is not allowed to be wrapped in DDP, so we chose not to apply it to the teacher model.
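For context on the synchronization question: each DDP process holds its own copy of the teacher in its own memory, and because DDP keeps the student replicas identical after every optimizer step, the per-process EMA updates are identical as well. The teacher copies therefore never diverge, and no reduce or broadcast is needed. Below is a minimal sketch of how such an EMA update typically looks; this is not the repository's exact code, and `student`, `teacher`, and `ema_decay` are hypothetical names.

```python
import torch

@torch.no_grad()
def update_teacher(student, teacher, ema_decay=0.9996):
    # Hypothetical EMA update: teacher <- d * teacher + (1 - d) * student.
    # Every DDP rank runs this on its own local teacher copy; since DDP
    # keeps the student replicas identical, the teachers stay identical too.
    student_model = (
        student.module
        if isinstance(student, torch.nn.parallel.DistributedDataParallel)
        else student
    )
    student_state = student_model.state_dict()
    for name, value in teacher.state_dict().items():
        if value.dtype.is_floating_point:
            value.mul_(ema_decay).add_(student_state[name], alpha=1.0 - ema_decay)
        else:
            # Non-float buffers (e.g. BatchNorm's num_batches_tracked) are copied.
            value.copy_(student_state[name])
```

Calling this once per iteration on every rank yields the same teacher everywhere, which is why wrapping the teacher in DDP would add communication overhead without changing the result.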