tianzhi0549 / FCOS

FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)
https://arxiv.org/abs/1904.01355

Question about FCOS loss in distributed multiprocess mode #284

Open kikoaumond opened 4 years ago

kikoaumond commented 4 years ago

Hi, I am running a model with FCOS as the detector using torch.distributed. I see in https://github.com/tianzhi0549/FCOS/blob/master/fcos_core/modeling/rpn/fcos/loss.py that torch.distributed.all_reduce is used to aggregate the centerness loss, at https://github.com/tianzhi0549/FCOS/blob/dd7bfba8c4269ce2930a4a588a907666b970690e/fcos_core/modeling/rpn/fcos/loss.py#L279

But I don't see the same being done for reg_loss and cls_loss. Shouldn't they also be aggregated with all_reduce when running in multi-process mode?

Thank you
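
For context on the question, here is a minimal sketch of one common pattern: only the normalizer (for example, the number of positive samples) is summed across processes, while the loss values themselves stay local. This assumes the training wrapper (such as torch.nn.parallel.DistributedDataParallel) already averages gradients across processes. It is a pure-Python simulation, not the FCOS implementation; `simulated_all_reduce_sum`, `per_rank_losses`, and the numbers are hypothetical and chosen only for illustration.

```python
# Pure-Python simulation; no torch required.
# Each list entry stands for the value held by one process (rank).


def simulated_all_reduce_sum(values_per_rank):
    """Hypothetical stand-in for an all_reduce with a SUM op:
    every rank ends up holding the total over all ranks."""
    total = sum(values_per_rank)
    return [total for _ in values_per_rank]


def per_rank_losses(loss_sums, num_pos):
    """Divide each rank's summed loss by the average positive count per rank."""
    world_size = len(loss_sums)
    total_pos = simulated_all_reduce_sum(num_pos)
    return [
        loss_sum / max(total / world_size, 1.0)  # guard against zero positives
        for loss_sum, total in zip(loss_sums, total_pos)
    ]


# Made-up example: two ranks with different numbers of positive samples.
loss_sums = [6.0, 4.0]  # summed (unnormalized) loss on each rank
num_pos = [3, 1]        # positive samples on each rank

losses = per_rank_losses(loss_sums, num_pos)

# Averaging across ranks (as gradient averaging would) matches the
# global per-positive mean over all samples.
averaged = sum(losses) / len(losses)
global_mean = sum(loss_sums) / sum(num_pos)

# Normalizing by the local count alone gives a different result.
local_only = sum(s / n for s, n in zip(loss_sums, num_pos)) / len(loss_sums)
```

Under these assumptions, `averaged` equals `global_mean`, whereas `local_only` does not, which is why synchronizing just the normalizer can be enough. Whether reg_loss and cls_loss in loss.py follow this pattern would need to be checked against the code itself.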