PaddlePaddle / PaddleYOLO

🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv10, YOLOX, YOLOv5u, YOLOv7u, YOLOv6Lite, RTMDet and so on. 🚀🚀🚀
https://github.com/PaddlePaddle/PaddleYOLO
GNU General Public License v3.0
558 stars, 134 forks

add extra all_reduce for xpu #97

Closed: shengxiangwang closed this 1 year ago

shengxiangwang commented 1 year ago

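During multi-card XPU training, some cards finish their computation, call all reduce, and then wait for the other cards longer than the default timeout threshold. This change adds an extra all_reduce operation between the forward and backward passes to reduce the occurrence of all reduce timeouts.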
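To illustrate why an extra synchronization point can help, here is a minimal toy model in plain Python, not the code from this change. It assumes each card's forward and backward durations are known, that the gradient all_reduce happens only once backward has finished (no overlap with computation), and that the timeout applies to each collective separately. The function names and the durations are hypothetical.

```python
def max_wait_without_sync(forward, backward):
    """Longest wait at the gradient all_reduce when nothing synchronizes
    the cards between the forward and backward passes."""
    # Each card reaches the gradient all_reduce after forward + backward.
    arrival = [f + b for f, b in zip(forward, backward)]
    # The earliest card waits until the latest card arrives.
    return max(arrival) - min(arrival)


def max_wait_with_sync(forward, backward):
    """Longest wait at any single collective when an extra all_reduce
    sits between the forward and backward passes."""
    # Wait at the extra all_reduce: only the forward skew has accumulated.
    wait_extra = max(forward) - min(forward)
    # All cards leave the extra all_reduce together, so only the backward
    # skew reaches the gradient all_reduce.
    wait_grad = max(backward) - min(backward)
    return max(wait_extra, wait_grad)


# Hypothetical per-card durations in seconds (made up for illustration).
forward = [10, 15, 30, 12]
backward = [20, 25, 40, 22]

print(max_wait_without_sync(forward, backward))  # 40
print(max_wait_with_sync(forward, backward))     # 20
```

In this model the total idle time is not reduced. The wait is split across two collectives, so the longest single wait is shorter and less likely to exceed a per-call threshold. If the slow cards in the forward pass happen to be the fast ones in the backward pass, the skew partly cancels and the extra all_reduce may not help.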