PaddlePaddle / PaddleYOLO

🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, YOLOv5u, YOLOv7u, YOLOv6Lite, RTMDet and so on. 🚀🚀🚀
https://github.com/PaddlePaddle/PaddleYOLO
GNU General Public License v3.0
534 stars 132 forks

ppyoloe 8-GPU long-schedule training results do not match the official ones #204

Closed Pooky-Z closed 7 months ago

Pooky-Z commented 7 months ago

Search before asking

Please ask your question

I am running the official source code on eight A100 GPUs with the COCO dataset. The batch_size per GPU is set to 8, the same as the official setting. The mAP at epoch 74 is currently 0.196, far below the officially reported 52.9% at 80 epochs. The launch script is:

```
model_name=ppyoloe # can be changed, e.g. yolov7
job_name=ppyoloe_plus_crn_l_80e_coco # can be changed, e.g. yolov7_tiny_300e_coco

config=configs/${model_name}/${job_name}.yml
log_dir=log_dir/${job_name}
pretrain_weights=""
data_path=/coco/
python -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py \
    -c ${config} --eval --amp \
    -o TrainDataset.dataset_dir=${data_path} \
    EvalDataset.dataset_dir=${data_path} \
    TestDataset.dataset_dir=${data_path} \
    pretrain_weights=${pretrain_weights} \
    snapshot_epoch=1 \
    TrainReader.batch_size=8 \
    norm_type=sync_bn \
    log_iter=1
```

nemonameless commented 7 months ago

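ppyoloe_plus_crn_l_80e_coco.yml loads obj365 (Objects365) pretrained weights, so accuracy this low is unlikely even early in training. Please check that the pretrained weights are being loaded correctly. The training information provided is too sparse to debug, so please state your paddle version and environment details. It would also help to post the loss curve and the log from the early part of epoch 0.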
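One thing that may be worth checking when verifying the pretrained-weight loading: the posted script sets `pretrain_weights=""` and still passes `pretrain_weights=${pretrain_weights}` through `-o`. The sketch below assumes, without this thread confirming it, that an empty `-o` override replaces the value defined in the `.yml` config. `build_pretrain_override` is a hypothetical helper, not part of PaddleYOLO; it emits the override only when a non-empty path is supplied.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PaddleYOLO): print a pretrain_weights
# override only when a non-empty value was supplied, so that an empty
# variable cannot replace the setting from the .yml config.
build_pretrain_override() {
    local weights="$1"
    if [ -n "${weights}" ]; then
        printf 'pretrain_weights=%s' "${weights}"
    fi
}

pretrain_weights=""   # left empty, as in the posted script
extra_override="$(build_pretrain_override "${pretrain_weights}")"

if [ -z "${extra_override}" ]; then
    echo "no pretrain_weights override; the value in the .yml config applies"
else
    echo "override: ${extra_override}"
fi
```

In the launch command, the literal `pretrain_weights=${pretrain_weights}` entry could then be replaced by an unquoted `${extra_override}`, which expands to nothing when the variable is empty.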