WongKinYiu / yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
GNU General Public License v3.0

YOLOv9-t BFLOPs discrepancy #506

Open dbacea opened 5 months ago

dbacea commented 5 months ago

In the official table and research paper, the listed number of parameters and FLOPs for YOLOv9-t equal 2.0M and 7.7G. But when training from scratch with:

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train_dual.py --workers 4 --device 0,1 --sync-bn --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/yolov9-t.yaml --weights '' --name yolov9-t --hyp hyp.scratch-high.yaml --min-items 0 --epochs 500 --close-mosaic 15

The number of parameters and the BFLOPs printed at the end of training differ from the ones in the official table: the obtained number of parameters equals 3.67M, while the FLOPs equal 16.2G.

Where could this big difference be coming from?
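For reference, here is a minimal sketch of how the reported numbers can be checked independently of the training log. It assumes the thop package and a YOLOv5-style checkpoint layout where the model is stored under the 'model' key; the checkpoint path below is hypothetical.

```python
import torch
from thop import profile  # pip install thop

# Load a trained checkpoint; run from the repo root so the model classes unpickle.
ckpt = torch.load("runs/train/yolov9-t/weights/best.pt", map_location="cpu")  # hypothetical path
model = ckpt["model"].float().eval()

# Raw parameter count.
n_params = sum(p.numel() for p in model.parameters())

# FLOPs at the training resolution (--img 640); thop returns multiply-adds,
# so the repo-style GFLOPs figure is roughly 2 * MACs.
dummy = torch.zeros(1, 3, 640, 640)
macs, _ = profile(model, inputs=(dummy,), verbose=False)

print(f"params: {n_params / 1e6:.2f}M, FLOPs: {2 * macs / 1e9:.1f}G")
```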

ankandrew commented 5 months ago

The YOLOv9-T that appears in the table is the converted one (the aux branch is removed). The aux branch is only used during training, not during inference. For a fair comparison, you need to compare the converted YOLOv9 model with the corresponding GELAN network architecture. The numbers you see reported seem to include the aux branch, which is why more parameters and FLOPs are shown.
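A quick way to see how much the aux branch contributes is to build both configs and compare their raw parameter counts. This is only a sketch: it assumes the repo's YOLOv5-style Model class in models/yolo.py and its constructor signature, which may differ.

```python
from models.yolo import Model  # assumed YOLOv5-style alias for DetectionModel

# Build the training-time (dual/aux) config and the GELAN config side by side.
for cfg in ("models/detect/yolov9-t.yaml", "models/detect/gelan-t.yaml"):
    m = Model(cfg, ch=3, nc=80)  # nc=80 for COCO
    n_params = sum(p.numel() for p in m.parameters())
    print(f"{cfg}: {n_params / 1e6:.2f}M parameters")
```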

dbacea commented 5 months ago

If I understand correctly, the number of parameters and the FLOPs of Gelan-t should be comparable to the ones posted in the official table for YOLOv9-t. I've trained a gelan-t from scratch with:

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --workers 4 --device 0,1 --sync-bn --batch 32 --data data/coco.yaml --img 640 --cfg models/detect/gelan-t.yaml --weights '' --name gelan-t --hyp hyp.scratch-high.yaml --min-items 0 --epochs 1 --close-mosaic 15

The architecture: [screenshot: Annotation 2024-06-20 161856_gelan_t]

It shows that the number of parameters equals 2,442,640 and the FLOPs 10.1G.

After training (and layer fusion), the number of parameters equals 2,407,632 and the FLOPs 9.8G, which are still higher than the ones posted for YOLOv9-t.

[screenshot: Annotation 2024-06-20 161856_gelan_t_post_training]

Is there an additional step to be taken?
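For reference, the fused parameter count can be reproduced with something like the sketch below. It assumes the checkpoint exposes a YOLOv5-style fuse() method that folds BatchNorm layers into the preceding convolutions; the checkpoint path is hypothetical.

```python
import torch

# Run from the repo root so the pickled model classes can be resolved.
ckpt = torch.load("runs/train/gelan-t/weights/best.pt", map_location="cpu")  # hypothetical path
model = ckpt["model"].float().eval()

# Fusing Conv+BN removes the BatchNorm parameters, which is why the count
# drops from 2,442,640 to 2,407,632.
fused = model.fuse()
print(sum(p.numel() for p in fused.parameters()))
```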

ankandrew commented 5 months ago

> If I understand correctly, the number of parameters and the FLOPs of Gelan-t should be comparable to the ones posted in the official table for YOLOv9-t.

Yes. But in the original post you compared against the 3.67M number, which corresponds to the model with the aux branch. That said, the YOLOv9-t (converted) and Gelan-t numbers indeed don't seem to match what is in the docs table.

> After training (and layer fusion), the number of parameters equals 2,407,632 and the FLOPs 9.8G, which are still higher than the ones posted for YOLOv9-t.

At this point your question converges with mine, see https://github.com/WongKinYiu/yolov9/issues/505#issue-2363375428. I think the gelan-t.yaml used to parametrize yolov9-t.yaml is slightly different from the one provided in the configs.
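One way to check that hypothesis is to diff the two config files directly. A sketch (the top-level key names are simply whatever the yaml files contain):

```python
import yaml

with open("models/detect/gelan-t.yaml") as f:
    gelan = yaml.safe_load(f)
with open("models/detect/yolov9-t.yaml") as f:
    v9 = yaml.safe_load(f)

# Report top-level keys whose values differ. The head is expected to differ by
# design (yolov9-t.yaml also defines the aux branch); anything else would point
# to a real parametrization mismatch.
for key in sorted(set(gelan) | set(v9)):
    if gelan.get(key) != v9.get(key):
        print(f"{key} differs")
```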

> Is there an additional step to be taken?

I'm in the same boat as you; these don't seem to match. Let's wait for the author's answer.