DAMO-YOLO: a fast and accurate object detection method with some new techs, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.
Before Reporting
[X] I have pulled the latest code of the main branch and run it again, and the bug still exists.
[X] I have read the README carefully and no error occurred during the installation process. (Otherwise, we recommend that you ask a question using the Question template.)
Search before reporting
[X] I have searched the DAMO-YOLO issues and found no similar bugs.
OS
Ubuntu 24.04
Device
RTX 4090
CUDA version
12.5
TensorRT version
No response
Python version
3.10
PyTorch version
1.13.1
torchvision version
0.14.1
Describe the bug
I have been attempting to train and export a tinynasL25_S model on the COCO dataset, but I am getting terrible results from the exported model. I exported the model with --end2end, which, if I am interpreting the code correctly, should give me at most 100 detections after NMS, but in many cases I still get 1000+ detections. The detections do, however, appear to be filtered by a minimum confidence score of 0.05.
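To make the expectation above concrete, here is a minimal sketch of the check being described: a score threshold of 0.05 followed by a cap of 100 detections per image. The tuple layout `(x1, y1, x2, y2, score, class_id)` and the helper names `cap_detections` and `check_end2end_output` are assumptions for illustration only; they are not part of DAMO-YOLO or of the exported graph.

```python
from typing import List, Tuple

# Assumed layout for illustration: (x1, y1, x2, y2, score, class_id).
Detection = Tuple[float, float, float, float, float, int]


def cap_detections(
    dets: List[Detection], score_thr: float = 0.05, max_dets: int = 100
) -> List[Detection]:
    """Drop detections below score_thr, then keep the max_dets highest-scoring ones."""
    kept = [d for d in dets if d[4] >= score_thr]
    kept.sort(key=lambda d: d[4], reverse=True)
    return kept[:max_dets]


def check_end2end_output(
    dets: List[Detection], score_thr: float = 0.05, max_dets: int = 100
) -> List[str]:
    """Return human-readable problems found in one image's detections."""
    problems = []
    if len(dets) > max_dets:
        problems.append(
            f"{len(dets)} detections exceed the expected cap of {max_dets}"
        )
    if any(d[4] < score_thr for d in dets):
        problems.append(f"some scores are below {score_thr}")
    return problems
```

Running `check_end2end_output` on the raw model output for a single image would flag the symptom reported here (more than 100 detections), while the output of `cap_detections` should pass the same check.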
After 60 epochs of training on the COCO dataset I get an evaluation score of:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.365
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.514
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.394
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.403
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.488
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.549
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.613
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.424
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.673
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.771
I export this model using:
python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c workdirs/damoyolo_tinynasL25_S/epoch_60_ckpt.pth --batch_size 1 --img_size 640 --end2end --ort
But when evaluating the exported model on COCO I get the following results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.008
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.017
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.003
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.047
I have debugged the pre-processing and post-processing steps I added for running the ONNX model, and they look consistent with the demo script. I am using the same COCO validation scripts for YOLOX and RT-DETR models and have no issues with those. It looks to me like something must be wrong with the export script.
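For readers who want to reproduce the evaluation, the post-processing step typically has to turn each network-space box into a COCO result entry. The sketch below is an illustration under stated assumptions, not the script used in this report: it assumes boxes are `(x1, y1, x2, y2)` in the resized input frame, that the image was resized by a single `ratio` with no padding offset, and that `category_ids` maps the model's contiguous class index to the COCO category id. The function name `to_coco_result` is hypothetical.

```python
from typing import Dict, Sequence


def to_coco_result(
    image_id: int,
    class_index: int,
    box_xyxy: Sequence[float],
    score: float,
    ratio: float,
    category_ids: Sequence[int],
) -> Dict:
    """Convert one xyxy box in network-input pixels to a COCO result entry.

    ratio is the factor the original image was multiplied by during resize
    (assumption: a single ratio and no padding offset).
    """
    # Undo the resize so the box is in original-image pixels.
    x1, y1, x2, y2 = (v / ratio for v in box_xyxy)
    return {
        "image_id": image_id,
        # COCO category ids are not contiguous, so map from the class index.
        "category_id": category_ids[class_index],
        # COCO results use [x, y, width, height], not corner coordinates.
        "bbox": [x1, y1, x2 - x1, y2 - y1],
        "score": score,
    }
```

A list of such entries can be written to JSON and loaded with the usual COCO evaluation tooling.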
To Reproduce
Train model using configs/damoyolo_tinynasL25_S.py
Export model to ONNX using: python tools/converter.py -f configs/damoyolo_tinynasL25_S.py -c workdirs/damoyolo_tinynasL25_S/epoch_60_ckpt.pth --batch_size 1 --img_size 640 --end2end --ort
Hyper-parameters/Configs
No response
Logs
No response
Screenshots
No response
Additional
No response