Peterande / D-FINE

D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement 💥💥💥
Apache License 2.0
1.1k stars, 85 forks

Why is the inference speed of models trained on custom datasets much slower than YOLOV11 #38

Open bigbro13 opened 3 weeks ago

bigbro13 commented 3 weeks ago

The model trained on my own data is about 165M in size, and inference takes approximately 237 ms, including post-processing.

Peterande commented 3 weeks ago

Was the speed measured with TensorRT? Why is the model so big, and what did you modify? Also, I don't think it's appropriate to compare a 165M model with YOLO11.

bigbro13 commented 3 weeks ago

I modified the dataset paths in configs/dataset/custom_detection.yml. The training command was: CUDA_VISIBLE_DEVICES=0 torchrun --master_port=7777 --nproc_per_node=1 train.py -c configs/dfine/custom/dfine_hgnetv2_s_custom.yml --use-amp --seed=0

Peterande commented 3 weeks ago

It is normal for a training checkpoint to be about 165 MB, because it contains the original model weights, the EMA model weights, the optimizer state, and so on. If you need to convert it to the model size we provide, please use:

reference/convert_weight.py
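As a rough illustration of what such a conversion involves, the sketch below keeps only the inference weights from a checkpoint dictionary and drops the training state. The key names (`model`, `ema`, `module`, `optimizer`) and the function `strip_checkpoint` are assumptions made for this example; they are not taken from the repository's script, so check the actual checkpoint layout before relying on them.

```python
from typing import Any, Dict


def strip_checkpoint(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict that keeps only the weights needed for inference.

    Assumed (hypothetical) layout:
      checkpoint["model"]         -> raw model weights
      checkpoint["ema"]["module"] -> EMA-smoothed weights
      checkpoint["optimizer"]     -> optimizer state (training only)
    """
    ema = checkpoint.get("ema")
    if isinstance(ema, dict) and "module" in ema:
        # Prefer the EMA weights when they are present.
        weights = ema["module"]
    elif "model" in checkpoint:
        weights = checkpoint["model"]
    else:
        raise KeyError("checkpoint has neither 'ema' nor 'model' weights")
    # Everything else (optimizer, scheduler, epoch counters) is discarded.
    return {"model": weights}


if __name__ == "__main__":
    # Toy stand-in for a real checkpoint; lists replace tensors here.
    full = {
        "model": {"w": [1.0, 2.0]},
        "ema": {"module": {"w": [0.9, 1.9]}},
        "optimizer": {"state": {"momentum": [0.1, 0.2]}},
    }
    slim = strip_checkpoint(full)
    print(sorted(slim))  # only the "model" key remains
```

With PyTorch, the input dictionary would typically come from `torch.load(path, map_location="cpu")` and the result would be written back with `torch.save`. The size drops because only one set of weights is stored instead of the weights, their EMA copy, and the optimizer state.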

In addition, the official YOLO11 ONNX export does not include NMS post-processing. I have already submitted an issue about this.

All official speeds are measured with TensorRT, as in other works. The inference code we provide runs on the CPU by default, so its timings are not comparable.
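When timing two models by hand, it helps to measure both the same way. The sketch below is a generic latency harness, not the repository's benchmark tool: `measure_latency` and its parameters are hypothetical names for this example. It runs a few warm-up iterations first and then reports statistics over many timed runs.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(
    run_once: Callable[[], object],
    warmup: int = 10,
    iters: int = 50,
) -> Dict[str, float]:
    """Time `run_once` and return latency statistics in milliseconds."""
    # Warm-up runs are excluded: the first calls often pay one-off costs
    # such as lazy initialisation or cache population.
    for _ in range(warmup):
        run_once()

    samples_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "min_ms": min(samples_ms),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a call that runs the full pipeline
    # (pre-processing, model forward pass, post-processing).
    stats = measure_latency(lambda: sum(i * i for i in range(10_000)))
    print(stats)
```

For a GPU model, `run_once` has to block until the result is actually ready (with PyTorch, for example, by calling `torch.cuda.synchronize()` before returning). Otherwise only the kernel launch is timed. Both models should also run on the same device and backend, and both should either include or exclude post-processing such as NMS.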

bigbro13 commented 3 weeks ago

Thank you for your kind answer. I did not compare on TensorRT; I only ran inference tests with torch and ONNX. It was my mistake.

Peterande commented 3 weeks ago

You're welcome. We provide speed comparison code in tools/benchmark/trt_benchmark.py, which works for any other network once it has been converted to a TensorRT engine. The YOLO series suffers a significant drop in performance after the NMS parameters are adjusted, so if you need a completely fair comparison, please refer to https://github.com/Peterande/D-FINE/blob/master/tools/deployment/export_yolo_w_nms to export YOLO11 with NMS and set score_threshold to 0.001.
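To make the role of the score threshold concrete, here is a minimal pure-Python sketch of greedy NMS. It is an illustration only and is not the code in the export script; the names `iou` and `nms` and the default IoU threshold of 0.5 are assumptions for this example. A lower score threshold admits more candidate boxes into NMS, which changes both the amount of post-processing work and the set of detections returned.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_threshold: float,
    iou_threshold: float = 0.5,
) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Step 1: discard boxes below the score threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    # Step 2: visit the remaining boxes in descending score order.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in candidates:
        # Step 3: keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (100, 100, 110, 110)]
    scores = [0.9, 0.8, 0.3, 0.01]
    print(nms(boxes, scores, score_threshold=0.25))   # [0, 2]
    print(nms(boxes, scores, score_threshold=0.001))  # [0, 2, 3]
```

In the toy data, the second box overlaps the first with an IoU of about 0.68 and is suppressed in both runs. The lowest-scoring box only survives when the threshold is 0.001.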