bigbro13 opened this issue 3 weeks ago
Is the speed measured in a TensorRT environment? Why is the model so large, and what has been modified? In addition, I don't think it's appropriate to compare a 165 MB model to YOLO11.
I modified the data paths under configs/dataset/custom_detection.yml and trained with the command:
CUDA_VISIBLE_DEVICES=0 torchrun --master_port=7777 --nproc_per_node=1 train.py -c configs/dfine/custom/dfine_hgnetv2_s_custom.yml --use-amp --seed=0
It is normal for the checkpoint to be around 165 MB, because it contains the original model weights, the EMA model weights, the optimizer state, etc. If you need to convert it down to the model size we provide, please use:
reference/convert_weight.py
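For reference, a minimal sketch of what such a conversion typically does, assuming the checkpoint is a plain torch dict with "model", "ema", and "optimizer" entries; the exact key names may differ, and the script above is the authoritative version:

```python
# Hedged sketch: strip training-only state (EMA copy, optimizer) from a
# checkpoint so only deployable weights remain. The key names "model",
# "ema", and "optimizer" are assumptions about the checkpoint layout;
# reference/convert_weight.py is the authoritative conversion script.
import torch

def strip_checkpoint(src_path: str, dst_path: str) -> None:
    ckpt = torch.load(src_path, map_location="cpu")
    # Prefer the EMA weights if present, otherwise fall back to the raw model weights.
    if "ema" in ckpt and isinstance(ckpt["ema"], dict) and "module" in ckpt["ema"]:
        state_dict = ckpt["ema"]["module"]
    else:
        state_dict = ckpt["model"]
    torch.save({"model": state_dict}, dst_path)

if __name__ == "__main__":
    strip_checkpoint("last.pth", "model_only.pth")  # hypothetical file names
```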
In addition, the official YOLO11 ONNX export does not include NMS post-processing; I raised an issue about this earlier.
All official speed numbers are measured with TensorRT, as in other works. The inference code we provide runs on the CPU by default, so it is not directly comparable.
Thank you for your kind answer. I did not benchmark with TensorRT; I only ran inference tests with PyTorch and ONNX. That was my mistake.
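If the PyTorch path is timed on GPU, a hedged sketch of a fair per-image latency measurement, with warm-up and explicit CUDA synchronization; the model and input shape here are placeholders, not the D-FINE API:

```python
# Hedged sketch: GPU latency measurement in PyTorch. The model and the
# 1x3x640x640 input shape are placeholders; the point is the warm-up loop
# and the torch.cuda.synchronize() calls around the timed region.
import time
import torch

def measure_latency_ms(model: torch.nn.Module, input_shape=(1, 3, 640, 640),
                       warmup: int = 20, iters: int = 100) -> float:
    device = torch.device("cuda")
    model = model.eval().to(device)
    x = torch.randn(*input_shape, device=device)
    with torch.no_grad():
        for _ in range(warmup):          # warm-up: lazy init, cuDNN autotuning
            model(x)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()         # wait for all queued kernels to finish
    return (time.perf_counter() - start) * 1000.0 / iters
```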
You're welcome. We provide speed-comparison code in tools/benchmark/trt_benchmark.py, which can be used for any other network once it has been converted to a TensorRT engine. The YOLO series suffers a significant drop in performance after its NMS parameters are adjusted, so if you really need a completely fair comparison, please refer to https://github.com/Peterande/D-FINE/blob/master/tools/deployment/export_yolo_w_nms to export YOLO11 with NMS and set the score_threshold to 0.001.
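To illustrate why that threshold matters (this is not the repo's export script, just a sketch of the post-processing step it refers to, with illustrative shapes and parameter names):

```python
# Hedged sketch: effect of the pre-NMS score threshold on a fair comparison.
# A high threshold discards most candidate boxes before NMS (lower latency,
# lower recall/mAP); a near-zero threshold such as 0.001 keeps nearly all
# candidates, which is what accuracy evaluation assumes.
import torch
from torchvision.ops import nms

def postprocess(boxes: torch.Tensor, scores: torch.Tensor,
                score_threshold: float = 0.001, iou_threshold: float = 0.7):
    keep = scores > score_threshold       # pre-NMS confidence filter
    boxes, scores = boxes[keep], scores[keep]
    kept = nms(boxes, scores, iou_threshold)
    return boxes[kept], scores[kept]

# e.g. with ~8400 candidate boxes, score_threshold=0.25 is fast but drops
# recall; score_threshold=0.001 is the setting used for fair mAP evaluation.
```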
The model trained on my own data is about 165 MB, and inference, including post-processing, takes approximately 237 ms.