WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Confusion About the Speed #21

Open wangxinxin08 opened 2 years ago

wangxinxin08 commented 2 years ago

Can you give some instructions for reproducing the speed reported in your paper or repo?

WongKinYiu commented 2 years ago

We just follow the u5 speed-test protocol and use the u5 branch to calculate the average time over 3 runs. The description is under Table 2 in the paper.
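
For reference, a minimal sketch of that kind of protocol (not the exact u5 script; the loop counts and model loader here are assumptions):

```python
import torch

def measure_latency(model, img_size=640, runs=3, iters=100):
    # Sketch of a u5-style latency test: FP16 model, batch 1, dummy input,
    # averaged over `runs` passes of `iters` forward calls each.
    # Assumes a CUDA device; the exact warm-up and loop counts are
    # assumptions, not the official u5 script.
    device = torch.device('cuda')
    model = model.to(device).half().eval()
    x = torch.zeros(1, 3, img_size, img_size, device=device, dtype=torch.half)

    times = []
    with torch.no_grad():
        for _ in range(10):  # warm-up
            model(x)
        for _ in range(runs):
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            torch.cuda.synchronize()
            start.record()
            for _ in range(iters):
                model(x)
            end.record()
            torch.cuda.synchronize()
            times.append(start.elapsed_time(end) / iters)  # ms per image
    return sum(times) / len(times)
```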

wangxinxin08 commented 2 years ago

I think it is not very fair to use half precision when speed-testing against other models such as PP-YOLOE. I suggest we build a speed-testing benchmark on a third-party inference engine such as TensorRT or ONNX Runtime. In this way, we can avoid the bias caused by different models using different measurement environments and methods. By creating this benchmark, we can gradually establish a speed-measurement standard for YOLO models that is fair and convincing. I wonder whether you are interested or not.
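
For example, a minimal sketch of engine-side timing with ONNX Runtime (the model path is hypothetical; assumes the `onnxruntime-gpu` package is installed):

```python
import time
import numpy as np
import onnxruntime as ort

# Load an exported model (the file name is hypothetical) on the GPU provider.
sess = ort.InferenceSession('yolov7.onnx',
                            providers=['CUDAExecutionProvider'])
input_name = sess.get_inputs()[0].name
x = np.zeros((1, 3, 640, 640), dtype=np.float32)

for _ in range(10):  # warm-up
    sess.run(None, {input_name: x})

iters = 100
t0 = time.perf_counter()
for _ in range(iters):
    sess.run(None, {input_name: x})
print(f'{(time.perf_counter() - t0) / iters * 1000:.2f} ms / image')
```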

WongKinYiu commented 2 years ago

I agree with you, but in my previous experiments TensorRT fuses or removes some layers at inference time. I have implemented several TensorRT inference codebases for YOLOv4-tiny and found that good code can run more than 80% faster than poor code. Still, comparing speed in TensorRT or ONNX Runtime under the same code setting is welcome. I have no free time to write TensorRT or ONNX code; if you could provide YOLOv7 code and speed tests on TensorRT or ONNX, I could add them to the README.
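
If someone wants to try, a hedged sketch of exporting a loaded model to ONNX so both engines consume the same graph (the opset and tensor names are assumptions, not an official export script):

```python
import torch

def export_onnx(model, path='yolov7.onnx', img_size=640, opset=12):
    # Export a loaded torch.nn.Module so that TensorRT and ONNX Runtime
    # can consume the same graph. The opset version and tensor names are
    # assumptions, not the repo's official export settings.
    model = model.cpu().eval()
    dummy = torch.zeros(1, 3, img_size, img_size)
    torch.onnx.export(model, dummy, path,
                      opset_version=opset,
                      input_names=['images'],
                      output_names=['output'])
```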

By the way, I think if PP-YOLOE and YOLOX were trained and tested with the letterbox setting, they could run much faster in real applications. I also find that if YOLOv7 is trained with mean-var normalized input, it more easily gets 0.2~0.5% higher AP.
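
For context, a simplified letterbox sketch (padded square resize in the u5 style; not the repo's exact implementation):

```python
import cv2
import numpy as np

def letterbox(img, new_size=640, color=(114, 114, 114)):
    # Resize while keeping aspect ratio, then pad to a square canvas,
    # so the network never sees a distorted image. Simplified from the
    # u5-style preprocessing; not the repo's exact implementation.
    h, w = img.shape[:2]
    r = min(new_size / h, new_size / w)  # scale ratio
    nh, nw = round(h * r), round(w * r)
    resized = cv2.resize(img, (nw, nh), interpolation=cv2.INTER_LINEAR)
    top, left = (new_size - nh) // 2, (new_size - nw) // 2
    canvas = np.full((new_size, new_size, 3), color, dtype=np.uint8)
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, r, (left, top)
```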

wangxinxin08 commented 2 years ago

Maybe I did not express my meaning clearly. It is not about the speed of any one model; what really matters is creating the benchmark itself, rather than the performance of different YOLO models under it. I'm not talking about whose model is better or worse here; creating an impactful benchmark is more important than that.

WongKinYiu commented 2 years ago

Yes, I understand your meaning. What I want to say is that building such a benchmark is really hard. For example, which tool is suitable for building the benchmark: PyTorch, ONNX, or TensorRT? In the Rep large-kernel paper (RepLKNet), we can see that the speed of large kernels in PyTorch is unreasonable. In meituan/YOLOv6, we can see that ReLU is well suited to TensorRT. Another example: which device is suitable for running the benchmark? As we know, PP-PicoDet and PP-YOLOE are developed for different devices. So currently what we can do is provide more support and let people choose whichever is suitable for their own cases.

wangxinxin08 commented 2 years ago

I do not think the questions you raised are that hard. I understand your choices, and that's fine! Thanks for your reply!

WongKinYiu commented 2 years ago

Maybe I did not express my meaning clearly either. I am interested in it, and I think if we want to build the benchmark, it should support many different tools and APIs.