Linaom1214 / TensorRT-For-YOLO-Series

tensorrt for yolo series (YOLOv11,YOLOv10,YOLOv9,YOLOv8,YOLOv7,YOLOv6,YOLOX,YOLOv5), nms plugin support

Why is the TRT model of yolov7 not as fast as the PT model #41

Closed YFforever2022 closed 2 years ago

YFforever2022 commented 2 years ago

Do you know why inference takes only 9 ms with the PT model but 20 ms with the TRT model? Both were warmed up 10 times before timing. If these numbers are right, TensorRT does not seem to give any speedup. Maybe there is a configuration error.
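A comparison like this only holds if both timings cover the same stages and both wait for the GPU to finish before the clock stops. Whether that is the cause here is not confirmed in this thread. The sketch below shows one way to time a single inference call with warm-up and averaging. `average_latency_ms` and `run_once` are hypothetical names, not part of this repository, and `run_once` is assumed to block until the result is ready (for example by synchronizing the CUDA stream inside it).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: average wall-clock milliseconds per call of `run_once`.
// `run_once` must block until the result is ready; for GPU inference that
// means synchronizing the stream or device inside it, otherwise only the
// launch overhead is measured.
template <typename Fn>
double average_latency_ms(Fn&& run_once, int warmup_iters, int timed_iters) {
    assert(warmup_iters >= 0 && timed_iters > 0);

    // Warm-up runs are executed but not timed.
    for (int i = 0; i < warmup_iters; ++i) run_once();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < timed_iters; ++i) run_once();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / timed_iters;
}
```

Averaging over many timed runs, rather than reading one sample, makes a 9 ms versus 20 ms gap easier to trust.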

Linaom1214 commented 2 years ago

engine init finished
blob image
[08/22/2022-20:08:39] [W] [TRT] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[08/22/2022-20:08:39] [W] [TRT] The enqueue() method has been deprecated when used with engines built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. Please use enqueueV2() instead.
[08/22/2022-20:08:39] [W] [TRT] Also, the batchSize argument passed into this function has no effect on changing the input shapes. Please use setBindingDimensions() function to change input shapes instead.
15ms
num of boxes before nms: 62
num of boxes: 6
0 = 0.90573 at 53.16 398.86 189.65 x 500.30
5 = 0.90219 at 13.93 234.68 770.86 x 508.74
0 = 0.89119 at 220.63 412.31 128.91 x 446.84
0 = 0.88738 at 666.77 394.24 142.23 x 481.20
0 = 0.61789 at 0.00 558.58 75.63 x 327.18
11 = 0.23620 at 0.41 252.30 33.84 x 71.59
save vis file
yolo destroy

With yolov7-tiny.trt, the normal engine is surprisingly faster than the end2end one.

You need to add #define NOMINMAX at the top of yolo.hpp, and change lines 363-364 of the code to the following:
const char* INPUT_BLOB_NAME = "images"; // image_arrays
const char* OUTPUT_BLOB_NAME = "output";
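As a rough illustration of why the define matters: on Windows, `<windows.h>` defines `min` and `max` as macros unless `NOMINMAX` is set before it is included, and those macros break calls such as `std::min(a, b)`. The sketch below assumes a Windows header is pulled in somewhere in the build, and that the two declarations are meant to be `const char*` pointers. `clamp_to_image` is a hypothetical helper, not code from yolo.hpp.

```cpp
// Define NOMINMAX before any Windows header so that <windows.h> does not
// define min/max macros that collide with std::min and std::max.
#ifdef _WIN32
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>
#endif

#include <algorithm>
#include <cassert>
#include <cstring>

// Binding names as given in the comment above; they have to match the
// tensor names of the exported engine.
const char* INPUT_BLOB_NAME = "images";  // image_arrays
const char* OUTPUT_BLOB_NAME = "output";

// Hypothetical helper, only here to show std::min/std::max compiling cleanly.
inline float clamp_to_image(float value, float image_size) {
    assert(image_size >= 0.0f);
    return std::max(0.0f, std::min(value, image_size));
}
```

If the engine was exported with different tensor names, the two strings would need to change accordingly.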

You also need to create a dirent.h file yourself.

It looks like the end2end code still needs some optimization.