enazoe / yolo-tensorrt

TensorRT 8 support. Yolov5n,s,m,l,x; darknet -> tensorrt. Yolov4 and Yolov3 use raw darknet *.weights and *.cfg files. If the wrapper is useful to you, please star it.
MIT License
1.19k stars 316 forks

YOLOV4_TINY with FP32 takes 20ms per detection; with INT8 it is also 20ms. #107

Open xu998 opened 3 years ago

xu998 commented 3 years ago

Hi, I got the demo working yesterday, on Win10, TensorRT 7.0.0.11, CUDA 10.2, cuDNN 7.6.5.32, GTX 1050 Ti 4G, yolov4-tiny.

With FP32:

pre elasped time:7.0705ms inference elasped time:6.262ms post elasped time:4.7838ms detect elasped time:21.3155ms

With INT8:

pre elasped time:10.9021ms inference elasped time:4.9992ms post elasped time:2.9783ms detect elasped time:20.4411ms

    Config config_v4_tiny;
    config_v4_tiny.net_type = YOLOV4_TINY;
    config_v4_tiny.detect_thresh = 0.5;
    config_v4_tiny.file_model_cfg = "d:/yolov4-tiny.cfg";
    config_v4_tiny.file_model_weights = "d:/yolov4-tiny.weights";
    config_v4_tiny.calibration_image_list_file_txt = "d:/calibration_images.txt";
    config_v4_tiny.inference_precison = INT8;  // or FP32

I've seen many articles online reporting 2ms. Is there anything I need to configure? Don't I need to set the input size? Mine is 416*416. There is also a warning: "WARNING: Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles" — does that need fixing?

enazoe commented 3 years ago

The input size comes from your cfg. Different implementations can have different runtimes, and batch size matters too — some published numbers are for multi-batch runs. There are also platform differences.
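For reference, the input resolution and batch settings that the wrapper reads live in the `[net]` section at the top of the darknet cfg; a typical yolov4-tiny header looks like this (values illustrative):

```ini
[net]
batch=1
subdivisions=1
width=416
height=416
channels=3
```

Changing `width`/`height` here changes the engine's input size the next time the engine is built.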

xu998 commented 3 years ago

Switching to a Release build brought it down to 10ms, so there was nothing else to change. Earlier I tried calling the original yolo_cpp_dll.dll directly, which took 20ms per detection, and OpenCV with CUDA took 11ms. Everyone online says TensorRT is the fastest, by a lot... Is this the batch setting?

    YoloPluginCtx* ctx = new YoloPluginCtx;
    ctx->initParams = initParams;
    ctx->batchSize = batchSize;
    ctx->networkInfo;  // = getYoloNetworkInfo();
    ctx->inferParams;  // = getYoloInferParams();
    uint32_t configBatchSize = 1;  // = getBatchSize();

xu998 commented 3 years ago

Hi, it's changed in the .cfg, right? With INT8, going from batch 1 to batch 8 to batch 32 all gives roughly 10ms. I'll try upgrading the various components and test again.