NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

int8 problem #1179

Closed QZ-cmd closed 3 years ago

QZ-cmd commented 3 years ago

I converted the model to a TensorRT FP16 engine and inference works normally. I then built an INT8 engine, calibrating with 1,000 images, but the test mAP dropped to 0. Where might the problem be, and how can I debug it? I also compared the FP16 and INT8 outputs on the same image: the scores were almost the same, but the box locations were much worse. Also, would a single-channel grayscale input image be affected by INT8?
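The per-output comparison described above (scores agree, box locations diverge) can be sketched as follows. This is a minimal illustration in plain Python, not part of the original post: the output names `scores` and `boxes` and all numeric values are hypothetical, and in practice the two lists would come from running the FP16 and INT8 engines on the same image.

```python
import math


def compare_outputs(ref, test):
    """Return (max_abs_diff, cosine_similarity) for two equal-length sequences."""
    if len(ref) != len(test) or len(ref) == 0:
        raise ValueError("outputs must be non-empty and have the same length")
    max_abs = max(abs(r - t) for r, t in zip(ref, test))
    dot = sum(r * t for r, t in zip(ref, test))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(t * t for t in test))
    cos = dot / norm if norm > 0 else float("nan")
    return max_abs, cos


# Hypothetical flattened outputs for one image (assumed values, for illustration only).
fp16_out = {"scores": [0.91, 0.05, 0.72], "boxes": [120.0, 64.0, 310.0, 250.0]}
int8_out = {"scores": [0.90, 0.06, 0.71], "boxes": [96.0, 96.0, 288.0, 224.0]}

for name in fp16_out:
    max_abs, cos = compare_outputs(fp16_out[name], int8_out[name])
    print(f"{name}: max_abs_diff={max_abs:.4f} cosine={cos:.6f}")
```

Applying the same comparison to intermediate tensors, layer by layer, is one way to narrow down where the INT8 engine first diverges from FP16.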

ttyio commented 3 years ago

@QZ-cmd , FYI, we have a white paper on int8 quantization http://arxiv.org/abs/2004.09602

Could you try a different calibration method and calibration dataset, and maybe some tricks such as replacing SiLU/LeakyReLU with ReLU?

QZ-cmd commented 3 years ago

Do you know of any other calibration methods? I tried calibration datasets of 1,000 and 4,000 images, and in both cases the INT8 model gives mAP = 0, while the FP16 conversion tests normally. I also tried the COCO dataset, but accuracy was still lost at INT8, so I suspect the problem is with the model rather than the data. If you need the ONNX model, please tell me. Looking forward to your reply.

ttyio commented 3 years ago

Hello @QZ-cmd, sorry, we do not have the bandwidth to debug INT8 accuracy issues unless they are caused by a TRT bug.

TRT supports entropy, min-max, and percentile-max calibration. If you use NVIDIA pytorch-quantization (https://github.com/NVIDIA/TensorRT/tree/release/7.2/tools/pytorch-quantization), more calibration methods are available, as well as QAT. The white paper http://arxiv.org/abs/2004.09602 has some sample networks and INT8 recipes.
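To illustrate why the choice of calibration method can matter, here is a minimal sketch in plain Python of symmetric INT8 quantization with the range taken either from the maximum magnitude (min-max style) or from a percentile. It is not TensorRT's implementation; the function names and the sample data are assumptions made for this example.

```python
import math


def quantize_dequantize(values, amax):
    """Symmetric INT8 fake quantization: scale by amax/127, round, clamp, rescale."""
    if amax <= 0:
        raise ValueError("amax must be positive")
    scale = amax / 127.0
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


def minmax_amax(values):
    """Range from the largest magnitude seen during calibration."""
    return max(abs(v) for v in values)


def percentile_amax(values, pct):
    """Range from the pct-th percentile of magnitudes (nearest-rank method)."""
    mags = sorted(abs(v) for v in values)
    idx = max(0, math.ceil(len(mags) * pct / 100.0) - 1)
    return mags[idx]


def mean_abs_error(ref, approx):
    return sum(abs(r - a) for r, a in zip(ref, approx)) / len(ref)


# Hypothetical activations: 99 small values plus a single large outlier.
acts = [i / 100 for i in range(1, 100)] + [100.0]
small = acts[:99]

for label, amax in (("min-max", minmax_amax(acts)),
                    ("99th percentile", percentile_amax(acts, 99))):
    err = mean_abs_error(small, quantize_dequantize(small, amax))
    print(f"{label}: amax={amax:.2f} mean_abs_error_on_small_values={err:.5f}")
```

With the min-max range, the single outlier stretches the scale so that most small values collapse onto one or two quantization levels; the percentile range keeps resolution for the small values at the cost of clipping the outlier.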

QZ-cmd commented 3 years ago

It is strange: the FP16 engine works normally, and the INT8 conversion did not report any errors, yet all accuracy is lost. Where might the problem be? This is the INT8 calibration log:

[TensorRT] VERBOSE: Fastest Tactic: 256 Time: 0.050816 [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_64 (PointWiseV2) [TensorRT] VERBOSE: Tactic: 10 time 0.015008 [TensorRT] VERBOSE: Tactic: 11 time 0.014848 [TensorRT] VERBOSE: Tactic: 12 time 0.01648 [TensorRT] VERBOSE: Tactic: 13 time 0.015008 [TensorRT] VERBOSE: Tactic: 14 time 0.017344 [TensorRT] VERBOSE: Tactic: 15 time 0.020224 [TensorRT] VERBOSE: Tactic: 16 time 0.020096 [TensorRT] VERBOSE: Tactic: 17 time 0.020928 [TensorRT] VERBOSE: Tactic: 18 time 0.026208 [TensorRT] VERBOSE: Tactic: 19 time 0.027744 [TensorRT] VERBOSE: Tactic: 20 time 0.015968 [TensorRT] VERBOSE: Tactic: 21 time 0.0136 [TensorRT] VERBOSE: Tactic: 22 time 0.014816 [TensorRT] VERBOSE: Tactic: 23 time 0.014336 [TensorRT] VERBOSE: Tactic: 24 time 0.019104 [TensorRT] VERBOSE: Tactic: 25 time 0.016832 [TensorRT] VERBOSE: Tactic: 26 time 0.01456 [TensorRT] VERBOSE: Tactic: 27 time 0.01376 [TensorRT] VERBOSE: Fastest Tactic: 21 Time: 0.0136 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: PointWiseV2 Tactic: 21 [TensorRT] VERBOSE: [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.064 [TensorRT] VERBOSE: Tactic: 0 time 0.11328 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.064 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.0792 [TensorRT] VERBOSE: Tactic: 0 time 0.02288 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.02288 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.077472 [TensorRT] VERBOSE: Tactic: 0 time 0.079904 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.077472 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.110912 
[TensorRT] VERBOSE: Tactic: 0 time 0.0464 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.0464 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.077856 [TensorRT] VERBOSE: Tactic: 0 time 0.047456 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.047456 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.076224 [TensorRT] VERBOSE: Tactic: 0 time 0.095232 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.076224 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.0768 [TensorRT] VERBOSE: Tactic: 0 time 0.041056 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.041056 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.078848 [TensorRT] VERBOSE: Tactic: 0 time 0.115392 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.078848 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.072512 [TensorRT] VERBOSE: Tactic: 0 time 0.015488 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.015488 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.07616 [TensorRT] VERBOSE: Tactic: 0 time 0.04096 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.04096 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.074592 [TensorRT] VERBOSE: Tactic: 0 time 0.114688 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.074592 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.07408 [TensorRT] VERBOSE: Tactic: 0 time 0.033376 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.033376 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.096608 [TensorRT] VERBOSE: Tactic: 0 time 0.028128 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.028128 [TensorRT] VERBOSE: --------------- Timing 
Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.095008 [TensorRT] VERBOSE: Tactic: 0 time 0.095808 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.095008 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.13552 [TensorRT] VERBOSE: Tactic: 0 time 0.056032 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.056032 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.096704 [TensorRT] VERBOSE: Tactic: 0 time 0.057504 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.057504 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.09472 [TensorRT] VERBOSE: Tactic: 0 time 0.11776 [TensorRT] VERBOSE: Fastest Tactic: 1002 Time: 0.09472 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.094336 [TensorRT] VERBOSE: Tactic: 0 time 0.037888 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.037888 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.08816 [TensorRT] VERBOSE: Tactic: 0 time 0.018528 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.018528 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.096928 [TensorRT] VERBOSE: Tactic: 0 time 0.04896 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.04896 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.090624 [TensorRT] VERBOSE: Tactic: 0 time 0.040512 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.040512 [TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169,216320) -> Float(1,13,169,173056) [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: 
volta_scudnn_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: Tactic: 524287 time 8.21203 [TensorRT] VERBOSE: Tactic: 720895 time 8.35267 [TensorRT] VERBOSE: Tactic: 983039 time 8.07411 [TensorRT] VERBOSE: Tactic: 1048575 time 7.88598 [TensorRT] VERBOSE: Tactic: 1703935 time 7.6087 [TensorRT] VERBOSE: Tactic: 1769471 time 7.80611 [TensorRT] VERBOSE: Tactic: 1966079 time 8.91517 [TensorRT] VERBOSE: Tactic: 2031615 time 7.31853 [TensorRT] VERBOSE: Tactic: 2228223 time 8.73885 [TensorRT] VERBOSE: Tactic: 2424831 time 8.67738 [TensorRT] VERBOSE: Tactic: 2621439 time 7.72173 [TensorRT] VERBOSE: Tactic: 2752511 time 6.31494 [TensorRT] VERBOSE: Tactic: 2818047 time 6.86765 [TensorRT] VERBOSE: Tactic: 2883583 time 8.10659 [TensorRT] VERBOSE: Tactic: 3014655 time 6.98579 [TensorRT] VERBOSE: Tactic: 3145727 time 6.25549 [TensorRT] VERBOSE: Tactic: 3473407 time 10.9961 [TensorRT] VERBOSE: Tactic: 3604479 time 6.95738 [TensorRT] VERBOSE: Tactic: 3735551 time 16.5442 [TensorRT] VERBOSE: Tactic: 4390911 time 7.97654 [TensorRT] VERBOSE: Tactic: 5046271 time 9.43827 [TensorRT] VERBOSE: Tactic: 5963775 time 8.06637 [TensorRT] VERBOSE: Tactic: 6160383 time 9.1473 [TensorRT] VERBOSE: Tactic: 6488063 time 6.69763 [TensorRT] VERBOSE: Tactic: 6881279 time 8.00381 [TensorRT] VERBOSE: Tactic: 7274495 time 10.2698 [TensorRT] VERBOSE: Tactic: 7864319 
time 7.94454 [TensorRT] VERBOSE: Tactic: 7995391 time 8.62278 [TensorRT] VERBOSE: Tactic: 8585215 time 7.37648 [TensorRT] VERBOSE: Tactic: 8847359 time 7.21494 [TensorRT] VERBOSE: Tactic: 8978431 time 7.99568 [TensorRT] VERBOSE: Tactic: 9043967 time 6.18906 [TensorRT] VERBOSE: Tactic: 9175039 time 6.95706 [TensorRT] VERBOSE: Tactic: 9502719 time 6.14845 [TensorRT] VERBOSE: Tactic: 9830399 time 9.90061 [TensorRT] VERBOSE: Tactic: 9961471 time 7.73808 [TensorRT] VERBOSE: Tactic: 10027007 time 6.79363 [TensorRT] VERBOSE: Tactic: 10092543 time 7.97235 [TensorRT] VERBOSE: Tactic: 10289151 time 8.77709 [TensorRT] VERBOSE: Tactic: 10485759 time 6.80243 [TensorRT] VERBOSE: Tactic: 10682367 time 7.45872 [TensorRT] VERBOSE: Tactic: 10813439 time 8.17254 [TensorRT] VERBOSE: Fastest Tactic: 9502719 Time: 6.14845 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 1825138533642645384 time 8.17616 [TensorRT] VERBOSE: Conv_142 (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TensorRT] VERBOSE: Tactic: 2775507031594384867 time 4.01469 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: 2842488832350522458 time 8.17392 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 3915320020053085238 time 8.0695 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: 6448355332020552203 time 8.37219 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 6808617066150061604 time 8.33306 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: -8060443123034038864 
time 8.50982 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: -4420849921117327522 time 9.28768 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -3946921629105938337 time 9.3856 [TensorRT] VERBOSE: Fastest Tactic: 2775507031594384867 Time: 4.01469 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: Tactic: 0 time 9.32659 [TensorRT] VERBOSE: Tactic: 2 time 11.3032 [TensorRT] VERBOSE: Tactic: 4 skipped. Scratch requested: 3070033920, available: 1073741824 [TensorRT] VERBOSE: Tactic: 5 skipped. Scratch requested: 5714280448, available: 1073741824 [TensorRT] VERBOSE: Tactic: 57 time 9.57235 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 9.32659 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 [TensorRT] VERBOSE: Conv_142 (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic 
Name: volta_scudnn_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (scudnn_winograd) Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:4,54080) -> Float(1,13,169,173056) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 892787096507693407 time 2.0616 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 
1204440019753223942 time 2.1176 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 1659301557717208403 time 2.8672 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 2057291331119027912 time 2.13664 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Tactic: 3275977259705528576 time 2.70166 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 5623454780463195174 time 2.42278 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: 8930254200803946944 time 2.06192 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -9204333525109552344 time 2.10125 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -4973811344878172338 time 2.03216 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: -1228371230285617088 time 2.1176 [TensorRT] VERBOSE: Fastest Tactic: -4973811344878172338 Time: 2.03216 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose 
Runner Type: CaskConvolution Tactic: -4973811344878172338 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:4,54080) -> Int8(1,13,169:4,43264) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_medium_nn_v1 
[TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: Tactic: 524287 time 2.28675 [TensorRT] VERBOSE: Tactic: 720895 time 2.63053 [TensorRT] VERBOSE: Tactic: 983039 time 1.96902 [TensorRT] VERBOSE: Tactic: 1048575 time 2.51779 [TensorRT] VERBOSE: Tactic: 1703935 time 2.12582 [TensorRT] VERBOSE: Tactic: 1769471 time 2.07891 [TensorRT] VERBOSE: Tactic: 1966079 time 2.26182 [TensorRT] VERBOSE: Tactic: 2031615 time 2.06669 [TensorRT] VERBOSE: Tactic: 2228223 time 3.5631 [TensorRT] VERBOSE: Tactic: 2424831 time 3.64112 [TensorRT] VERBOSE: Tactic: 2621439 time 2.20058 [TensorRT] VERBOSE: Tactic: 2752511 time 1.92157 [TensorRT] VERBOSE: Tactic: 2818047 time 1.79798 [TensorRT] VERBOSE: Tactic: 2883583 time 2.11811 [TensorRT] VERBOSE: Tactic: 3014655 time 2.17907 [TensorRT] VERBOSE: Tactic: 3145727 time 1.88294 [TensorRT] VERBOSE: Tactic: 3473407 time 2.8327 [TensorRT] VERBOSE: Tactic: 3604479 time 1.78115 [TensorRT] VERBOSE: Tactic: 3735551 time 2.46413 [TensorRT] VERBOSE: Tactic: 4390911 time 1.96336 [TensorRT] VERBOSE: Tactic: 5046271 time 2.60467 [TensorRT] VERBOSE: Tactic: 5963775 time 2.06582 [TensorRT] VERBOSE: Tactic: 6160383 time 4.17792 [TensorRT] VERBOSE: Tactic: 6488063 time 1.85651 [TensorRT] VERBOSE: Tactic: 6881279 time 1.90464 [TensorRT] VERBOSE: Tactic: 7274495 time 2.48013 [TensorRT] VERBOSE: Tactic: 7864319 time 2.25635 [TensorRT] VERBOSE: Tactic: 7995391 time 
2.25808 [TensorRT] VERBOSE: Tactic: 8585215 time 1.90499 [TensorRT] VERBOSE: Tactic: 8847359 time 1.76819 [TensorRT] VERBOSE: Tactic: 8978431 time 1.99984 [TensorRT] VERBOSE: Tactic: 9043967 time 1.76419 [TensorRT] VERBOSE: Tactic: 9175039 time 1.78746 [TensorRT] VERBOSE: Tactic: 9502719 time 2.02454 [TensorRT] VERBOSE: Tactic: 9830399 time 3.0089 [TensorRT] VERBOSE: Tactic: 9961471 time 2.71174 [TensorRT] VERBOSE: Tactic: 10027007 time 2.19536 [TensorRT] VERBOSE: Tactic: 10092543 time 1.96445 [TensorRT] VERBOSE: Tactic: 10289151 time 2.26595 [TensorRT] VERBOSE: Tactic: 10485759 time 2.2217 [TensorRT] VERBOSE: Tactic: 10682367 time 1.944 [TensorRT] VERBOSE: Tactic: 10813439 time 2.18307 [TensorRT] VERBOSE: Fastest Tactic: 9043967 Time: 1.76419 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 4438325421691896755 time 2.14038 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 4581732244273465060 time 2.09562 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 4934335053031119367 time 2.10458 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 6797040896965118050 time 2.05517 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 8006952294591770973 time 3.00544 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -7210942453088153035 time 2.02858 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 
-6282183216199417697 time 2.55347 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: -5026383765466876607 time 2.10774 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Tactic: -5016725782072253841 time 2.11539 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_xregs_large_nn_v1 [TensorRT] VERBOSE: Tactic: -1370999262391786833 time 2.056 [TensorRT] VERBOSE: Fastest Tactic: -7210942453088153035 Time: 2.02858 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 9043967 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:4,54080) -> Int8(1,13,169:32,5408) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: 
volta_int8x4_icudnn_int8x4_128x32_relu_xregs_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_medium_c32_nn_v1 [TensorRT] VERBOSE: Tactic: 1213457772632185722 time 2.11738 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Tactic: 1713441381477652893 time 2.6665 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: Tactic: 7125598890155666458 time 2.05414 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Tactic: 8047041638267142825 time 2.54074 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -7846982807478255793 time 2.09584 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_small_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -6459719113600909000 time 2.09814 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -4573925292554651334 time 2.02963 [TensorRT] 
VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -3566249366964946311 time 2.13302 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -2002418013575043687 time 2.05629 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: Tactic: -1659631603542281459 time 2.10893 [TensorRT] VERBOSE: Fastest Tactic: -4573925292554651334 Time: 2.02963 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -4573925292554651334 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_c32_nn_v1 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_xregs_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: 
volta_int8x4_icudnn_int8x4_128x32_relu_xregs_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x64_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_medium_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_xregs_large_c32_nn_v1 [TensorRT] VERBOSE: Conv_142 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x128_relu_small_c32_nn_v1 [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:32,6760) -> Float(1,13,169:32,5408) [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: CaskConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:32,6760) -> Int8(1,13,169:32,5408) [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: 
volta_int8_i8816cudnn_int8_256x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (FusedConvActConvolution) [TensorRT] VERBOSE: Tactic: 524287 time 2.2296 [TensorRT] VERBOSE: Tactic: 720895 time 2.1393 [TensorRT] VERBOSE: Tactic: 983039 time 1.94746 [TensorRT] VERBOSE: Tactic: 1048575 time 2.21962 [TensorRT] VERBOSE: Tactic: 1703935 time 2.09446 [TensorRT] VERBOSE: Tactic: 1769471 time 2.03808 [TensorRT] VERBOSE: Tactic: 1966079 time 2.18246 [TensorRT] VERBOSE: Tactic: 2031615 time 1.88253 [TensorRT] VERBOSE: Tactic: 2228223 time 4.01728 [TensorRT] VERBOSE: Tactic: 2424831 time 3.992 [TensorRT] VERBOSE: Tactic: 2621439 time 1.98394 [TensorRT] VERBOSE: Tactic: 2752511 time 1.93456 [TensorRT] VERBOSE: Tactic: 2818047 time 1.79536 [TensorRT] VERBOSE: Tactic: 2883583 time 2.10435 [TensorRT] VERBOSE: Tactic: 3014655 time 2.17984 [TensorRT] VERBOSE: Tactic: 3145727 
time 1.84214 [TensorRT] VERBOSE: Tactic: 3473407 time 2.79248 [TensorRT] VERBOSE: Tactic: 3604479 time 1.79018 [TensorRT] VERBOSE: Tactic: 3735551 time 2.77914 [TensorRT] VERBOSE: Tactic: 4390911 time 1.97398 [TensorRT] VERBOSE: Tactic: 5046271 time 2.66598 [TensorRT] VERBOSE: Tactic: 5963775 time 2.12426 [TensorRT] VERBOSE: Tactic: 6160383 time 4.31389 [TensorRT] VERBOSE: Tactic: 6488063 time 1.74528 [TensorRT] VERBOSE: Tactic: 6881279 time 1.93171 [TensorRT] VERBOSE: Tactic: 7274495 time 2.46541 [TensorRT] VERBOSE: Tactic: 7864319 time 2.26541 [TensorRT] VERBOSE: Tactic: 7995391 time 1.99203 [TensorRT] VERBOSE: Tactic: 8585215 time 1.94669 [TensorRT] VERBOSE: Tactic: 8847359 time 1.74029 [TensorRT] VERBOSE: Tactic: 8978431 time 2.03286 [TensorRT] VERBOSE: Tactic: 9043967 time 1.7425 [TensorRT] VERBOSE: Tactic: 9175039 time 1.78614 [TensorRT] VERBOSE: Tactic: 9502719 time 2.19946 [TensorRT] VERBOSE: Tactic: 9830399 time 1.83949 [TensorRT] VERBOSE: Tactic: 9961471 time 3.27987 [TensorRT] VERBOSE: Tactic: 10027007 time 2.20368 [TensorRT] VERBOSE: Tactic: 10092543 time 2.16467 [TensorRT] VERBOSE: Tactic: 10289151 time 2.18902 [TensorRT] VERBOSE: Tactic: 10485759 time 2.24282 [TensorRT] VERBOSE: Tactic: 10682367 time 1.95334 [TensorRT] VERBOSE: Tactic: 10813439 time 2.03197 [TensorRT] VERBOSE: Fastest Tactic: 8847359 Time: 1.74029 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CaskConvolution) [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Tactic: 66319348402778770 time 0.647168 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Tactic: 1931698692231796048 time 0.705184 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Tactic: 2271687430539765460 time 0.76544 [TensorRT] VERBOSE: Conv_142 
(i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Tactic: 7039764449991095921 time 0.63936 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_medium_nt_v1 [TensorRT] VERBOSE: Tactic: -9114895246540757312 time 0.652256 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Tactic: -8787970778927801941 time 0.715104 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Tactic: -8225786209923559953 time 0.8664 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_large_nt_v1 [TensorRT] VERBOSE: Tactic: -7373087278866484214 time 0.684096 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Tactic: -7274936339335021260 time 0.74448 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Tactic: -2102888629196925141 time 0.644192 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Tactic: -674235064782459186 time 0.790528 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Tactic: -182858804213663094 time 0.652352 [TensorRT] VERBOSE: Fastest Tactic: 7039764449991095921 Time: 0.63936 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid 
tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_142 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 7039764449991095921 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_large_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_medium_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 
[TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169,173056) -> Float(1,13,169,173056) [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWise) [TensorRT] VERBOSE: Tactic: 128 time 0.048128 [TensorRT] VERBOSE: Tactic: 256 time 0.048224 [TensorRT] VERBOSE: Tactic: 512 time 0.050016 [TensorRT] VERBOSE: Tactic: -32 time 0.065664 [TensorRT] VERBOSE: Tactic: -64 time 0.060544 [TensorRT] VERBOSE: Tactic: -128 time 0.057248 [TensorRT] VERBOSE: Fastest Tactic: 128 Time: 0.048128 [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWiseV2) [TensorRT] VERBOSE: Tactic: 0 time 0.035168 [TensorRT] VERBOSE: Tactic: 1 time 0.033728 [TensorRT] VERBOSE: Tactic: 2 time 0.030464 [TensorRT] VERBOSE: Tactic: 3 time 0.034496 [TensorRT] VERBOSE: Tactic: 4 time 0.032192 [TensorRT] VERBOSE: Tactic: 5 time 0.031104 [TensorRT] VERBOSE: Tactic: 6 time 0.035296 [TensorRT] VERBOSE: Tactic: 7 time 0.03072 [TensorRT] VERBOSE: Tactic: 8 time 0.031648 [TensorRT] VERBOSE: Tactic: 9 time 0.032608 [TensorRT] VERBOSE: Fastest Tactic: 2 Time: 0.030464 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: PointWiseV2 Tactic: 2 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169:32,5408) -> Float(1,13,169:32,5408) [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWise) [TensorRT] VERBOSE: Tactic: 128 time 0.047968 [TensorRT] VERBOSE: Tactic: 256 time 0.047488 [TensorRT] VERBOSE: Tactic: 512 time 0.049152 [TensorRT] VERBOSE: Tactic: -32 time 0.066432 [TensorRT] VERBOSE: Tactic: -64 time 0.061184 [TensorRT] VERBOSE: Tactic: -128 time 0.057344 [TensorRT] VERBOSE: Fastest Tactic: 256 Time: 0.047488 [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWiseV2) [TensorRT] VERBOSE: Tactic: 24 time 0.049984 [TensorRT] VERBOSE: Tactic: 25 time 0.045728 [TensorRT] VERBOSE: Tactic: 26 time 0.046656 [TensorRT] VERBOSE: Tactic: 27 time 0.040448 [TensorRT] VERBOSE: Fastest Tactic: 27 Time: 
0.040448 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: PointWiseV2 Tactic: 27 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:4,43264) -> Int8(1,13,169:4,43264) [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWise) [TensorRT] VERBOSE: Tactic: 128 time 0.059392 [TensorRT] VERBOSE: Tactic: 256 time 0.050752 [TensorRT] VERBOSE: Tactic: 512 time 0.0512 [TensorRT] VERBOSE: Tactic: -32 time 0.078624 [TensorRT] VERBOSE: Tactic: -64 time 0.072256 [TensorRT] VERBOSE: Tactic: -128 time 0.082336 [TensorRT] VERBOSE: Fastest Tactic: 256 Time: 0.050752 [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWiseV2) [TensorRT] VERBOSE: Tactic: 0 time 0.020576 [TensorRT] VERBOSE: Tactic: 1 time 0.016768 [TensorRT] VERBOSE: Tactic: 2 time 0.017408 [TensorRT] VERBOSE: Tactic: 3 time 0.017248 [TensorRT] VERBOSE: Tactic: 4 time 0.018336 [TensorRT] VERBOSE: Tactic: 5 time 0.019456 [TensorRT] VERBOSE: Tactic: 6 time 0.014496 [TensorRT] VERBOSE: Tactic: 7 time 0.013728 [TensorRT] VERBOSE: Tactic: 8 time 0.016736 [TensorRT] VERBOSE: Tactic: 9 time 0.023168 [TensorRT] VERBOSE: Tactic: 10 time 0.030496 [TensorRT] VERBOSE: Tactic: 11 time 0.024608 [TensorRT] VERBOSE: Tactic: 12 time 0.021984 [TensorRT] VERBOSE: Tactic: 13 time 0.020672 [TensorRT] VERBOSE: Tactic: 14 time 0.018688 [TensorRT] VERBOSE: Tactic: 15 time 0.019904 [TensorRT] VERBOSE: Tactic: 16 time 0.021184 [TensorRT] VERBOSE: Tactic: 17 time 0.019104 [TensorRT] VERBOSE: Tactic: 18 time 0.022272 [TensorRT] VERBOSE: Tactic: 19 time 0.026144 [TensorRT] VERBOSE: Tactic: 20 time 0.049056 [TensorRT] VERBOSE: Tactic: 21 time 0.034848 [TensorRT] VERBOSE: Tactic: 22 time 0.027648 [TensorRT] VERBOSE: Tactic: 23 time 0.021504 [TensorRT] VERBOSE: Fastest Tactic: 7 Time: 0.013728 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: PointWiseV2 Tactic: 7 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:32,5408) -> 
Int8(1,13,169:32,5408) [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWise) [TensorRT] VERBOSE: Tactic: 128 time 0.06064 [TensorRT] VERBOSE: Tactic: 256 time 0.050784 [TensorRT] VERBOSE: Tactic: 512 time 0.050784 [TensorRT] VERBOSE: Fastest Tactic: 256 Time: 0.050784 [TensorRT] VERBOSE: --------------- Timing Runner: LeakyRelu_144 (PointWiseV2) [TensorRT] VERBOSE: Tactic: 10 time 0.015136 [TensorRT] VERBOSE: Tactic: 11 time 0.014944 [TensorRT] VERBOSE: Tactic: 12 time 0.016512 [TensorRT] VERBOSE: Tactic: 13 time 0.014688 [TensorRT] VERBOSE: Tactic: 14 time 0.017408 [TensorRT] VERBOSE: Tactic: 15 time 0.02016 [TensorRT] VERBOSE: Tactic: 16 time 0.02032 [TensorRT] VERBOSE: Tactic: 17 time 0.02096 [TensorRT] VERBOSE: Tactic: 18 time 0.02656 [TensorRT] VERBOSE: Tactic: 19 time 0.028896 [TensorRT] VERBOSE: Tactic: 20 time 0.01616 [TensorRT] VERBOSE: Tactic: 21 time 0.013568 [TensorRT] VERBOSE: Tactic: 22 time 0.014336 [TensorRT] VERBOSE: Tactic: 23 time 0.015744 [TensorRT] VERBOSE: Tactic: 24 time 0.019104 [TensorRT] VERBOSE: Tactic: 25 time 0.016512 [TensorRT] VERBOSE: Tactic: 26 time 0.014912 [TensorRT] VERBOSE: Tactic: 27 time 0.01392 [TensorRT] VERBOSE: Fastest Tactic: 21 Time: 0.013568 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: PointWiseV2 Tactic: 21 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169,173056) -> Float(1,13,169,5070) [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 
(scudnn) Set Tactic Name: volta_scudnn_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (FusedConvActConvolution) [TensorRT] VERBOSE: Tactic: 589823 time 0.716544 [TensorRT] VERBOSE: Tactic: 786431 time 0.132704 [TensorRT] VERBOSE: Tactic: 1179647 time 0.078784 [TensorRT] VERBOSE: Tactic: 1310719 time 0.352256 [TensorRT] VERBOSE: Tactic: 1376255 time 0.457248 [TensorRT] VERBOSE: Tactic: 1441791 time 0.14832 [TensorRT] VERBOSE: Tactic: 1638399 time 0.102592 [TensorRT] VERBOSE: Tactic: 1835007 time 0.128 [TensorRT] VERBOSE: Tactic: 2097151 time 0.0512 [TensorRT] VERBOSE: Tactic: 2162687 time 0.457728 [TensorRT] VERBOSE: Tactic: 2293759 time 0.437248 [TensorRT] VERBOSE: Tactic: 2359295 time 0.242528 [TensorRT] VERBOSE: Tactic: 2686975 time 0.434176 [TensorRT] VERBOSE: Tactic: 3407871 time 0.220288 [TensorRT] VERBOSE: Tactic: 3538943 time 0.117472 [TensorRT] VERBOSE: Tactic: 3997695 time 0.073888 [TensorRT] VERBOSE: Tactic: 4194303 time 0.137216 [TensorRT] VERBOSE: Tactic: 4259839 time 0.05632 [TensorRT] VERBOSE: Tactic: 4325375 time 0.114304 [TensorRT] VERBOSE: Tactic: 4521983 time 0.408416 [TensorRT] VERBOSE: Tactic: 4587519 time 0.07488 [TensorRT] VERBOSE: Tactic: 4653055 time 0.085376 [TensorRT] VERBOSE: Tactic: 4915199 time 0.076864 [TensorRT] VERBOSE: Tactic: 4980735 time 0.210752 [TensorRT] VERBOSE: Tactic: 5177343 time 0.078848 [TensorRT] VERBOSE: Tactic: 5242879 time 0.211136 [TensorRT] VERBOSE: Tactic: 5373951 time 0.07904 [TensorRT] VERBOSE: Tactic: 5439487 time 0.135168 [TensorRT] VERBOSE: Tactic: 5701631 time 0.427008 [TensorRT] VERBOSE: Tactic: 5767167 time 0.213184 [TensorRT] VERBOSE: Tactic: 
5832703 time 0.224768 [TensorRT] VERBOSE: Tactic: 5898239 time 0.084896 [TensorRT] VERBOSE: Tactic: 6029311 time 0.413472 [TensorRT] VERBOSE: Tactic: 6225919 time 0.112416 [TensorRT] VERBOSE: Tactic: 6291455 time 0.078848 [TensorRT] VERBOSE: Tactic: 6750207 time 0.07648 [TensorRT] VERBOSE: Tactic: 6815743 time 0.210944 [TensorRT] VERBOSE: Tactic: 6946815 time 0.125824 [TensorRT] VERBOSE: Tactic: 7012351 time 0.0512 [TensorRT] VERBOSE: Tactic: 7077887 time 0.120128 [TensorRT] VERBOSE: Tactic: 7143423 time 0.210912 [TensorRT] VERBOSE: Tactic: 7208959 time 0.226848 [TensorRT] VERBOSE: Tactic: 7340031 time 0.085696 [TensorRT] VERBOSE: Tactic: 7405567 time 0.109984 [TensorRT] VERBOSE: Tactic: 7536639 time 0.364448 [TensorRT] VERBOSE: Tactic: 7602175 time 0.12368 [TensorRT] VERBOSE: Tactic: 7733247 time 0.130144 [TensorRT] VERBOSE: Tactic: 7798783 time 0.13392 [TensorRT] VERBOSE: Tactic: 8191999 time 0.117856 [TensorRT] VERBOSE: Tactic: 8257535 time 0.077856 [TensorRT] VERBOSE: Tactic: 8323071 time 0.136896 [TensorRT] VERBOSE: Tactic: 8650751 time 0.125696 [TensorRT] VERBOSE: Tactic: 8716287 time 0.119936 [TensorRT] VERBOSE: Tactic: 9109503 time 0.053888 [TensorRT] VERBOSE: Tactic: 9568255 time 0.07696 [TensorRT] VERBOSE: Tactic: 9895935 time 0.137984 [TensorRT] VERBOSE: Tactic: 10223615 time 0.43408 [TensorRT] VERBOSE: Tactic: 10354687 time 0.07168 [TensorRT] VERBOSE: Tactic: 10551295 time 0.234176 [TensorRT] VERBOSE: Tactic: 10747903 time 0.125824 [TensorRT] VERBOSE: Tactic: 10944511 time 0.211328 [TensorRT] VERBOSE: Fastest Tactic: 2097151 Time: 0.0512 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CaskConvolution) [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: 1754569683116234317 time 0.252096 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 1825138533642645384 time 
0.2544 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: 2733356012094739613 time 0.273664 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 3915320020053085238 time 0.253248 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 6808617066150061604 time 0.163168 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: 9091006216302412844 time 0.155424 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: -8060443123034038864 time 0.175072 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: -4420849921117327522 time 0.238752 [TensorRT] VERBOSE: Conv_147 || Conv_145 (scudnn) Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -3946921629105938337 time 0.279264 [TensorRT] VERBOSE: Fastest Tactic: 9091006216302412844 Time: 0.155424 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaConvolution) [TensorRT] VERBOSE: Tactic: 0 time 0.136608 [TensorRT] VERBOSE: Tactic: 2 time 0.331296 [TensorRT] VERBOSE: Tactic: 4 time 1.97347 [TensorRT] VERBOSE: Tactic: 5 time 0.520544 [TensorRT] VERBOSE: Tactic: 57 time 0.134624 [TensorRT] VERBOSE: Fastest Tactic: 57 Time: 0.134624 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 2097151 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:4,43264) -> 
Float(1,13,169,5070) [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_interior_nn_v1 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CaskConvolution) [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 892787096507693407 time 0.073728 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: 
volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 1204440019753223942 time 0.050176 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: 1659301557717208403 time 0.050656 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 2057291331119027912 time 0.048544 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Tactic: 3275977259705528576 time 0.046464 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Tactic: 5623454780463195174 time 0.056352 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -9204333525109552344 time 0.046688 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: -7924103240988931433 time 0.044736 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_interior_nn_v1 [TensorRT] VERBOSE: Tactic: -7489650117016530013 time 0.046656 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] VERBOSE: Tactic: -4973811344878172338 time 0.071936 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: -3908975881807046106 time 0.050496 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_interior_nn_v1 [TensorRT] VERBOSE: Tactic: -1765942417666394360 time 0.072384 [TensorRT] VERBOSE: Fastest Tactic: -7924103240988931433 Time: 
0.044736 [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -7924103240988931433 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_medium_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_xregs_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_small_nn_v1 [TensorRT] 
VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x32_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x128_relu_interior_nn_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:32,5408) -> Float(1,13,169,5070) [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CaskConvolution) [TensorRT] VERBOSE: CaskConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: Autotuning format combination: Int8(1,13,169:32,5408) -> Float(1,13,169:32,169) [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (FusedConvActConvolution) [TensorRT] VERBOSE: FusedConvActConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CaskConvolution) [TensorRT] VERBOSE: CaskConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaConvolution) [TensorRT] VERBOSE: CudaConvolution has no valid tactics for this config, skipping [TensorRT] 
VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaDepthwiseConvolution) [TensorRT] VERBOSE: CudaDepthwiseConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: Conv_147 || Conv_145 (CudaGroupConvolution) [TensorRT] VERBOSE: CudaGroupConvolution has no valid tactics for this config, skipping [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.009504 [TensorRT] VERBOSE: Tactic: 0 time 0.009376 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.009376 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.009472 [TensorRT] VERBOSE: Tactic: 0 time 0.007552 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.007552 [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.009792 [TensorRT] VERBOSE: Tactic: 0 time 0.005696 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005696 [TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169,5070) -> Float(1,1690) [TensorRT] VERBOSE: --------------- Timing Runner: Transpose_148 + Reshape_161 (Shuffle) [TensorRT] VERBOSE: Tactic: 0 time 0.005376 [TensorRT] VERBOSE: Tactic: 1 time 0.010976 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005376 [TensorRT] VERBOSE: --------------- Timing Runner: 310 copy (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.005824 [TensorRT] VERBOSE: Tactic: 0 time 0.005376 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005376 [TensorRT] VERBOSE: Autotuning format combination: Float(1,1690) -> Float(1,2) [TensorRT] VERBOSE: --------------- Timing Runner: Reshape_170 + (Unnamed Layer* 183) [Shuffle] (Shuffle) [TensorRT] VERBOSE: Tactic: 0 time 0.0056 [TensorRT] VERBOSE: Tactic: 1 time 0.011008 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.0056 [TensorRT] VERBOSE: Autotuning format combination: Float(1,2) -> Float(1,2) [TensorRT] VERBOSE: --------------- Timing Runner: (Unnamed Layer 184) [Softmax] 
(SoftMax) [TensorRT] VERBOSE: Tactic: 1001 time 0.079552 [TensorRT] VERBOSE: Fastest Tactic: 1001 Time: 0.079552 [TensorRT] VERBOSE: --------------- Timing Runner: (Unnamed Layer 184) [Softmax] (ExtSoftMax) [TensorRT] VERBOSE: Tactic: 0 time 0.01712 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.01712 [TensorRT] VERBOSE: >>>>>>>>>>>>>>> Chose Runner Type: ExtSoftMax Tactic: 0 [TensorRT] VERBOSE: [TensorRT] VERBOSE: --------------- Timing Runner: (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.009216 [TensorRT] VERBOSE: Tactic: 0 time 0.005376 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005376 [TensorRT] VERBOSE: Autotuning format combination: Float(1,13,169,5070) -> Float(1,3380) [TensorRT] VERBOSE: --------------- Timing Runner: Transpose_146 + Reshape_154 (Shuffle) [TensorRT] VERBOSE: Tactic: 0 time 0.005632 [TensorRT] VERBOSE: Tactic: 1 time 0.012736 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005632 [TensorRT] VERBOSE: --------------- Timing Runner: 301 copy (Reformat) [TensorRT] VERBOSE: Tactic: 1002 time 0.005792 [TensorRT] VERBOSE: Tactic: 0 time 0.0056 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.0056 [TensorRT] VERBOSE: Autotuning format combination: Float(1,3380) -> Float(1,4,3380) [TensorRT] VERBOSE: --------------- Timing Runner: Reshape_168 (Shuffle) [TensorRT] VERBOSE: Tactic: 0 time 0.005888 [TensorRT] VERBOSE: Tactic: 1 time 0.010944 [TensorRT] VERBOSE: Fastest Tactic: 0 Time: 0.005888 [TensorRT] VERBOSE: Adding reformat layer: Conv_0 reformatted input 0 (actual_input_1) from Float(1,416,173056,519168) to Int8(1,416,173056:4,173056) [TensorRT] VERBOSE: Adding reformat layer: Conv_4 reformatted input 0 (140) from Int8(1,208,43264:4,346112) to Int8(1,208,43264:32,43264) [TensorRT] VERBOSE: Adding reformat layer: Reshape_93 + Transpose_94 reformatted input 0 (204) from Int8(1,26,676:32,1352) to Float(1,26,676,43264) [TensorRT] VERBOSE: Adding reformat layer: Reshape_140 output to be reformatted 0 (285) from Int8(1,13,169:32,6760) to 
Float(1,13,169,216320) [TensorRT] VERBOSE: Adding reformat layer: Conv_147 || Conv_145 reformatted input 0 (289) from Int8(1,13,169:32,5408) to Int8(1,13,169:4,43264) [TensorRT] VERBOSE: For layer Reshape_93 + Transpose_94 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation. [TensorRT] VERBOSE: For layer Reshape_108 + Transpose_109 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation. [TensorRT] VERBOSE: For layer Reshape_123 + Transpose_124 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation. [TensorRT] VERBOSE: For layer Reshape_140 a non-conforming implementation was chosen than was requested i.e. requested layer computation precision and output precision types were ignored because it resulted in faster network performance. Enable strict mode to try force choose a conforming implementation. [TensorRT] VERBOSE: Formats and tactics selection completed in 178.061 seconds. [TensorRT] VERBOSE: After reformat layers: 66 layers [TensorRT] VERBOSE: Block size 1073741824 [TensorRT] VERBOSE: Block size 5537792 [TensorRT] VERBOSE: Block size 5537792 [TensorRT] VERBOSE: Block size 1384448 [TensorRT] VERBOSE: Block size 43520 [TensorRT] VERBOSE: Total Activation Memory: 1086245376 [TensorRT] INFO: Detected 1 inputs and 2 output network tensors. 
[TensorRT] VERBOSE: Conv_0 (icudnn) Set Tactic Name: volta_int8x4_icudnn_int8x4_128x32_relu_small_nn_v1 [TensorRT] VERBOSE: Conv_4 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_medium_nt_v1 [TensorRT] VERBOSE: Conv_8 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_11 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_medium_nt_v1 [TensorRT] VERBOSE: Conv_14 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_18 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_21 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_24 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_28 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_31 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_34 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_37 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_128x128_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_40 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_65 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_44 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_47 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_50 (i8816cudnn) Set Tactic Name: 
volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_53 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_56 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_59 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_62 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_singleBuffer_small_nt_v1 [TensorRT] VERBOSE: Conv_142 (i8816cudnn) Set Tactic Name: volta_int8_i8816cudnn_int8_256x64_ldg16_relu_small_nt_v1 [TensorRT] VERBOSE: Conv_147 || Conv_145 (icudnn) Set Tactic Name: volta_fp32_icudnn_int8x4_128x64_relu_interior_nn_v1 [TensorRT] VERBOSE: Layer: Conv_0 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_0 Weights: 0 HostPersistent: 1664 DevicePersistent: 1040384 [TensorRT] VERBOSE: Layer: LeakyRelu_2 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: MaxPool_3 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_4 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_4 Weights: 0 HostPersistent: 2176 DevicePersistent: 297984 [TensorRT] VERBOSE: Layer: LeakyRelu_6 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: MaxPool_7 Weights: 0 HostPersistent: 16 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_8 Weights: 0 HostPersistent: 1664 DevicePersistent: 215040 [TensorRT] VERBOSE: Layer: LeakyRelu_10 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_11 Weights: 0 HostPersistent: 2176 DevicePersistent: 82944 [TensorRT] VERBOSE: Layer: LeakyRelu_13 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_14 Weights: 0 HostPersistent: 1664 
DevicePersistent: 215040 [TensorRT] VERBOSE: Layer: LeakyRelu_16 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: MaxPool_17 Weights: 0 HostPersistent: 16 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_18 Weights: 0 HostPersistent: 1664 DevicePersistent: 611328 [TensorRT] VERBOSE: Layer: LeakyRelu_20 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_21 Weights: 0 HostPersistent: 1664 DevicePersistent: 84480 [TensorRT] VERBOSE: Layer: LeakyRelu_23 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_24 Weights: 0 HostPersistent: 1664 DevicePersistent: 611328 [TensorRT] VERBOSE: Layer: LeakyRelu_26 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: MaxPool_27 Weights: 0 HostPersistent: 16 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_28 Weights: 0 HostPersistent: 1664 DevicePersistent: 2373632 [TensorRT] VERBOSE: Layer: LeakyRelu_30 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_31 Weights: 0 HostPersistent: 1664 DevicePersistent: 271360 [TensorRT] VERBOSE: Layer: LeakyRelu_33 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_34 Weights: 0 HostPersistent: 1664 DevicePersistent: 2373632 [TensorRT] VERBOSE: Layer: LeakyRelu_36 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_37 Weights: 0 HostPersistent: 1664 DevicePersistent: 271360 [TensorRT] VERBOSE: Layer: LeakyRelu_39 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_40 Weights: 0 HostPersistent: 1664 DevicePersistent: 2373632 [TensorRT] VERBOSE: Layer: LeakyRelu_42 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_65 Weights: 0 HostPersistent: 1664 DevicePersistent: 71168 [TensorRT] VERBOSE: Layer: MaxPool_43 Weights: 0 HostPersistent: 16 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_44 Weights: 0 HostPersistent: 
1664 DevicePersistent: 9458688 [TensorRT] VERBOSE: Layer: LeakyRelu_46 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_47 Weights: 0 HostPersistent: 1664 DevicePersistent: 1059840 [TensorRT] VERBOSE: Layer: LeakyRelu_49 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_50 Weights: 0 HostPersistent: 1664 DevicePersistent: 9458688 [TensorRT] VERBOSE: Layer: LeakyRelu_52 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_53 Weights: 0 HostPersistent: 1664 DevicePersistent: 1059840 [TensorRT] VERBOSE: Layer: LeakyRelu_55 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_56 Weights: 0 HostPersistent: 1664 DevicePersistent: 9458688 [TensorRT] VERBOSE: Layer: LeakyRelu_58 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_59 Weights: 0 HostPersistent: 1664 DevicePersistent: 18895872 [TensorRT] VERBOSE: Layer: LeakyRelu_61 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_62 Weights: 0 HostPersistent: 1664 DevicePersistent: 18895872 [TensorRT] VERBOSE: Layer: LeakyRelu_67 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_93 + Transpose_94 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_93 + Transpose_94 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_108 + Transpose_109 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_123 + Transpose_124 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_140 output reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: LeakyRelu_64 Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_142 Weights: 0 HostPersistent: 1664 DevicePersistent: 23614464 [TensorRT] VERBOSE: Layer: LeakyRelu_144 
Weights: 0 HostPersistent: 300 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_147 || Conv_145 input reformatter 0 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Conv_147 || Conv_145 Weights: 0 HostPersistent: 3200 DevicePersistent: 33280 [TensorRT] VERBOSE: Layer: Transpose_148 + Reshape_161 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: 310 copy Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: (Unnamed Layer 184) [Softmax] Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Transpose_146 + Reshape_154 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: 301 copy Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Layer: Reshape_168 Weights: 0 HostPersistent: 0 DevicePersistent: 0 [TensorRT] VERBOSE: Total Host Persistent Memory: 47496 [TensorRT] VERBOSE: Total Device Persistent Memory: 102828544 [TensorRT] VERBOSE: Total Weight Memory: 0 [TensorRT] VERBOSE: Builder timing cache: created 293 entries, 402 hit(s) [TensorRT] VERBOSE: Engine generation completed in 178.845 seconds. 
[TensorRT] VERBOSE: Engine Layer Information: [TensorRT] VERBOSE: Layer(Reformat): Conv_0 input reformatter 0, Tactic: 0, actual_input_1[Float(3,416,416)] -> Conv_0 reformatted input 0[Int8(3,416,416)] [TensorRT] VERBOSE: Layer(icudnn): Conv_0, Tactic: -6282183216199417697, Conv_0 reformatted input 0[Int8(3,416,416)] -> 138[Int8(32,416,416)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_2, Tactic: 6, 138[Int8(32,416,416)] -> 139[Int8(32,416,416)] [TensorRT] VERBOSE: Layer(PoolingTiled): MaxPool_3, Tactic: 6291713, 139[Int8(32,416,416)] -> 140[Int8(32,208,208)] [TensorRT] VERBOSE: Layer(Reformat): Conv_4 input reformatter 0, Tactic: 0, 140[Int8(32,208,208)] -> Conv_4 reformatted input 0[Int8(32,208,208)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_4, Tactic: -9114895246540757312, Conv_4 reformatted input 0[Int8(32,208,208)] -> 142[Int8(64,208,208)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_6, Tactic: 27, 142[Int8(64,208,208)] -> 143[Int8(64,208,208)] [TensorRT] VERBOSE: Layer(Pooling): MaxPool_7, Tactic: -4, 143[Int8(64,208,208)] -> 144[Int8(64,104,104)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_8, Tactic: -182858804213663094, 144[Int8(64,104,104)] -> 146[Int8(128,104,104)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_10, Tactic: 27, 146[Int8(128,104,104)] -> 147[Int8(128,104,104)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_11, Tactic: -9114895246540757312, 147[Int8(128,104,104)] -> 149[Int8(64,104,104)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_13, Tactic: 26, 149[Int8(64,104,104)] -> 150[Int8(64,104,104)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_14, Tactic: -182858804213663094, 150[Int8(64,104,104)] -> 152[Int8(128,104,104)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_16, Tactic: 27, 152[Int8(128,104,104)] -> 153[Int8(128,104,104)] [TensorRT] VERBOSE: Layer(Pooling): MaxPool_17, Tactic: -4, 153[Int8(128,104,104)] -> 154[Int8(128,52,52)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_18, Tactic: -182858804213663094, 
154[Int8(128,52,52)] -> 156[Int8(256,52,52)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_20, Tactic: 27, 156[Int8(256,52,52)] -> 157[Int8(256,52,52)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_21, Tactic: -182858804213663094, 157[Int8(256,52,52)] -> 159[Int8(128,52,52)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_23, Tactic: 23, 159[Int8(128,52,52)] -> 160[Int8(128,52,52)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_24, Tactic: -182858804213663094, 160[Int8(128,52,52)] -> 162[Int8(256,52,52)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_26, Tactic: 27, 162[Int8(256,52,52)] -> 163[Int8(256,52,52)] [TensorRT] VERBOSE: Layer(Pooling): MaxPool_27, Tactic: -4, 163[Int8(256,52,52)] -> 164[Int8(256,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_28, Tactic: -2102888629196925141, 164[Int8(256,26,26)] -> 166[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_30, Tactic: 27, 166[Int8(512,26,26)] -> 167[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_31, Tactic: -182858804213663094, 167[Int8(512,26,26)] -> 169[Int8(256,26,26)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_33, Tactic: 22, 169[Int8(256,26,26)] -> 170[Int8(256,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_34, Tactic: -2102888629196925141, 170[Int8(256,26,26)] -> 172[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_36, Tactic: 21, 172[Int8(512,26,26)] -> 173[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_37, Tactic: -182858804213663094, 173[Int8(512,26,26)] -> 175[Int8(256,26,26)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_39, Tactic: 22, 175[Int8(256,26,26)] -> 176[Int8(256,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_40, Tactic: -2102888629196925141, 176[Int8(256,26,26)] -> 178[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_42, Tactic: 27, 178[Int8(512,26,26)] -> 179[Int8(512,26,26)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_65, Tactic: 7039764449991095921, 179[Int8(512,26,26)] -> 
203[Int8(64,26,26)] [TensorRT] VERBOSE: Layer(Pooling): MaxPool_43, Tactic: -4, 179[Int8(512,26,26)] -> 180[Int8(512,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_44, Tactic: -2102888629196925141, 180[Int8(512,13,13)] -> 182[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_46, Tactic: 27, 182[Int8(1024,13,13)] -> 183[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_47, Tactic: -2102888629196925141, 183[Int8(1024,13,13)] -> 185[Int8(512,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_49, Tactic: 22, 185[Int8(512,13,13)] -> 186[Int8(512,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_50, Tactic: -2102888629196925141, 186[Int8(512,13,13)] -> 188[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_52, Tactic: 22, 188[Int8(1024,13,13)] -> 189[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_53, Tactic: -2102888629196925141, 189[Int8(1024,13,13)] -> 191[Int8(512,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_55, Tactic: 11, 191[Int8(512,13,13)] -> 192[Int8(512,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_56, Tactic: -2102888629196925141, 192[Int8(512,13,13)] -> 194[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_58, Tactic: 22, 194[Int8(1024,13,13)] -> 195[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_59, Tactic: -2102888629196925141, 195[Int8(1024,13,13)] -> 197[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_61, Tactic: 22, 197[Int8(1024,13,13)] -> 198[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_62, Tactic: -2102888629196925141, 198[Int8(1024,13,13)] -> 200[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_67, Tactic: 25, 203[Int8(64,26,26)] -> 204[Int8(64,26,26)] [TensorRT] VERBOSE: Layer(Reformat): Reshape_93 + Transpose_94 input reformatter 0, Tactic: 0, 204[Int8(64,26,26)] -> Reshape_93 + Transpose_94 reformatted input 0[Float(64,26,26)] [TensorRT] VERBOSE: Layer(Shuffle): 
Reshape_93 + Transpose_94, Tactic: 0, Reshape_93 + Transpose_94 reformatted input 0[Float(64,26,26)] -> 235[Float(64,13,13,2,2)] [TensorRT] VERBOSE: Layer(Shuffle): Reshape_108 + Transpose_109, Tactic: 0, 235[Float(64,13,13,2,2)] -> 252[Float(64,4,169)] [TensorRT] VERBOSE: Layer(Shuffle): Reshape_123 + Transpose_124, Tactic: 0, 252[Float(64,4,169)] -> 269[Float(4,64,13,13)] [TensorRT] VERBOSE: Layer(Reformat): Reshape_140 output reformatter 0, Tactic: 1002, Reshape_140 output to be reformatted 0[Float(256,13,13)] -> 286[Int8(256,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_64, Tactic: 21, 200[Int8(1024,13,13)] -> 286[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(i8816cudnn): Conv_142, Tactic: 7039764449991095921, 286[Int8(1280,13,13)] -> 288[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(PointWiseV2): LeakyRelu_144, Tactic: 21, 288[Int8(1024,13,13)] -> 289[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(Reformat): Conv_147 || Conv_145 input reformatter 0, Tactic: 0, 289[Int8(1024,13,13)] -> Conv_147 || Conv_145 reformatted input 0[Int8(1024,13,13)] [TensorRT] VERBOSE: Layer(icudnn): Conv_147 || Conv_145, Tactic: -7924103240988931433, Conv_147 || Conv_145 reformatted input 0[Int8(1024,13,13)] -> Conv_147 || Conv_145[Float(30,13,13)] [TensorRT] VERBOSE: Layer(Shuffle): Transpose_148 + Reshape_161, Tactic: 0, Conv_147 || Conv_145[Float(10,13,13)] -> 310[Float(1690)] [TensorRT] VERBOSE: Layer(Reformat): 310 copy, Tactic: 0, 310[Float(1690)] -> 311[Float(1690)] [TensorRT] VERBOSE: Layer(ExtSoftMax): (Unnamed Layer 184) [Softmax], Tactic: 0, (Unnamed Layer* 183) [Shuffle]_output[Float(2)] -> output_conf[Float(2)] [TensorRT] VERBOSE: Layer(Shuffle): Transpose_146 + Reshape_154, Tactic: 0, Conv_147 || Conv_145[Float(20,13,13)] -> 301[Float(3380)] [TensorRT] VERBOSE: Layer(Reformat): 301 copy, Tactic: 0, 301[Float(3380)] -> 302[Float(3380)] [TensorRT] VERBOSE: Layer(Shuffle): Reshape_168, Tactic: 0, 302[Float(3380)] -> output_loc[Float(845,4)] Completed creating the 
engine onnx to tensorrt completed

ttyio commented 3 years ago

Hello @QZ-cmd, TRT only provides the calibration algorithm that generates the INT8 scales, and the inference solution that runs the network in the given precision. So for most INT8 accuracy issues the TRT functionality itself is OK; the problem might be:

So this is why the INT8 build does not fail but the accuracy drops.

Besides exploring the different calibration methods, another approach is to enable mixed precision: mark some of the layers to run in higher precision (half, float) while the rest of the network runs in INT8. Here is some code that you could follow: https://github.com/NVIDIA/TensorRT/blob/release/7.1/demo/BERT/builder.py#L590 https://github.com/NVIDIA/TensorRT/blob/release/7.1/demo/BERT/builder.py#L245
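As a rough illustration of the mixed-precision idea, here is a minimal sketch (not the code from the linked BERT demo). It assumes `network` behaves like a `tensorrt.INetworkDefinition` and `dtype` is something like `tensorrt.float32` or `tensorrt.float16`; the helper name `pin_layers_to_higher_precision` is made up for this example. It only relies on `num_layers`, `get_layer()`, `ILayer.name`, `ILayer.precision`, `ILayer.num_outputs` and `ILayer.set_output_type()`.

```python
def pin_layers_to_higher_precision(network, layer_names, dtype):
    """Request that the named layers run in `dtype` instead of INT8.

    Assumption: `network` exposes the tensorrt.INetworkDefinition
    attributes used below. TensorRT itself is not imported, so the
    sketch can be read (and dry-run with stand-in objects) without it.
    Returns the names of the layers that were actually found and pinned.
    """
    wanted = set(layer_names)
    pinned = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if layer.name not in wanted:
            continue
        # Computation precision requested for this layer.
        layer.precision = dtype
        # Also request the same type for every output tensor of the layer.
        for j in range(layer.num_outputs):
            layer.set_output_type(j, dtype)
        pinned.append(layer.name)
    return pinned
```

With a real network this might be called as `pin_layers_to_higher_precision(network, ["Conv_147 || Conv_145"], trt.float32)` before building. As the log above hints ("Enable strict mode to try force choose a conforming implementation"), the builder may ignore these requests unless strict types are enabled, which in TRT 7.x would be `config.set_flag(trt.BuilderFlag.STRICT_TYPES)` alongside `trt.BuilderFlag.INT8`. Note that layer names in the parsed network can differ from the fused names printed in the engine log.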

If you do not know which layers are sensitive to accuracy, you could even run half of the network in FP32 and then use divide-and-conquer to narrow down the problem.
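The divide-and-conquer search could be sketched roughly as follows. This is only an illustration under a strong assumption: that a single layer is responsible for the accuracy drop. `find_sensitive_layer` and the `accuracy_ok` callback are hypothetical names; `accuracy_ok(fp32_layers)` is expected to build an engine with the given layers in FP32 and everything else in INT8, evaluate it, and return `True` if the accuracy is acceptable.

```python
def find_sensitive_layer(layer_names, accuracy_ok):
    """Bisect the layer list to locate one accuracy-sensitive layer.

    Assumption: exactly one layer causes the INT8 accuracy drop, so
    running that layer in FP32 is enough to recover the accuracy.
    `accuracy_ok` is a user-supplied (hypothetical) callback; each call
    typically means one engine build plus one evaluation run.
    """
    candidates = list(layer_names)
    if not candidates:
        raise ValueError("layer_names must not be empty")
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        if accuracy_ok(first_half):
            # FP32 on the first half fixed it: the culprit is in there.
            candidates = first_half
        else:
            # Otherwise the culprit must be among the remaining layers.
            candidates = candidates[mid:]
    return candidates[0]
```

Each step halves the candidate set, so a network with about 66 layers (as in the log above) would need roughly 7 builds rather than 66. If several layers are sensitive at once, the single-culprit assumption breaks and the search would need to keep already-identified layers in FP32 and repeat.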

ttyio commented 3 years ago

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

315386775 commented 2 years ago

@QZ-cmd have you solved it?