Closed: AdanWang closed this issue 1 year ago
[ { "count" : 1329 } , { "name" : "QuantizeLinear_111", "timeMs" : 115.151, "averageMs" : 0.0866452, "medianMs" : 0.08704, "percentage" : 1.09893 } , { "name" : "pts_backbone.blocks.0.0.weight + QuantizeLinear_116 + Conv_118 + Relu_119", "timeMs" : 364.868, "averageMs" : 0.274543, "medianMs" : 0.270336, "percentage" : 3.48206 } , { "name" : "pts_backbone.blocks.0.3.weight + QuantizeLinear_129 + Conv_131 + Relu_132", "timeMs" : 364.141, "averageMs" : 0.273996, "medianMs" : 0.269312, "percentage" : 3.47513 } , { "name" : "pts_backbone.blocks.0.6.weight + QuantizeLinear_142 + Conv_144 + Relu_145", "timeMs" : 364.155, "averageMs" : 0.274007, "medianMs" : 0.270336, "percentage" : 3.47526 } , { "name" : "pts_backbone.blocks.0.9.weight + QuantizeLinear_155 + Conv_157 + Relu_158", "timeMs" : 363.725, "averageMs" : 0.273683, "medianMs" : 0.269312, "percentage" : 3.47116 } , { "name" : "pts_neck.deblocks.0.0.weight + QuantizeLinear_324 + Conv_326 + Relu_327", "timeMs" : 102.013, "averageMs" : 0.0767592, "medianMs" : 0.075776, "percentage" : 0.973546 } , { "name" : "pts_backbone.blocks.1.0.weight + QuantizeLinear_168 + Conv_170 + Relu_171", "timeMs" : 195.417, "averageMs" : 0.147041, "medianMs" : 0.142336, "percentage" : 1.86493 } , { "name" : "pts_backbone.blocks.1.3.weight + QuantizeLinear_181 + Conv_183 + Relu_184", "timeMs" : 355.65, "averageMs" : 0.267607, "medianMs" : 0.26624, "percentage" : 3.39409 } , { "name" : "pts_backbone.blocks.1.6.weight + QuantizeLinear_194 + Conv_196 + Relu_197", "timeMs" : 354.579, "averageMs" : 0.266801, "medianMs" : 0.265216, "percentage" : 3.38387 } , { "name" : "pts_backbone.blocks.1.9.weight + QuantizeLinear_207 + Conv_209 + Relu_210", "timeMs" : 354.532, "averageMs" : 0.266766, "medianMs" : 0.265216, "percentage" : 3.38342 } , { "name" : "pts_backbone.blocks.1.12.weight + QuantizeLinear_220 + Conv_222 + Relu_223", "timeMs" : 354.55, "averageMs" : 0.26678, "medianMs" : 0.265216, "percentage" : 3.3836 } , { "name" : 
"pts_backbone.blocks.1.15.weight + QuantizeLinear_233 + Conv_235 + Relu_236", "timeMs" : 355.573, "averageMs" : 0.267549, "medianMs" : 0.26624, "percentage" : 3.39336 } , { "name" : "pts_neck.deblocks.1.0.weight + QuantizeLinear_337 + Conv_339 + Relu_340", "timeMs" : 66.1585, "averageMs" : 0.0497806, "medianMs" : 0.049152, "percentage" : 0.631374 } , { "name" : "pts_backbone.blocks.2.0.weight + QuantizeLinear_246 + Conv_248 + Relu_249", "timeMs" : 186.386, "averageMs" : 0.140245, "medianMs" : 0.13824, "percentage" : 1.77874 } , { "name" : "pts_backbone.blocks.2.3.weight + QuantizeLinear_259 + Conv_261 + Relu_262", "timeMs" : 362.567, "averageMs" : 0.272812, "medianMs" : 0.26624, "percentage" : 3.46011 } , { "name" : "pts_backbone.blocks.2.6.weight + QuantizeLinear_272 + Conv_274 + Relu_275", "timeMs" : 366.116, "averageMs" : 0.275482, "medianMs" : 0.267264, "percentage" : 3.49398 } , { "name" : "pts_backbone.blocks.2.9.weight + QuantizeLinear_285 + Conv_287 + Relu_288", "timeMs" : 362.146, "averageMs" : 0.272495, "medianMs" : 0.26624, "percentage" : 3.45609 } , { "name" : "pts_backbone.blocks.2.12.weight + QuantizeLinear_298 + Conv_300 + Relu_301", "timeMs" : 364.159, "averageMs" : 0.274009, "medianMs" : 0.26624, "percentage" : 3.4753 } , { "name" : "pts_backbone.blocks.2.15.weight + QuantizeLinear_311 + Conv_313 + Relu_314", "timeMs" : 361.934, "averageMs" : 0.272336, "medianMs" : 0.26624, "percentage" : 3.45407 } , { "name" : "pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352", "timeMs" : 624.911, "averageMs" : 0.470211, "medianMs" : 0.468992, "percentage" : 5.96375 } , { "name" : "BatchNormalization_353 + Relu_354", "timeMs" : 94.4735, "averageMs" : 0.0710862, "medianMs" : 0.070656, "percentage" : 0.901595 } , { "name" : "QuantizeLinear_360_clone_2", "timeMs" : 64.5658, "averageMs" : 0.0485823, "medianMs" : 0.048128, "percentage" : 0.616175 } , { "name" : "pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368", 
"timeMs" : 540.033, "averageMs" : 0.406345, "medianMs" : 0.39424, "percentage" : 5.15373 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 + Conv_862 + Relu_863 || pts_bbox_head.task_heads.3.rot.0.conv.weight + QuantizeLinear_835 + Conv_837 + Relu_838 || pts_bbox_head.task_heads.3.dim.0.conv.weight + QuantizeLinear_810 + Conv_812 + Relu_813 || pts_bbox_head.task_heads.3.height.0.conv.weight + QuantizeLinear_784 + Conv_786 + Relu_787 || pts_bbox_head.task_heads.3.reg.0.conv.weight + QuantizeLinear_759 + Conv_761 + Relu_762 || pts_bbox_head.task_heads.2.heatmap.0.conv.weight + QuantizeLinear_733 + Conv_735 + Relu_736 || pts_bbox_head.task_heads.2.rot.0.conv.weight + QuantizeLinear_708 + Conv_710 + Relu_711 || pts_bbox_head.task_heads.2.dim.0.conv.weight + QuantizeLinear_683 + Conv_685 + Relu_686", "timeMs" : 718.96, "averageMs" : 0.540978, "medianMs" : 0.543744, "percentage" : 6.8613 } , { "name" : "pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482", "timeMs" : 712.105, "averageMs" : 0.53582, "medianMs" : 0.538624, "percentage" : 6.79587 } , { "name" : "pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || 
pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381", "timeMs" : 276.612, "averageMs" : 0.208135, "medianMs" : 0.205824, "percentage" : 2.6398 } , { "name" : "pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406", "timeMs" : 101.168, "averageMs" : 0.0761233, "medianMs" : 0.075776, "percentage" : 0.965482 } , { "name" : "Reformatting CopyNode for Output Tensor 0 to pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406", "timeMs" : 55.6095, "averageMs" : 0.0418431, "medianMs" : 0.041984, "percentage" : 0.530701 } , { "name" : "pts_bbox_head.task_heads.0.reg.1.weight + QuantizeLinear_391 + Conv_393", "timeMs" : 78.5317, "averageMs" : 0.0590908, "medianMs" : 0.059392, "percentage" : 0.749456 } , { "name" : "Reformatting CopyNode for Input Tensor 0 to pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419", "timeMs" : 35.9069, "averageMs" : 0.027018, "medianMs" : 0.026624, "percentage" : 0.342672 } , { "name" : "pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419", "timeMs" : 72.8603, "averageMs" : 0.0548234, "medianMs" : 0.054272, "percentage" : 0.695332 } , { "name" : "pts_bbox_head.task_heads.0.dim.1.weight + QuantizeLinear_442 + Conv_444", "timeMs" : 73.5907, "averageMs" : 0.055373, "medianMs" : 0.055296, "percentage" : 0.702302 } , { "name" : "PWN(Exp_885)", "timeMs" : 8.46949, "averageMs" : 0.00637283, "medianMs" : 0.006144, "percentage" : 0.0808274 } , { "name" : "pts_bbox_head.task_heads.0.rot.1.weight + QuantizeLinear_467 + Conv_469", "timeMs" : 75.0195, "averageMs" : 0.0564481, "medianMs" : 0.05632, "percentage" : 0.715938 } , { "name" : "pts_bbox_head.task_heads.0.heatmap.1.weight + QuantizeLinear_493 + Conv_495", "timeMs" : 72.8726, "averageMs" : 0.0548326, "medianMs" : 0.054272, "percentage" : 0.695449 } , { "name" : "pts_bbox_head.task_heads.1.reg.1.weight + QuantizeLinear_518 + Conv_520", "timeMs" : 
73.4473, "averageMs" : 0.0552651, "medianMs" : 0.055296, "percentage" : 0.700934 } , { "name" : "pts_bbox_head.task_heads.1.height.1.weight + QuantizeLinear_544 + Conv_546", "timeMs" : 72.9791, "averageMs" : 0.0549128, "medianMs" : 0.054272, "percentage" : 0.696466 } , { "name" : "pts_bbox_head.task_heads.1.dim.1.weight + QuantizeLinear_569 + Conv_571", "timeMs" : 74.3974, "averageMs" : 0.05598, "medianMs" : 0.055296, "percentage" : 0.710001 } , { "name" : "PWN(Exp_894)", "timeMs" : 8.59645, "averageMs" : 0.00646836, "medianMs" : 0.006144, "percentage" : 0.082039 } , { "name" : "pts_bbox_head.task_heads.1.rot.1.weight + QuantizeLinear_594 + Conv_596", "timeMs" : 73.2762, "averageMs" : 0.0551364, "medianMs" : 0.055296, "percentage" : 0.699301 } , { "name" : "pts_bbox_head.task_heads.1.heatmap.1.weight + QuantizeLinear_620 + Conv_622", "timeMs" : 73.5139, "averageMs" : 0.0553152, "medianMs" : 0.055296, "percentage" : 0.70157 } , { "name" : "pts_bbox_head.task_heads.2.reg.1.weight + QuantizeLinear_645 + Conv_647", "timeMs" : 73.9685, "averageMs" : 0.0556573, "medianMs" : 0.055296, "percentage" : 0.705908 } , { "name" : "pts_bbox_head.task_heads.2.height.1.weight + QuantizeLinear_671 + Conv_673", "timeMs" : 73.0457, "averageMs" : 0.0549629, "medianMs" : 0.054272, "percentage" : 0.697101 } , { "name" : "pts_bbox_head.task_heads.2.dim.1.weight + QuantizeLinear_696 + Conv_698", "timeMs" : 73.2783, "averageMs" : 0.055138, "medianMs" : 0.055296, "percentage" : 0.699322 } , { "name" : "PWN(Exp_903)", "timeMs" : 8.60567, "averageMs" : 0.00647529, "medianMs" : 0.006144, "percentage" : 0.082127 } , { "name" : "pts_bbox_head.task_heads.2.rot.1.weight + QuantizeLinear_721 + Conv_723", "timeMs" : 73.6224, "averageMs" : 0.0553968, "medianMs" : 0.055296, "percentage" : 0.702605 } , { "name" : "pts_bbox_head.task_heads.2.heatmap.1.weight + QuantizeLinear_747 + Conv_749", "timeMs" : 72.9269, "averageMs" : 0.0548735, "medianMs" : 0.054272, "percentage" : 0.695968 } , { "name" : 
"pts_bbox_head.task_heads.3.reg.1.weight + QuantizeLinear_772 + Conv_774", "timeMs" : 73.4781, "averageMs" : 0.0552882, "medianMs" : 0.055296, "percentage" : 0.701227 } , { "name" : "pts_bbox_head.task_heads.3.height.1.weight + QuantizeLinear_798 + Conv_800", "timeMs" : 73.4033, "averageMs" : 0.055232, "medianMs" : 0.055296, "percentage" : 0.700514 } , { "name" : "pts_bbox_head.task_heads.3.dim.1.weight + QuantizeLinear_823 + Conv_825", "timeMs" : 74.5949, "averageMs" : 0.0561286, "medianMs" : 0.05632, "percentage" : 0.711886 } , { "name" : "PWN(Exp_912)", "timeMs" : 8.59747, "averageMs" : 0.00646913, "medianMs" : 0.006144, "percentage" : 0.0820488 } , { "name" : "pts_bbox_head.task_heads.3.rot.1.weight + QuantizeLinear_848 + Conv_850", "timeMs" : 72.0511, "averageMs" : 0.0542145, "medianMs" : 0.054272, "percentage" : 0.687609 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 + Conv_876", "timeMs" : 74.1088, "averageMs" : 0.0557628, "medianMs" : 0.055296, "percentage" : 0.707247 } , { "name" : "{ForeignNode[onnx::Gather_1010...Concat_913]}", "timeMs" : 31.0906, "averageMs" : 0.023394, "medianMs" : 0.023552, "percentage" : 0.296708 } ]
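A dump like the one above is easier to read once it is sorted. The following is a minimal sketch, assuming the profile is a JSON array whose first element holds the iteration `count` and whose remaining elements are per-layer records with the `name` and `averageMs` keys seen above. The helper name `rank_layers` is made up for illustration, and only a few entries are copied from the dump; a real run would load the whole file.

```python
import json

# Excerpt of the profile above (entries copied verbatim, most layers omitted).
profile_text = """
[ { "count" : 1329 }
, { "name" : "QuantizeLinear_111", "timeMs" : 115.151, "averageMs" : 0.0866452, "medianMs" : 0.08704, "percentage" : 1.09893 }
, { "name" : "pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352", "timeMs" : 624.911, "averageMs" : 0.470211, "medianMs" : 0.468992, "percentage" : 5.96375 }
, { "name" : "pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368", "timeMs" : 540.033, "averageMs" : 0.406345, "medianMs" : 0.39424, "percentage" : 5.15373 }
, { "name" : "PWN(Exp_885)", "timeMs" : 8.46949, "averageMs" : 0.00637283, "medianMs" : 0.006144, "percentage" : 0.0808274 }
]
"""


def rank_layers(text, top=3):
    """Return (name, averageMs) pairs for the slowest layers, slowest first."""
    entries = json.loads(text)
    # Skip the leading {"count": ...} record, which has no "name" key.
    layers = [e for e in entries if "name" in e]
    layers.sort(key=lambda e: e["averageMs"], reverse=True)
    return [(e["name"], e["averageMs"]) for e in layers[:top]]


for name, avg_ms in rank_layers(profile_text):
    print(f"{avg_ms:8.4f} ms  {name}")
```

Sorting by `averageMs` rather than `timeMs` keeps the ranking independent of how many iterations were profiled.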
{"Layers": ["QuantizeLinear_111" ,"pts_backbone.blocks.0.0.weight + QuantizeLinear_116 + Conv_118 + Relu_119" ,"pts_backbone.blocks.0.3.weight + QuantizeLinear_129 + Conv_131 + Relu_132" ,"pts_backbone.blocks.0.6.weight + QuantizeLinear_142 + Conv_144 + Relu_145" ,"pts_backbone.blocks.0.9.weight + QuantizeLinear_155 + Conv_157 + Relu_158" ,"pts_neck.deblocks.0.0.weight + QuantizeLinear_324 + Conv_326 + Relu_327" ,"pts_backbone.blocks.1.0.weight + QuantizeLinear_168 + Conv_170 + Relu_171" ,"pts_backbone.blocks.1.3.weight + QuantizeLinear_181 + Conv_183 + Relu_184" ,"pts_backbone.blocks.1.6.weight + QuantizeLinear_194 + Conv_196 + Relu_197" ,"pts_backbone.blocks.1.9.weight + QuantizeLinear_207 + Conv_209 + Relu_210" ,"pts_backbone.blocks.1.12.weight + QuantizeLinear_220 + Conv_222 + Relu_223" ,"pts_backbone.blocks.1.15.weight + QuantizeLinear_233 + Conv_235 + Relu_236" ,"pts_neck.deblocks.1.0.weight + QuantizeLinear_337 + Conv_339 + Relu_340" ,"pts_backbone.blocks.2.0.weight + QuantizeLinear_246 + Conv_248 + Relu_249" ,"pts_backbone.blocks.2.3.weight + QuantizeLinear_259 + Conv_261 + Relu_262" ,"pts_backbone.blocks.2.6.weight + QuantizeLinear_272 + Conv_274 + Relu_275" ,"pts_backbone.blocks.2.9.weight + QuantizeLinear_285 + Conv_287 + Relu_288" ,"pts_backbone.blocks.2.12.weight + QuantizeLinear_298 + Conv_300 + Relu_301" ,"pts_backbone.blocks.2.15.weight + QuantizeLinear_311 + Conv_313 + Relu_314" ,"pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352" ,"BatchNormalization_353 + Relu_354" ,"QuantizeLinear_360_clone_2" ,"pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368" ,"pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 + Conv_862 + Relu_863 || pts_bbox_head.task_heads.3.rot.0.conv.weight + QuantizeLinear_835 + Conv_837 + Relu_838 || pts_bbox_head.task_heads.3.dim.0.conv.weight + QuantizeLinear_810 + Conv_812 + Relu_813 || pts_bbox_head.task_heads.3.height.0.conv.weight + QuantizeLinear_784 + 
Conv_786 + Relu_787 || pts_bbox_head.task_heads.3.reg.0.conv.weight + QuantizeLinear_759 + Conv_761 + Relu_762 || pts_bbox_head.task_heads.2.heatmap.0.conv.weight + QuantizeLinear_733 + Conv_735 + Relu_736 || pts_bbox_head.task_heads.2.rot.0.conv.weight + QuantizeLinear_708 + Conv_710 + Relu_711 || pts_bbox_head.task_heads.2.dim.0.conv.weight + QuantizeLinear_683 + Conv_685 + Relu_686" ,"pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482" ,"pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381" ,"pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406" ,"Reformatting CopyNode for Output Tensor 0 to pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406" ,"pts_bbox_head.task_heads.0.reg.1.weight + QuantizeLinear_391 + Conv_393" ,"Reformatting CopyNode for Input Tensor 0 to pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419" ,"pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419" ,"pts_bbox_head.task_heads.0.dim.1.weight + QuantizeLinear_442 + Conv_444" 
,"PWN(Exp_885)" ,"pts_bbox_head.task_heads.0.rot.1.weight + QuantizeLinear_467 + Conv_469" ,"pts_bbox_head.task_heads.0.heatmap.1.weight + QuantizeLinear_493 + Conv_495" ,"pts_bbox_head.task_heads.1.reg.1.weight + QuantizeLinear_518 + Conv_520" ,"pts_bbox_head.task_heads.1.height.1.weight + QuantizeLinear_544 + Conv_546" ,"pts_bbox_head.task_heads.1.dim.1.weight + QuantizeLinear_569 + Conv_571" ,"PWN(Exp_894)" ,"pts_bbox_head.task_heads.1.rot.1.weight + QuantizeLinear_594 + Conv_596" ,"pts_bbox_head.task_heads.1.heatmap.1.weight + QuantizeLinear_620 + Conv_622" ,"pts_bbox_head.task_heads.2.reg.1.weight + QuantizeLinear_645 + Conv_647" ,"pts_bbox_head.task_heads.2.height.1.weight + QuantizeLinear_671 + Conv_673" ,"pts_bbox_head.task_heads.2.dim.1.weight + QuantizeLinear_696 + Conv_698" ,"PWN(Exp_903)" ,"pts_bbox_head.task_heads.2.rot.1.weight + QuantizeLinear_721 + Conv_723" ,"pts_bbox_head.task_heads.2.heatmap.1.weight + QuantizeLinear_747 + Conv_749" ,"pts_bbox_head.task_heads.3.reg.1.weight + QuantizeLinear_772 + Conv_774" ,"pts_bbox_head.task_heads.3.height.1.weight + QuantizeLinear_798 + Conv_800" ,"pts_bbox_head.task_heads.3.dim.1.weight + QuantizeLinear_823 + Conv_825" ,"PWN(Exp_912)" ,"pts_bbox_head.task_heads.3.rot.1.weight + QuantizeLinear_848 + Conv_850" ,"pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 + Conv_876" ,"{ForeignNode[onnx::Gather_1010...Concat_913]}" ], "Bindings": ["inp" ,"onnx::DequantizeLinear_544" ,"rotc" ,"rots" ,"height" ,"dim" ,"reg" ,"heatmap" ]}
That is how I modified the model; with this change, the output height[0, 0, :, :] becomes correct.
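In the layer list above, names joined by ` || ` appear to be parallel convolutions that TensorRT merged into a single engine layer, while ` + ` joins the ONNX nodes fused within one branch. A rough way to check where a particular ONNX node ended up (for example `Conv_405` from the `height` head) is to split the names on those separators. This is only a sketch: the helper name `find_layers` is made up, the separator interpretation is an assumption based on the names in this dump, and only three layers are copied from the list.

```python
import json

# Excerpt of the layer info above (three layers copied verbatim, bindings in full).
layer_info_text = """
{"Layers": [
 "pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381",
 "pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406",
 "Reformatting CopyNode for Output Tensor 0 to pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406"
], "Bindings": ["inp", "onnx::DequantizeLinear_544", "rotc", "rots", "height", "dim", "reg", "heatmap"]}
"""


def find_layers(text, onnx_node):
    """Return (layer_index, branch_count) for each engine layer naming onnx_node."""
    hits = []
    for index, layer in enumerate(json.loads(text)["Layers"]):
        branches = layer.split(" || ")
        # Compare whole tokens so that "Conv_40" does not match "Conv_405".
        tokens = [tok.strip() for branch in branches for tok in branch.split(" + ")]
        if onnx_node in tokens:
            hits.append((index, len(branches)))
    return hits


print(find_layers(layer_info_text, "Conv_405"))  # height head, first conv
print(find_layers(layer_info_text, "Conv_456"))  # rot head, first conv
```

In this excerpt the `height` convolution sits in its own layer (plus a reformatting copy), whereas the `rot`, `dim` and `reg` convolutions of the same task head share one merged layer.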
I don't quite understand the issue here. Could you provide detailed reproduction steps? Thanks.
FYI: I checked it with TRT 8.5.1.7 (docker 22.12).
[I] Accuracy Comparison | trt-runner-N0-03/28/23-13:32:44 vs. onnxrt-runner-N0-03/28/23-13:32:44
[I] Comparing Output: 'rotc' (dtype=float32, shape=(1, 4, 160, 320)) with 'rotc' (dtype=float32, shape=(1, 4, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: rotc | Stats: mean=-0.014114, std-dev=0.53309, var=0.28419, median=-0.13501, min=-1.1774 at (0, 3, 80, 309), max=1.16 at (0, 2, 5, 247), avg-magnitude=0.43398
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-1.18 , -0.944 ) | 1781 | #
(-0.944 , -0.71 ) | 17061 | ##########
(-0.71 , -0.476 ) | 20764 | ############
(-0.476 , -0.242 ) | 27990 | #################
(-0.242 , -0.00872) | 64901 | ########################################
(-0.00872, 0.225 ) | 16286 | ##########
(0.225 , 0.459 ) | 7238 | ####
(0.459 , 0.693 ) | 9445 | #####
(0.693 , 0.926 ) | 30803 | ##################
(0.926 , 1.16 ) | 8531 | #####
[I] onnxrt-runner-N0-03/28/23-13:32:44: rotc | Stats: mean=-0.014614, std-dev=0.53311, var=0.28421, median=-0.1359, min=-1.1762 at (0, 3, 80, 309), max=1.16 at (0, 2, 5, 247), avg-magnitude=0.43415
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-1.18 , -0.944 ) | 1779 | #
(-0.944 , -0.71 ) | 17057 | ##########
(-0.71 , -0.476 ) | 20819 | ############
(-0.476 , -0.242 ) | 28031 | #################
(-0.242 , -0.00872) | 64988 | ########################################
(-0.00872, 0.225 ) | 16179 | #########
(0.225 , 0.459 ) | 7184 | ####
(0.459 , 0.693 ) | 9418 | #####
(0.693 , 0.926 ) | 30816 | ##################
(0.926 , 1.16 ) | 8529 | #####
[I] Error Metrics: rotc
[I] Minimum Required Tolerance: elemwise error | [abs=0.34744] OR [rel=1632.9] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.0055609, std-dev=0.013131, var=0.00017243, median=0.0012105, min=0 at (0, 0, 6, 319), max=0.34744 at (0, 3, 144, 281), avg-magnitude=0.0055609
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.0347) | 198457 | ########################################
(0.0347, 0.0695) | 4728 |
(0.0695, 0.104 ) | 1044 |
(0.104 , 0.139 ) | 351 |
(0.139 , 0.174 ) | 124 |
(0.174 , 0.208 ) | 56 |
(0.208 , 0.243 ) | 23 |
(0.243 , 0.278 ) | 8 |
(0.278 , 0.313 ) | 7 |
(0.313 , 0.347 ) | 2 |
[I] Relative Difference | Stats: mean=0.079962, std-dev=4.2391, var=17.97, median=0.0040368, min=0 at (0, 0, 6, 319), max=1632.9 at (0, 0, 82, 241), avg-magnitude=0.079962
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 163 ) | 204794 | ########################################
(163 , 327 ) | 3 |
(327 , 490 ) | 0 |
(490 , 653 ) | 1 |
(653 , 816 ) | 1 |
(816 , 980 ) | 0 |
(980 , 1.14e+03) | 0 |
(1.14e+03, 1.31e+03) | 0 |
(1.31e+03, 1.47e+03) | 0 |
(1.47e+03, 1.63e+03) | 1 |
[E] FAILED | Output: 'rotc' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'rots' (dtype=float32, shape=(1, 4, 160, 320)) with 'rots' (dtype=float32, shape=(1, 4, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: rots | Stats: mean=-0.051449, std-dev=0.20648, var=0.042635, median=-0.1175, min=-0.71027 at (0, 0, 159, 0), max=0.77942 at (0, 3, 97, 318), avg-magnitude=0.18564
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-0.71 , -0.561) | 25 |
(-0.561, -0.412) | 440 |
(-0.412, -0.263) | 18955 | ########
(-0.263, -0.114) | 84328 | ########################################
(-0.114, 0.0346) | 43205 | ####################
(0.0346, 0.184 ) | 19755 | #########
(0.184 , 0.333 ) | 27342 | ############
(0.333 , 0.481 ) | 8838 | ####
(0.481 , 0.63 ) | 1611 |
(0.63 , 0.779 ) | 301 |
[I] onnxrt-runner-N0-03/28/23-13:32:44: rots | Stats: mean=-0.050523, std-dev=0.20639, var=0.042598, median=-0.11646, min=-0.71027 at (0, 0, 159, 0), max=0.77942 at (0, 3, 97, 318), avg-magnitude=0.18524
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-0.71 , -0.561) | 27 |
(-0.561, -0.412) | 425 |
(-0.412, -0.263) | 18727 | ########
(-0.263, -0.114) | 84114 | ########################################
(-0.114, 0.0346) | 43274 | ####################
(0.0346, 0.184 ) | 20094 | #########
(0.184 , 0.333 ) | 27318 | ############
(0.333 , 0.481 ) | 8910 | ####
(0.481 , 0.63 ) | 1606 |
(0.63 , 0.779 ) | 305 |
[I] Error Metrics: rots
[I] Minimum Required Tolerance: elemwise error | [abs=0.1114] OR [rel=7694.1] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.0045772, std-dev=0.0075856, var=5.7541e-05, median=0.0019067, min=0 at (0, 0, 3, 85), max=0.1114 at (0, 3, 63, 204), avg-magnitude=0.0045772
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.0111) | 178777 | ########################################
(0.0111, 0.0223) | 18232 | ####
(0.0223, 0.0334) | 5240 | #
(0.0334, 0.0446) | 1662 |
(0.0446, 0.0557) | 539 |
(0.0557, 0.0668) | 193 |
(0.0668, 0.078 ) | 94 |
(0.078 , 0.0891) | 41 |
(0.0891, 0.1 ) | 15 |
(0.1 , 0.111 ) | 7 |
[I] Relative Difference | Stats: mean=0.22255, std-dev=21.065, var=443.73, median=0.0094848, min=0 at (0, 0, 3, 85), max=7694.1 at (0, 0, 44, 80), avg-magnitude=0.22255
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 769 ) | 204794 | ########################################
(769 , 1.54e+03) | 3 |
(1.54e+03, 2.31e+03) | 0 |
(2.31e+03, 3.08e+03) | 1 |
(3.08e+03, 3.85e+03) | 0 |
(3.85e+03, 4.62e+03) | 1 |
(4.62e+03, 5.39e+03) | 0 |
(5.39e+03, 6.16e+03) | 0 |
(6.16e+03, 6.92e+03) | 0 |
(6.92e+03, 7.69e+03) | 1 |
[E] FAILED | Output: 'rots' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'height' (dtype=float32, shape=(1, 4, 160, 320)) with 'height' (dtype=float32, shape=(1, 4, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: height | Stats: mean=2.1949, std-dev=0.5272, var=0.27794, median=2.0496, min=-0.5823 at (0, 3, 64, 319), max=3.5418 at (0, 1, 106, 128), avg-magnitude=2.1949
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-0.582, -0.17) | 9 |
(-0.17 , 0.243) | 41 |
(0.243 , 0.655) | 198 |
(0.655 , 1.07 ) | 514 |
(1.07 , 1.48 ) | 6482 | ##
(1.48 , 1.89 ) | 53925 | ########################
(1.89 , 2.3 ) | 86509 | ########################################
(2.3 , 2.72 ) | 12163 | #####
(2.72 , 3.13 ) | 28930 | #############
(3.13 , 3.54 ) | 16029 | #######
[I] onnxrt-runner-N0-03/28/23-13:32:44: height | Stats: mean=2.1951, std-dev=0.52698, var=0.27771, median=2.05, min=-0.57648 at (0, 3, 64, 319), max=3.5391 at (0, 1, 104, 128), avg-magnitude=2.1952
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-0.582, -0.17) | 9 |
(-0.17 , 0.243) | 42 |
(0.243 , 0.655) | 201 |
(0.655 , 1.07 ) | 509 |
(1.07 , 1.48 ) | 6471 | ##
(1.48 , 1.89 ) | 53791 | ########################
(1.89 , 2.3 ) | 86667 | ########################################
(2.3 , 2.72 ) | 12138 | #####
(2.72 , 3.13 ) | 28989 | #############
(3.13 , 3.54 ) | 15983 | #######
[I] Error Metrics: height
[I] Minimum Required Tolerance: elemwise error | [abs=0.12059] OR [rel=2.005] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.0062766, std-dev=0.011127, var=0.00012381, median=2.3842e-07, min=0 at (0, 0, 0, 23), max=0.12059 at (0, 2, 120, 35), avg-magnitude=0.0062766
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.0121) | 165640 | ########################################
(0.0121, 0.0241) | 23447 | #####
(0.0241, 0.0362) | 9484 | ##
(0.0362, 0.0482) | 3737 |
(0.0482, 0.0603) | 1529 |
(0.0603, 0.0724) | 613 |
(0.0724, 0.0844) | 232 |
(0.0844, 0.0965) | 87 |
(0.0965, 0.109 ) | 22 |
(0.109 , 0.121 ) | 9 |
[I] Relative Difference | Stats: mean=0.0030295, std-dev=0.0074513, var=5.5522e-05, median=1.3162e-07, min=0 at (0, 0, 0, 23), max=2.005 at (0, 3, 81, 319), avg-magnitude=0.0030295
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.2 ) | 204797 | ########################################
(0.2 , 0.401) | 1 |
(0.401, 0.601) | 0 |
(0.601, 0.802) | 0 |
(0.802, 1 ) | 1 |
(1 , 1.2 ) | 0 |
(1.2 , 1.4 ) | 0 |
(1.4 , 1.6 ) | 0 |
(1.6 , 1.8 ) | 0 |
(1.8 , 2 ) | 1 |
[E] FAILED | Output: 'height' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'dim' (dtype=float32, shape=(1, 12, 160, 320)) with 'dim' (dtype=float32, shape=(1, 12, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: dim | Stats: mean=2.3199, std-dev=1.9537, var=3.817, median=1.6584, min=0.52695 at (0, 10, 109, 2), max=13.937 at (0, 3, 158, 1), avg-magnitude=2.3199
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0.527, 1.87) | 401609 | ########################################
(1.87 , 3.21) | 99100 | #########
(3.21 , 4.55) | 62366 | ######
(4.55 , 5.89) | 130 |
(5.89 , 7.23) | 5833 |
(7.23 , 8.57) | 37408 | ###
(8.57 , 9.91) | 7687 |
(9.91 , 11.3) | 224 |
(11.3 , 12.6) | 41 |
(12.6 , 13.9) | 2 |
[I] onnxrt-runner-N0-03/28/23-13:32:44: dim | Stats: mean=2.32, std-dev=1.9537, var=3.817, median=1.6585, min=0.52695 at (0, 10, 109, 2), max=13.937 at (0, 3, 158, 1), avg-magnitude=2.32
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0.527, 1.87) | 401590 | ########################################
(1.87 , 3.21) | 99117 | #########
(3.21 , 4.55) | 62369 | ######
(4.55 , 5.89) | 129 |
(5.89 , 7.23) | 5851 |
(7.23 , 8.57) | 37400 | ###
(8.57 , 9.91) | 7676 |
(9.91 , 11.3) | 224 |
(11.3 , 12.6) | 42 |
(12.6 , 13.9) | 2 |
[I] Error Metrics: dim
[I] Minimum Required Tolerance: elemwise error | [abs=0.48827] OR [rel=0.067446] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.0045923, std-dev=0.015912, var=0.00025319, median=2.3842e-07, min=0 at (0, 0, 0, 0), max=0.48827 at (0, 3, 62, 164), avg-magnitude=0.0045923
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.0488) | 602791 | ########################################
(0.0488, 0.0977) | 7191 |
(0.0977, 0.146 ) | 2762 |
(0.146 , 0.195 ) | 1091 |
(0.195 , 0.244 ) | 393 |
(0.244 , 0.293 ) | 114 |
(0.293 , 0.342 ) | 47 |
(0.342 , 0.391 ) | 9 |
(0.391 , 0.439 ) | 1 |
(0.439 , 0.488 ) | 1 |
[I] Relative Difference | Stats: mean=0.0017682, std-dev=0.0040768, var=1.662e-05, median=9.573e-08, min=0 at (0, 0, 0, 0), max=0.067446 at (0, 9, 140, 285), avg-magnitude=0.0017682
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.00674) | 561101 | ########################################
(0.00674, 0.0135 ) | 35191 | ##
(0.0135 , 0.0202 ) | 13374 |
(0.0202 , 0.027 ) | 2743 |
(0.027 , 0.0337 ) | 1331 |
(0.0337 , 0.0405 ) | 492 |
(0.0405 , 0.0472 ) | 58 |
(0.0472 , 0.054 ) | 99 |
(0.054 , 0.0607 ) | 8 |
(0.0607 , 0.0674 ) | 3 |
[E] FAILED | Output: 'dim' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'reg' (dtype=float32, shape=(1, 8, 160, 320)) with 'reg' (dtype=float32, shape=(1, 8, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: reg | Stats: mean=0.49274, std-dev=0.057861, var=0.0033479, median=0.4946, min=0.088482 at (0, 1, 0, 244), max=1.0677 at (0, 7, 159, 264), avg-magnitude=0.49274
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0.0885, 0.187) | 210 |
(0.187 , 0.285) | 1091 |
(0.285 , 0.384) | 11117 | #
(0.384 , 0.482) | 151945 | ##########################
(0.482 , 0.58 ) | 226910 | ########################################
(0.58 , 0.679) | 16702 | ##
(0.679 , 0.777) | 980 |
(0.777 , 0.876) | 368 |
(0.876 , 0.974) | 218 |
(0.974 , 1.07 ) | 59 |
[I] onnxrt-runner-N0-03/28/23-13:32:44: reg | Stats: mean=0.49275, std-dev=0.057859, var=0.0033477, median=0.49456, min=0.088482 at (0, 1, 0, 244), max=1.0723 at (0, 7, 159, 264), avg-magnitude=0.49275
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0.0885, 0.187) | 209 |
(0.187 , 0.285) | 1087 |
(0.285 , 0.384) | 11103 | #
(0.384 , 0.482) | 151954 | ##########################
(0.482 , 0.58 ) | 226990 | ########################################
(0.58 , 0.679) | 16626 | ##
(0.679 , 0.777) | 986 |
(0.777 , 0.876) | 368 |
(0.876 , 0.974) | 220 |
(0.974 , 1.07 ) | 57 |
[I] Error Metrics: reg
[I] Minimum Required Tolerance: elemwise error | [abs=0.036269] OR [rel=0.09048] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.0020523, std-dev=0.0033401, var=1.1156e-05, median=5.9605e-08, min=0 at (0, 0, 0, 22), max=0.036269 at (0, 0, 124, 151), avg-magnitude=0.0020523
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.00363) | 319152 | ########################################
(0.00363, 0.00725) | 54361 | ######
(0.00725, 0.0109 ) | 23600 | ##
(0.0109 , 0.0145 ) | 8633 | #
(0.0145 , 0.0181 ) | 2807 |
(0.0181 , 0.0218 ) | 775 |
(0.0218 , 0.0254 ) | 206 |
(0.0254 , 0.029 ) | 47 |
(0.029 , 0.0326 ) | 15 |
(0.0326 , 0.0363 ) | 4 |
[I] Relative Difference | Stats: mean=0.0042154, std-dev=0.0069295, var=4.8017e-05, median=1.168e-07, min=0 at (0, 0, 0, 22), max=0.09048 at (0, 7, 0, 12), avg-magnitude=0.0042154
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.00905) | 334913 | ########################################
(0.00905, 0.0181 ) | 51390 | ######
(0.0181 , 0.0271 ) | 17074 | ##
(0.0271 , 0.0362 ) | 4737 |
(0.0362 , 0.0452 ) | 1123 |
(0.0452 , 0.0543 ) | 268 |
(0.0543 , 0.0633 ) | 70 |
(0.0633 , 0.0724 ) | 15 |
(0.0724 , 0.0814 ) | 7 |
(0.0814 , 0.0905 ) | 3 |
[E] FAILED | Output: 'reg' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'heatmap' (dtype=float32, shape=(1, 4, 160, 320)) with 'heatmap' (dtype=float32, shape=(1, 4, 160, 320))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-03/28/23-13:32:44: heatmap | Stats: mean=-6.8228, std-dev=0.96949, var=0.93991, median=-7.1304, min=-9.5214 at (0, 1, 43, 6), max=-3.2045 at (0, 0, 159, 319), avg-magnitude=6.8228
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-9.52, -8.89) | 208 |
(-8.89, -8.26) | 4153 | ##
(-8.26, -7.63) | 33442 | ################
(-7.63, -6.99) | 80859 | ########################################
(-6.99, -6.36) | 31314 | ###############
(-6.36, -5.73) | 4345 | ##
(-5.73, -5.1 ) | 42928 | #####################
(-5.1 , -4.47) | 7232 | ###
(-4.47, -3.84) | 309 |
(-3.84, -3.2 ) | 10 |
[I] onnxrt-runner-N0-03/28/23-13:32:44: heatmap | Stats: mean=-6.8223, std-dev=0.96948, var=0.93989, median=-7.1301, min=-9.5214 at (0, 1, 43, 6), max=-3.2045 at (0, 0, 159, 319), avg-magnitude=6.8223
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(-9.52, -8.89) | 219 |
(-8.89, -8.26) | 4135 | ##
(-8.26, -7.63) | 33401 | ################
(-7.63, -6.99) | 80904 | ########################################
(-6.99, -6.36) | 31309 | ###############
(-6.36, -5.73) | 4355 | ##
(-5.73, -5.1 ) | 42923 | #####################
(-5.1 , -4.47) | 7236 | ###
(-4.47, -3.84) | 308 |
(-3.84, -3.2 ) | 10 |
[I] Error Metrics: heatmap
[I] Minimum Required Tolerance: elemwise error | [abs=0.27722] OR [rel=0.035072] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=0.01111, std-dev=0.019301, var=0.00037253, median=1.9073e-06, min=0 at (0, 0, 0, 25), max=0.27722 at (0, 1, 48, 93), avg-magnitude=0.01111
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.0277) | 174344 | ########################################
(0.0277, 0.0554) | 21820 | #####
(0.0554, 0.0832) | 6306 | #
(0.0832, 0.111 ) | 1674 |
(0.111 , 0.139 ) | 487 |
(0.139 , 0.166 ) | 118 |
(0.166 , 0.194 ) | 38 |
(0.194 , 0.222 ) | 6 |
(0.222 , 0.249 ) | 5 |
(0.249 , 0.277 ) | 2 |
[I] Relative Difference | Stats: mean=0.001635, std-dev=0.002817, var=7.9356e-06, median=2.7015e-07, min=0 at (0, 0, 0, 25), max=0.035072 at (0, 1, 48, 93), avg-magnitude=0.001635
[I] ---- Histogram ----
Bin Range | Num Elems | Visualization
(0 , 0.00351) | 168600 | ########################################
(0.00351, 0.00701) | 23784 | #####
(0.00701, 0.0105 ) | 8446 | ##
(0.0105 , 0.014 ) | 2729 |
(0.014 , 0.0175 ) | 857 |
(0.0175 , 0.021 ) | 277 |
(0.021 , 0.0246 ) | 82 |
(0.0246 , 0.0281 ) | 16 |
(0.0281 , 0.0316 ) | 6 |
(0.0316 , 0.0351 ) | 3 |
[E] FAILED | Output: 'heatmap' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[E] FAILED | Mismatched outputs: ['rotc', 'rots', 'height', 'dim', 'reg', 'heatmap']
[E] Accuracy Summary | trt-runner-N0-03/28/23-13:32:44 vs. onnxrt-runner-N0-03/28/23-13:32:44 | Passed: 0/1 iterations | Pass Rate: 0.0%
[E] FAILED | Runtime: 54.954s | Command: /usr/local/bin/polygraphy run CenterPointRPN.onnx --trt --int8 --onnxrt
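For readers interpreting the log above, here is a minimal sketch of how an elementwise tolerance check of this kind can work. It assumes an `np.isclose`-style criterion (`|a - b| <= atol + rtol * |b|`) and measures relative error against the second runner's output. The helper names and sample values are hypothetical and are not taken from Polygraphy's source.

```python
def min_required_tolerance(out, ref):
    """Return (max_abs_diff, max_rel_diff) over all element pairs."""
    max_abs = 0.0
    max_rel = 0.0
    for a, b in zip(out, ref):
        diff = abs(a - b)
        max_abs = max(max_abs, diff)
        if b != 0.0:
            max_rel = max(max_rel, diff / abs(b))
        elif diff != 0.0:
            # Any nonzero difference against a zero reference is unbounded.
            max_rel = float("inf")
    return max_abs, max_rel


def passes(out, ref, atol, rtol):
    """Assumed criterion: every element satisfies |a - b| <= atol + rtol * |b|."""
    return all(abs(a - b) <= atol + rtol * abs(b) for a, b in zip(out, ref))


# Made-up values standing in for an INT8 TensorRT output and an ONNX Runtime output.
trt_out = [-8.0, -7.0]
ort_out = [-7.75, -7.0]

max_abs, max_rel = min_required_tolerance(trt_out, ort_out)
strict = passes(trt_out, ort_out, atol=1e-5, rtol=1e-5)   # default-like tolerance
loose = passes(trt_out, ort_out, atol=0.3, rtol=0.0)      # tolerance above max_abs
```

Under this assumption, a default tolerance of 1e-05 is far tighter than what INT8 execution can usually meet, so the reported "Minimum Required Tolerance" is the more useful number when judging whether the mismatch is a real accuracy problem.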
However, I can see that the model fails to build with TRT 8.6. I will investigate further.
[03/28/2023-13:38:57] [V] [TRT] ConstWeightsFusion: Fusing pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 with Conv_862 + Relu_863
[03/28/2023-13:38:57] [V] [TRT] Running: ConstWeightsFusion on pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874
[03/28/2023-13:38:57] [V] [TRT] ConstWeightsFusion: Fusing pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 with Conv_876
[03/28/2023-13:38:57] [V] [TRT] After dupe layer removal: 73 layers
[03/28/2023-13:38:57] [V] [TRT] After final dead-layer removal: 73 layers
[03/28/2023-13:38:57] [E] Error[2]: [standardBuilderUtils.cpp::canStride::90] Error Code 2: Internal Error (Assertion !layerImpls.empty() failed. Exp_885 has no RunnerBuilders)
[03/28/2023-13:38:57] [E] Engine could not be created from network
[03/28/2023-13:38:57] [E] Building engine failed
[03/28/2023-13:38:57] [E] Failed to create engine from model or file.
[03/28/2023-13:38:57] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8610] # trtexec --onnx=CenterPointRPN.onnx --int8 --verbose
Update: filed internal bug 4048138 for this.
cc @ttyio
I uploaded some files which may help you to reproduce this issue. Please check. @zerollzeng https://drive.google.com/file/d/1xgDGNzcPLk76uyfI-OViZyxpwdrQhSAT/view?usp=share_link
I found that this problem does not occur in TensorRT 8.5, so I think it may have been fixed in that release. @zerollzeng
Glad to know that :-)
Description
I used the pytorch_quantization toolkit to quantize the CenterPoint RPN part of mmdetection3d in order to get an INT8 model, and I was able to export the PyTorch model to ONNX successfully. The ONNX model also converts to a TensorRT engine successfully, but the engine's output differs from the PyTorch output.
When the ONNX model was converted to a TensorRT engine, the model was like
I found that some layers are merged into a single fused layer that looks like this:
pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482
If these layers are not fused, the output is correct. I verified this by marking some of the layers as network outputs, which prevents the fusion. I wonder why this happens and how I can get the correct output even when these layers are fused.
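For context on the QuantizeLinear nodes that appear in the fused layer names above, the following is a minimal sketch of per-tensor INT8 quantize/dequantize arithmetic as defined by the ONNX QuantizeLinear and DequantizeLinear operators (round half to even, then saturate). The scale value and the inputs are hypothetical, chosen only for illustration; they are not taken from the model in this issue.

```python
def quantize_linear(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Quantize one float to INT8: saturate(round(x / scale) + zero_point)."""
    # Python's round() rounds half to even, matching the ONNX operator definition.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point=0):
    """Map an INT8 value back to float: (q - zero_point) * scale."""
    return (q - zero_point) * scale


scale = 0.05  # hypothetical per-tensor scale

q = quantize_linear(1.23, scale)             # 1.23 / 0.05 is about 24.6, rounds to 25
x_hat = dequantize_linear(q, scale)          # reconstructed value, close to 1.25
round_trip_error = abs(x_hat - 1.23)         # bounded by scale / 2 when not saturated

q_saturated = quantize_linear(10.0, scale)   # 200 exceeds the INT8 range, clamps to 127
```

The round-trip error is at most half the scale for values inside the representable range, which is why a small, bounded difference between the INT8 engine and a float reference is expected, while a large one points to a scale or fusion problem.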
Environment
TensorRT Version: 8.4.1.5
NVIDIA GPU: V100
NVIDIA Driver Version: 470.129.06
CUDA Version: 11.4
CUDNN Version: cuDNN 8.3.2
Operating System: Ubuntu 18.04.6 LTS (GNU/Linux 4.15.0-190-generic x86_64)
Python Version (if applicable): 3.8.13
PyTorch Version (if applicable): 1.12.0
Relevant Files
https://drive.google.com/file/d/1PtibrN7sQPTFBRyaYf0M4kTX8SHtlt8d/view?usp=share_link