NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Inconsistent Output During conversion QAT model to TRT #2823

Closed AdanWang closed 1 year ago

AdanWang commented 1 year ago

Description

I used the pytorch_quantization toolkit to quantize the RPN part of CenterPoint in mmdetection3d in order to get an INT8 model, and I was able to export the PyTorch model to ONNX successfully. The ONNX model also converts to a TensorRT engine successfully, but the engine's output differs from the PyTorch output.

After the ONNX model was converted to a TensorRT engine, the layers of the engine looked like this:

[ { "count" : 237 } , { "name" : "QuantizeLinear_111", "timeMs" : 22.7676, "averageMs" : 0.0960659, "medianMs" : 0.096256, "percentage" : 1.20287 } , { "name" : "pts_backbone.blocks.0.0.weight + QuantizeLinear_116 + Conv_118 + Relu_119", "timeMs" : 67.5881, "averageMs" : 0.285182, "medianMs" : 0.283648, "percentage" : 3.57083 } , { "name" : "pts_backbone.blocks.0.3.weight + QuantizeLinear_129 + Conv_131 + Relu_132", "timeMs" : 66.5252, "averageMs" : 0.280697, "medianMs" : 0.278528, "percentage" : 3.51468 } , { "name" : "pts_backbone.blocks.0.6.weight + QuantizeLinear_142 + Conv_144 + Relu_145", "timeMs" : 66.477, "averageMs" : 0.280494, "medianMs" : 0.278528, "percentage" : 3.51213 } , { "name" : "pts_backbone.blocks.0.9.weight + QuantizeLinear_155 + Conv_157 + Relu_158", "timeMs" : 66.5651, "averageMs" : 0.280865, "medianMs" : 0.278528, "percentage" : 3.51679 } , { "name" : "pts_neck.deblocks.0.0.weight + QuantizeLinear_324 + Conv_326 + Relu_327", "timeMs" : 19.0403, "averageMs" : 0.0803386, "medianMs" : 0.079872, "percentage" : 1.00594 } , { "name" : "pts_backbone.blocks.1.0.weight + QuantizeLinear_168 + Conv_170 + Relu_171", "timeMs" : 35.3137, "averageMs" : 0.149003, "medianMs" : 0.146432, "percentage" : 1.8657 } , { "name" : "pts_backbone.blocks.1.3.weight + QuantizeLinear_181 + Conv_183 + Relu_184", "timeMs" : 64.6328, "averageMs" : 0.272712, "medianMs" : 0.263168, "percentage" : 3.4147 } , { "name" : "pts_backbone.blocks.1.6.weight + QuantizeLinear_194 + Conv_196 + Relu_197", "timeMs" : 64.4413, "averageMs" : 0.271904, "medianMs" : 0.262144, "percentage" : 3.40458 } , { "name" : "pts_backbone.blocks.1.9.weight + QuantizeLinear_207 + Conv_209 + Relu_210", "timeMs" : 64.1341, "averageMs" : 0.270608, "medianMs" : 0.263168, "percentage" : 3.38835 } , { "name" : "pts_backbone.blocks.1.12.weight + QuantizeLinear_220 + Conv_222 + Relu_223", "timeMs" : 64.727, "averageMs" : 0.27311, "medianMs" : 0.263168, "percentage" : 3.41968 } , { "name" : 
"pts_backbone.blocks.1.15.weight + QuantizeLinear_233 + Conv_235 + Relu_236", "timeMs" : 64.471, "averageMs" : 0.27203, "medianMs" : 0.263168, "percentage" : 3.40615 } , { "name" : "pts_neck.deblocks.1.0.weight + QuantizeLinear_337 + Conv_339 + Relu_340", "timeMs" : 12.9987, "averageMs" : 0.0548467, "medianMs" : 0.055296, "percentage" : 0.686749 } , { "name" : "pts_backbone.blocks.2.0.weight + QuantizeLinear_246 + Conv_248 + Relu_249", "timeMs" : 34.9266, "averageMs" : 0.14737, "medianMs" : 0.147456, "percentage" : 1.84525 } , { "name" : "pts_backbone.blocks.2.3.weight + QuantizeLinear_259 + Conv_261 + Relu_262", "timeMs" : 65.9128, "averageMs" : 0.278113, "medianMs" : 0.272384, "percentage" : 3.48232 } , { "name" : "pts_backbone.blocks.2.6.weight + QuantizeLinear_272 + Conv_274 + Relu_275", "timeMs" : 65.4193, "averageMs" : 0.276031, "medianMs" : 0.27136, "percentage" : 3.45625 } , { "name" : "pts_backbone.blocks.2.9.weight + QuantizeLinear_285 + Conv_287 + Relu_288", "timeMs" : 65.2493, "averageMs" : 0.275313, "medianMs" : 0.272384, "percentage" : 3.44727 } , { "name" : "pts_backbone.blocks.2.12.weight + QuantizeLinear_298 + Conv_300 + Relu_301", "timeMs" : 65.4377, "averageMs" : 0.276108, "medianMs" : 0.27136, "percentage" : 3.45722 } , { "name" : "pts_backbone.blocks.2.15.weight + QuantizeLinear_311 + Conv_313 + Relu_314", "timeMs" : 65.5892, "averageMs" : 0.276748, "medianMs" : 0.27136, "percentage" : 3.46523 } , { "name" : "pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352", "timeMs" : 111.414, "averageMs" : 0.470102, "medianMs" : 0.466944, "percentage" : 5.88627 } , { "name" : "BatchNormalization_353 + Relu_354", "timeMs" : 18.1258, "averageMs" : 0.0764802, "medianMs" : 0.0768, "percentage" : 0.957628 } , { "name" : "QuantizeLinear_360_clone_2", "timeMs" : 12.6894, "averageMs" : 0.0535418, "medianMs" : 0.054272, "percentage" : 0.67041 } , { "name" : "pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368", 
"timeMs" : 97.4428, "averageMs" : 0.411151, "medianMs" : 0.398336, "percentage" : 5.14813 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 + Conv_862 + Relu_863 || pts_bbox_head.task_heads.3.rot.0.conv.weight + QuantizeLinear_835 + Conv_837 + Relu_838 || pts_bbox_head.task_heads.3.dim.0.conv.weight + QuantizeLinear_810 + Conv_812 + Relu_813 || pts_bbox_head.task_heads.3.height.0.conv.weight + QuantizeLinear_784 + Conv_786 + Relu_787 || pts_bbox_head.task_heads.3.reg.0.conv.weight + QuantizeLinear_759 + Conv_761 + Relu_762 || pts_bbox_head.task_heads.2.heatmap.0.conv.weight + QuantizeLinear_733 + Conv_735 + Relu_736 || pts_bbox_head.task_heads.2.rot.0.conv.weight + QuantizeLinear_708 + Conv_710 + Relu_711 || pts_bbox_head.task_heads.2.dim.0.conv.weight + QuantizeLinear_683 + Conv_685 + Relu_686", "timeMs" : 128.444, "averageMs" : 0.541959, "medianMs" : 0.546816, "percentage" : 6.78601 } , { "name" : "pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482", "timeMs" : 127.317, "averageMs" : 0.537202, "medianMs" : 0.541696, "percentage" : 6.72645 } , { "name" : "pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || 
pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406 || pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381", "timeMs" : 65.6302, "averageMs" : 0.276921, "medianMs" : 0.275456, "percentage" : 3.46739 } , { "name" : "pts_bbox_head.task_heads.0.reg.1.weight + QuantizeLinear_391 + Conv_393", "timeMs" : 14.6371, "averageMs" : 0.0617598, "medianMs" : 0.06144, "percentage" : 0.77331 } , { "name" : "pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419", "timeMs" : 13.7933, "averageMs" : 0.0581995, "medianMs" : 0.058368, "percentage" : 0.728731 } , { "name" : "pts_bbox_head.task_heads.0.dim.1.weight + QuantizeLinear_442 + Conv_444", "timeMs" : 13.7503, "averageMs" : 0.0580181, "medianMs" : 0.058368, "percentage" : 0.726459 } , { "name" : "PWN(Exp_885)", "timeMs" : 2.48934, "averageMs" : 0.0105036, "medianMs" : 0.01024, "percentage" : 0.131518 } , { "name" : "pts_bbox_head.task_heads.0.rot.1.weight + QuantizeLinear_467 + Conv_469", "timeMs" : 13.7994, "averageMs" : 0.0582254, "medianMs" : 0.058368, "percentage" : 0.729055 } , { "name" : "pts_bbox_head.task_heads.0.heatmap.1.weight + QuantizeLinear_493 + Conv_495", "timeMs" : 13.7411, "averageMs" : 0.0579792, "medianMs" : 0.058368, "percentage" : 0.725972 } , { "name" : "pts_bbox_head.task_heads.1.reg.1.weight + QuantizeLinear_518 + Conv_520", "timeMs" : 13.7155, "averageMs" : 0.0578712, "medianMs" : 0.058368, "percentage" : 0.72462 } , { "name" : "pts_bbox_head.task_heads.1.height.1.weight + QuantizeLinear_544 + Conv_546", "timeMs" : 13.7134, "averageMs" : 0.0578625, "medianMs" : 0.058368, "percentage" : 0.724511 } , { "name" : "pts_bbox_head.task_heads.1.dim.1.weight + QuantizeLinear_569 + Conv_571", "timeMs" : 13.6725, "averageMs" : 0.0576897, "medianMs" : 0.057344, "percentage" : 0.722347 } , { "name" : "PWN(Exp_894)", "timeMs" : 2.52928, "averageMs" : 0.0106721, "medianMs" : 0.01024, "percentage" : 0.133628 } , { "name" : 
"pts_bbox_head.task_heads.1.rot.1.weight + QuantizeLinear_594 + Conv_596", "timeMs" : 13.7114, "averageMs" : 0.0578539, "medianMs" : 0.058368, "percentage" : 0.724403 } , { "name" : "pts_bbox_head.task_heads.1.heatmap.1.weight + QuantizeLinear_620 + Conv_622", "timeMs" : 13.6028, "averageMs" : 0.0573959, "medianMs" : 0.057344, "percentage" : 0.718669 } , { "name" : "pts_bbox_head.task_heads.2.reg.1.weight + QuantizeLinear_645 + Conv_647", "timeMs" : 13.5793, "averageMs" : 0.0572966, "medianMs" : 0.057344, "percentage" : 0.717425 } , { "name" : "pts_bbox_head.task_heads.2.height.1.weight + QuantizeLinear_671 + Conv_673", "timeMs" : 13.5475, "averageMs" : 0.0571626, "medianMs" : 0.057344, "percentage" : 0.715748 } , { "name" : "pts_bbox_head.task_heads.2.dim.1.weight + QuantizeLinear_696 + Conv_698", "timeMs" : 13.5537, "averageMs" : 0.0571886, "medianMs" : 0.057344, "percentage" : 0.716072 } , { "name" : "PWN(Exp_903)", "timeMs" : 2.22003, "averageMs" : 0.00936721, "medianMs" : 0.009216, "percentage" : 0.117289 } , { "name" : "pts_bbox_head.task_heads.2.rot.1.weight + QuantizeLinear_721 + Conv_723", "timeMs" : 13.4554, "averageMs" : 0.0567738, "medianMs" : 0.05632, "percentage" : 0.710879 } , { "name" : "pts_bbox_head.task_heads.2.heatmap.1.weight + QuantizeLinear_747 + Conv_749", "timeMs" : 13.3561, "averageMs" : 0.0563547, "medianMs" : 0.05632, "percentage" : 0.705631 } , { "name" : "pts_bbox_head.task_heads.3.reg.1.weight + QuantizeLinear_772 + Conv_774", "timeMs" : 13.3161, "averageMs" : 0.0561861, "medianMs" : 0.05632, "percentage" : 0.703521 } , { "name" : "pts_bbox_head.task_heads.3.height.1.weight + QuantizeLinear_798 + Conv_800", "timeMs" : 13.2557, "averageMs" : 0.0559312, "medianMs" : 0.05632, "percentage" : 0.700328 } , { "name" : "pts_bbox_head.task_heads.3.dim.1.weight + QuantizeLinear_823 + Conv_825", "timeMs" : 13.1901, "averageMs" : 0.0556546, "medianMs" : 0.055296, "percentage" : 0.696866 } , { "name" : "PWN(Exp_912)", "timeMs" : 2.06745, 
"averageMs" : 0.00872344, "medianMs" : 0.009216, "percentage" : 0.109228 } , { "name" : "pts_bbox_head.task_heads.3.rot.1.weight + QuantizeLinear_848 + Conv_850", "timeMs" : 13.1676, "averageMs" : 0.0555596, "medianMs" : 0.055296, "percentage" : 0.695675 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 + Conv_876", "timeMs" : 13.1205, "averageMs" : 0.0553608, "medianMs" : 0.055296, "percentage" : 0.693186 } , { "name" : "{ForeignNode[onnx::Gather_1010...Concat_913]}", "timeMs" : 8.5166, "averageMs" : 0.035935, "medianMs" : 0.03584, "percentage" : 0.449952 } ]

I found that some layers are merged into a single layer, for example:

pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482

If these layers are not fused, the output is correct. I verified this by marking some tensors as outputs so that the layers cannot be fused. I wonder why this happens and how I can get the correct output even when these layers are fused.
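To see at a glance which entries of such a profile are merged layers, one can split each layer name on `||`. The following is a minimal sketch, assuming the profile has the shape shown above (a leading `{"count": N}` record followed by per-layer records). `PROFILE_JSON` is an abridged excerpt with a shortened branch list, and the helper names `merged_layers` and `conv_nodes` are made up for illustration.

```python
import json

# Abridged excerpt in the same shape as the profile above.
# The branch list of the merged entry is shortened for illustration.
PROFILE_JSON = """
[
  {"count": 237},
  {"name": "QuantizeLinear_111", "timeMs": 22.7676, "averageMs": 0.0960659,
   "medianMs": 0.096256, "percentage": 1.20287},
  {"name": "pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432",
   "timeMs": 65.6302, "averageMs": 0.276921, "medianMs": 0.275456, "percentage": 3.46739}
]
"""


def merged_layers(profile):
    """Return (branches, averageMs) for every entry whose name joins branches with '||'."""
    result = []
    for entry in profile:
        name = entry.get("name")
        if name is None:  # the leading {"count": N} record has no name
            continue
        branches = [branch.strip() for branch in name.split("||")]
        if len(branches) > 1:
            result.append((branches, entry["averageMs"]))
    return result


def conv_nodes(branches):
    """Pick the ONNX Conv node name out of each 'weight + Q + Conv + Relu' branch."""
    nodes = []
    for branch in branches:
        for part in branch.split(" + "):
            if part.startswith("Conv"):
                nodes.append(part)
    return nodes


profile = json.loads(PROFILE_JSON)
merged = merged_layers(profile)
for branches, average_ms in merged:
    print(len(branches), "branches,", average_ms, "ms on average:", conv_nodes(branches))
```

Running this on the full profile lists every merged group together with the Conv nodes it contains, which are the candidates to inspect when one output is wrong.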

Environment

TensorRT Version: 8.4.1.5
NVIDIA GPU: V100
NVIDIA Driver Version: 470.129.06
CUDA Version: 11.4
CUDNN Version: cuDNN 8.3.2
Operating System: Ubuntu 18.04.6 LTS (GNU/Linux 4.15.0-190-generic x86_64)
Python Version (if applicable): 3.8.13
PyTorch Version (if applicable): 1.12.0

Relevant Files

https://drive.google.com/file/d/1PtibrN7sQPTFBRyaYf0M4kTX8SHtlt8d/view?usp=share_link

AdanWang commented 1 year ago

[ { "count" : 1329 } , { "name" : "QuantizeLinear_111", "timeMs" : 115.151, "averageMs" : 0.0866452, "medianMs" : 0.08704, "percentage" : 1.09893 } , { "name" : "pts_backbone.blocks.0.0.weight + QuantizeLinear_116 + Conv_118 + Relu_119", "timeMs" : 364.868, "averageMs" : 0.274543, "medianMs" : 0.270336, "percentage" : 3.48206 } , { "name" : "pts_backbone.blocks.0.3.weight + QuantizeLinear_129 + Conv_131 + Relu_132", "timeMs" : 364.141, "averageMs" : 0.273996, "medianMs" : 0.269312, "percentage" : 3.47513 } , { "name" : "pts_backbone.blocks.0.6.weight + QuantizeLinear_142 + Conv_144 + Relu_145", "timeMs" : 364.155, "averageMs" : 0.274007, "medianMs" : 0.270336, "percentage" : 3.47526 } , { "name" : "pts_backbone.blocks.0.9.weight + QuantizeLinear_155 + Conv_157 + Relu_158", "timeMs" : 363.725, "averageMs" : 0.273683, "medianMs" : 0.269312, "percentage" : 3.47116 } , { "name" : "pts_neck.deblocks.0.0.weight + QuantizeLinear_324 + Conv_326 + Relu_327", "timeMs" : 102.013, "averageMs" : 0.0767592, "medianMs" : 0.075776, "percentage" : 0.973546 } , { "name" : "pts_backbone.blocks.1.0.weight + QuantizeLinear_168 + Conv_170 + Relu_171", "timeMs" : 195.417, "averageMs" : 0.147041, "medianMs" : 0.142336, "percentage" : 1.86493 } , { "name" : "pts_backbone.blocks.1.3.weight + QuantizeLinear_181 + Conv_183 + Relu_184", "timeMs" : 355.65, "averageMs" : 0.267607, "medianMs" : 0.26624, "percentage" : 3.39409 } , { "name" : "pts_backbone.blocks.1.6.weight + QuantizeLinear_194 + Conv_196 + Relu_197", "timeMs" : 354.579, "averageMs" : 0.266801, "medianMs" : 0.265216, "percentage" : 3.38387 } , { "name" : "pts_backbone.blocks.1.9.weight + QuantizeLinear_207 + Conv_209 + Relu_210", "timeMs" : 354.532, "averageMs" : 0.266766, "medianMs" : 0.265216, "percentage" : 3.38342 } , { "name" : "pts_backbone.blocks.1.12.weight + QuantizeLinear_220 + Conv_222 + Relu_223", "timeMs" : 354.55, "averageMs" : 0.26678, "medianMs" : 0.265216, "percentage" : 3.3836 } , { "name" : 
"pts_backbone.blocks.1.15.weight + QuantizeLinear_233 + Conv_235 + Relu_236", "timeMs" : 355.573, "averageMs" : 0.267549, "medianMs" : 0.26624, "percentage" : 3.39336 } , { "name" : "pts_neck.deblocks.1.0.weight + QuantizeLinear_337 + Conv_339 + Relu_340", "timeMs" : 66.1585, "averageMs" : 0.0497806, "medianMs" : 0.049152, "percentage" : 0.631374 } , { "name" : "pts_backbone.blocks.2.0.weight + QuantizeLinear_246 + Conv_248 + Relu_249", "timeMs" : 186.386, "averageMs" : 0.140245, "medianMs" : 0.13824, "percentage" : 1.77874 } , { "name" : "pts_backbone.blocks.2.3.weight + QuantizeLinear_259 + Conv_261 + Relu_262", "timeMs" : 362.567, "averageMs" : 0.272812, "medianMs" : 0.26624, "percentage" : 3.46011 } , { "name" : "pts_backbone.blocks.2.6.weight + QuantizeLinear_272 + Conv_274 + Relu_275", "timeMs" : 366.116, "averageMs" : 0.275482, "medianMs" : 0.267264, "percentage" : 3.49398 } , { "name" : "pts_backbone.blocks.2.9.weight + QuantizeLinear_285 + Conv_287 + Relu_288", "timeMs" : 362.146, "averageMs" : 0.272495, "medianMs" : 0.26624, "percentage" : 3.45609 } , { "name" : "pts_backbone.blocks.2.12.weight + QuantizeLinear_298 + Conv_300 + Relu_301", "timeMs" : 364.159, "averageMs" : 0.274009, "medianMs" : 0.26624, "percentage" : 3.4753 } , { "name" : "pts_backbone.blocks.2.15.weight + QuantizeLinear_311 + Conv_313 + Relu_314", "timeMs" : 361.934, "averageMs" : 0.272336, "medianMs" : 0.26624, "percentage" : 3.45407 } , { "name" : "pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352", "timeMs" : 624.911, "averageMs" : 0.470211, "medianMs" : 0.468992, "percentage" : 5.96375 } , { "name" : "BatchNormalization_353 + Relu_354", "timeMs" : 94.4735, "averageMs" : 0.0710862, "medianMs" : 0.070656, "percentage" : 0.901595 } , { "name" : "QuantizeLinear_360_clone_2", "timeMs" : 64.5658, "averageMs" : 0.0485823, "medianMs" : 0.048128, "percentage" : 0.616175 } , { "name" : "pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368", 
"timeMs" : 540.033, "averageMs" : 0.406345, "medianMs" : 0.39424, "percentage" : 5.15373 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 + Conv_862 + Relu_863 || pts_bbox_head.task_heads.3.rot.0.conv.weight + QuantizeLinear_835 + Conv_837 + Relu_838 || pts_bbox_head.task_heads.3.dim.0.conv.weight + QuantizeLinear_810 + Conv_812 + Relu_813 || pts_bbox_head.task_heads.3.height.0.conv.weight + QuantizeLinear_784 + Conv_786 + Relu_787 || pts_bbox_head.task_heads.3.reg.0.conv.weight + QuantizeLinear_759 + Conv_761 + Relu_762 || pts_bbox_head.task_heads.2.heatmap.0.conv.weight + QuantizeLinear_733 + Conv_735 + Relu_736 || pts_bbox_head.task_heads.2.rot.0.conv.weight + QuantizeLinear_708 + Conv_710 + Relu_711 || pts_bbox_head.task_heads.2.dim.0.conv.weight + QuantizeLinear_683 + Conv_685 + Relu_686", "timeMs" : 718.96, "averageMs" : 0.540978, "medianMs" : 0.543744, "percentage" : 6.8613 } , { "name" : "pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482", "timeMs" : 712.105, "averageMs" : 0.53582, "medianMs" : 0.538624, "percentage" : 6.79587 } , { "name" : "pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || 
pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381", "timeMs" : 276.612, "averageMs" : 0.208135, "medianMs" : 0.205824, "percentage" : 2.6398 } , { "name" : "pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406", "timeMs" : 101.168, "averageMs" : 0.0761233, "medianMs" : 0.075776, "percentage" : 0.965482 } , { "name" : "Reformatting CopyNode for Output Tensor 0 to pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406", "timeMs" : 55.6095, "averageMs" : 0.0418431, "medianMs" : 0.041984, "percentage" : 0.530701 } , { "name" : "pts_bbox_head.task_heads.0.reg.1.weight + QuantizeLinear_391 + Conv_393", "timeMs" : 78.5317, "averageMs" : 0.0590908, "medianMs" : 0.059392, "percentage" : 0.749456 } , { "name" : "Reformatting CopyNode for Input Tensor 0 to pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419", "timeMs" : 35.9069, "averageMs" : 0.027018, "medianMs" : 0.026624, "percentage" : 0.342672 } , { "name" : "pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419", "timeMs" : 72.8603, "averageMs" : 0.0548234, "medianMs" : 0.054272, "percentage" : 0.695332 } , { "name" : "pts_bbox_head.task_heads.0.dim.1.weight + QuantizeLinear_442 + Conv_444", "timeMs" : 73.5907, "averageMs" : 0.055373, "medianMs" : 0.055296, "percentage" : 0.702302 } , { "name" : "PWN(Exp_885)", "timeMs" : 8.46949, "averageMs" : 0.00637283, "medianMs" : 0.006144, "percentage" : 0.0808274 } , { "name" : "pts_bbox_head.task_heads.0.rot.1.weight + QuantizeLinear_467 + Conv_469", "timeMs" : 75.0195, "averageMs" : 0.0564481, "medianMs" : 0.05632, "percentage" : 0.715938 } , { "name" : "pts_bbox_head.task_heads.0.heatmap.1.weight + QuantizeLinear_493 + Conv_495", "timeMs" : 72.8726, "averageMs" : 0.0548326, "medianMs" : 0.054272, "percentage" : 0.695449 } , { "name" : "pts_bbox_head.task_heads.1.reg.1.weight + QuantizeLinear_518 + Conv_520", "timeMs" : 
73.4473, "averageMs" : 0.0552651, "medianMs" : 0.055296, "percentage" : 0.700934 } , { "name" : "pts_bbox_head.task_heads.1.height.1.weight + QuantizeLinear_544 + Conv_546", "timeMs" : 72.9791, "averageMs" : 0.0549128, "medianMs" : 0.054272, "percentage" : 0.696466 } , { "name" : "pts_bbox_head.task_heads.1.dim.1.weight + QuantizeLinear_569 + Conv_571", "timeMs" : 74.3974, "averageMs" : 0.05598, "medianMs" : 0.055296, "percentage" : 0.710001 } , { "name" : "PWN(Exp_894)", "timeMs" : 8.59645, "averageMs" : 0.00646836, "medianMs" : 0.006144, "percentage" : 0.082039 } , { "name" : "pts_bbox_head.task_heads.1.rot.1.weight + QuantizeLinear_594 + Conv_596", "timeMs" : 73.2762, "averageMs" : 0.0551364, "medianMs" : 0.055296, "percentage" : 0.699301 } , { "name" : "pts_bbox_head.task_heads.1.heatmap.1.weight + QuantizeLinear_620 + Conv_622", "timeMs" : 73.5139, "averageMs" : 0.0553152, "medianMs" : 0.055296, "percentage" : 0.70157 } , { "name" : "pts_bbox_head.task_heads.2.reg.1.weight + QuantizeLinear_645 + Conv_647", "timeMs" : 73.9685, "averageMs" : 0.0556573, "medianMs" : 0.055296, "percentage" : 0.705908 } , { "name" : "pts_bbox_head.task_heads.2.height.1.weight + QuantizeLinear_671 + Conv_673", "timeMs" : 73.0457, "averageMs" : 0.0549629, "medianMs" : 0.054272, "percentage" : 0.697101 } , { "name" : "pts_bbox_head.task_heads.2.dim.1.weight + QuantizeLinear_696 + Conv_698", "timeMs" : 73.2783, "averageMs" : 0.055138, "medianMs" : 0.055296, "percentage" : 0.699322 } , { "name" : "PWN(Exp_903)", "timeMs" : 8.60567, "averageMs" : 0.00647529, "medianMs" : 0.006144, "percentage" : 0.082127 } , { "name" : "pts_bbox_head.task_heads.2.rot.1.weight + QuantizeLinear_721 + Conv_723", "timeMs" : 73.6224, "averageMs" : 0.0553968, "medianMs" : 0.055296, "percentage" : 0.702605 } , { "name" : "pts_bbox_head.task_heads.2.heatmap.1.weight + QuantizeLinear_747 + Conv_749", "timeMs" : 72.9269, "averageMs" : 0.0548735, "medianMs" : 0.054272, "percentage" : 0.695968 } , { "name" : 
"pts_bbox_head.task_heads.3.reg.1.weight + QuantizeLinear_772 + Conv_774", "timeMs" : 73.4781, "averageMs" : 0.0552882, "medianMs" : 0.055296, "percentage" : 0.701227 } , { "name" : "pts_bbox_head.task_heads.3.height.1.weight + QuantizeLinear_798 + Conv_800", "timeMs" : 73.4033, "averageMs" : 0.055232, "medianMs" : 0.055296, "percentage" : 0.700514 } , { "name" : "pts_bbox_head.task_heads.3.dim.1.weight + QuantizeLinear_823 + Conv_825", "timeMs" : 74.5949, "averageMs" : 0.0561286, "medianMs" : 0.05632, "percentage" : 0.711886 } , { "name" : "PWN(Exp_912)", "timeMs" : 8.59747, "averageMs" : 0.00646913, "medianMs" : 0.006144, "percentage" : 0.0820488 } , { "name" : "pts_bbox_head.task_heads.3.rot.1.weight + QuantizeLinear_848 + Conv_850", "timeMs" : 72.0511, "averageMs" : 0.0542145, "medianMs" : 0.054272, "percentage" : 0.687609 } , { "name" : "pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 + Conv_876", "timeMs" : 74.1088, "averageMs" : 0.0557628, "medianMs" : 0.055296, "percentage" : 0.707247 } , { "name" : "{ForeignNode[onnx::Gather_1010...Concat_913]}", "timeMs" : 31.0906, "averageMs" : 0.023394, "medianMs" : 0.023552, "percentage" : 0.296708 } ]

{"Layers": ["QuantizeLinear_111" ,"pts_backbone.blocks.0.0.weight + QuantizeLinear_116 + Conv_118 + Relu_119" ,"pts_backbone.blocks.0.3.weight + QuantizeLinear_129 + Conv_131 + Relu_132" ,"pts_backbone.blocks.0.6.weight + QuantizeLinear_142 + Conv_144 + Relu_145" ,"pts_backbone.blocks.0.9.weight + QuantizeLinear_155 + Conv_157 + Relu_158" ,"pts_neck.deblocks.0.0.weight + QuantizeLinear_324 + Conv_326 + Relu_327" ,"pts_backbone.blocks.1.0.weight + QuantizeLinear_168 + Conv_170 + Relu_171" ,"pts_backbone.blocks.1.3.weight + QuantizeLinear_181 + Conv_183 + Relu_184" ,"pts_backbone.blocks.1.6.weight + QuantizeLinear_194 + Conv_196 + Relu_197" ,"pts_backbone.blocks.1.9.weight + QuantizeLinear_207 + Conv_209 + Relu_210" ,"pts_backbone.blocks.1.12.weight + QuantizeLinear_220 + Conv_222 + Relu_223" ,"pts_backbone.blocks.1.15.weight + QuantizeLinear_233 + Conv_235 + Relu_236" ,"pts_neck.deblocks.1.0.weight + QuantizeLinear_337 + Conv_339 + Relu_340" ,"pts_backbone.blocks.2.0.weight + QuantizeLinear_246 + Conv_248 + Relu_249" ,"pts_backbone.blocks.2.3.weight + QuantizeLinear_259 + Conv_261 + Relu_262" ,"pts_backbone.blocks.2.6.weight + QuantizeLinear_272 + Conv_274 + Relu_275" ,"pts_backbone.blocks.2.9.weight + QuantizeLinear_285 + Conv_287 + Relu_288" ,"pts_backbone.blocks.2.12.weight + QuantizeLinear_298 + Conv_300 + Relu_301" ,"pts_backbone.blocks.2.15.weight + QuantizeLinear_311 + Conv_313 + Relu_314" ,"pts_neck.deblocks.2.0.weight + QuantizeLinear_350 + ConvTranspose_352" ,"BatchNormalization_353 + Relu_354" ,"QuantizeLinear_360_clone_2" ,"pts_bbox_head.shared_conv.conv.weight + QuantizeLinear_365 + Conv_367 + Relu_368" ,"pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 + Conv_862 + Relu_863 || pts_bbox_head.task_heads.3.rot.0.conv.weight + QuantizeLinear_835 + Conv_837 + Relu_838 || pts_bbox_head.task_heads.3.dim.0.conv.weight + QuantizeLinear_810 + Conv_812 + Relu_813 || pts_bbox_head.task_heads.3.height.0.conv.weight + QuantizeLinear_784 + 
Conv_786 + Relu_787 || pts_bbox_head.task_heads.3.reg.0.conv.weight + QuantizeLinear_759 + Conv_761 + Relu_762 || pts_bbox_head.task_heads.2.heatmap.0.conv.weight + QuantizeLinear_733 + Conv_735 + Relu_736 || pts_bbox_head.task_heads.2.rot.0.conv.weight + QuantizeLinear_708 + Conv_710 + Relu_711 || pts_bbox_head.task_heads.2.dim.0.conv.weight + QuantizeLinear_683 + Conv_685 + Relu_686" ,"pts_bbox_head.task_heads.2.height.0.conv.weight + QuantizeLinear_657 + Conv_659 + Relu_660 || pts_bbox_head.task_heads.2.reg.0.conv.weight + QuantizeLinear_632 + Conv_634 + Relu_635 || pts_bbox_head.task_heads.1.heatmap.0.conv.weight + QuantizeLinear_606 + Conv_608 + Relu_609 || pts_bbox_head.task_heads.1.rot.0.conv.weight + QuantizeLinear_581 + Conv_583 + Relu_584 || pts_bbox_head.task_heads.1.dim.0.conv.weight + QuantizeLinear_556 + Conv_558 + Relu_559 || pts_bbox_head.task_heads.1.height.0.conv.weight + QuantizeLinear_530 + Conv_532 + Relu_533 || pts_bbox_head.task_heads.1.reg.0.conv.weight + QuantizeLinear_505 + Conv_507 + Relu_508 || pts_bbox_head.task_heads.0.heatmap.0.conv.weight + QuantizeLinear_479 + Conv_481 + Relu_482" ,"pts_bbox_head.task_heads.0.rot.0.conv.weight + QuantizeLinear_454 + Conv_456 + Relu_457 || pts_bbox_head.task_heads.0.dim.0.conv.weight + QuantizeLinear_429 + Conv_431 + Relu_432 || pts_bbox_head.task_heads.0.reg.0.conv.weight + QuantizeLinear_378 + Conv_380 + Relu_381" ,"pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406" ,"Reformatting CopyNode for Output Tensor 0 to pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406" ,"pts_bbox_head.task_heads.0.reg.1.weight + QuantizeLinear_391 + Conv_393" ,"Reformatting CopyNode for Input Tensor 0 to pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419" ,"pts_bbox_head.task_heads.0.height.1.weight + QuantizeLinear_417 + Conv_419" ,"pts_bbox_head.task_heads.0.dim.1.weight + QuantizeLinear_442 + Conv_444" 
,"PWN(Exp_885)" ,"pts_bbox_head.task_heads.0.rot.1.weight + QuantizeLinear_467 + Conv_469" ,"pts_bbox_head.task_heads.0.heatmap.1.weight + QuantizeLinear_493 + Conv_495" ,"pts_bbox_head.task_heads.1.reg.1.weight + QuantizeLinear_518 + Conv_520" ,"pts_bbox_head.task_heads.1.height.1.weight + QuantizeLinear_544 + Conv_546" ,"pts_bbox_head.task_heads.1.dim.1.weight + QuantizeLinear_569 + Conv_571" ,"PWN(Exp_894)" ,"pts_bbox_head.task_heads.1.rot.1.weight + QuantizeLinear_594 + Conv_596" ,"pts_bbox_head.task_heads.1.heatmap.1.weight + QuantizeLinear_620 + Conv_622" ,"pts_bbox_head.task_heads.2.reg.1.weight + QuantizeLinear_645 + Conv_647" ,"pts_bbox_head.task_heads.2.height.1.weight + QuantizeLinear_671 + Conv_673" ,"pts_bbox_head.task_heads.2.dim.1.weight + QuantizeLinear_696 + Conv_698" ,"PWN(Exp_903)" ,"pts_bbox_head.task_heads.2.rot.1.weight + QuantizeLinear_721 + Conv_723" ,"pts_bbox_head.task_heads.2.heatmap.1.weight + QuantizeLinear_747 + Conv_749" ,"pts_bbox_head.task_heads.3.reg.1.weight + QuantizeLinear_772 + Conv_774" ,"pts_bbox_head.task_heads.3.height.1.weight + QuantizeLinear_798 + Conv_800" ,"pts_bbox_head.task_heads.3.dim.1.weight + QuantizeLinear_823 + Conv_825" ,"PWN(Exp_912)" ,"pts_bbox_head.task_heads.3.rot.1.weight + QuantizeLinear_848 + Conv_850" ,"pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 + Conv_876" ,"{ForeignNode[onnx::Gather_1010...Concat_913]}" ], "Bindings": ["inp" ,"onnx::DequantizeLinear_544" ,"rotc" ,"rots" ,"height" ,"dim" ,"reg" ,"heatmap" ]}

That is the engine after my modification. With this change, the output height[0, 0, :, :] becomes correct.
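A quick way to see what the modification changed is to diff the layer lists of the two engines. The sketch below assumes each list is the `"Layers"` array of a layer dump like the one above. The names in `BEFORE` and `AFTER` are shortened stand-ins for the real layer names (for example `heads.0.height.0` for `pts_bbox_head.task_heads.0.height.0.conv.weight + QuantizeLinear_403 + Conv_405 + Relu_406`), and `layer_diff` is a hypothetical helper.

```python
# Shortened stand-ins for the layer names in the two dumps above.
BEFORE = [
    "shared_conv",
    "heads.0.rot.0 || heads.0.dim.0 || heads.0.height.0 || heads.0.reg.0",
    "heads.0.reg.1",
]
AFTER = [
    "shared_conv",
    "heads.0.rot.0 || heads.0.dim.0 || heads.0.reg.0",
    "heads.0.height.0",
    "Reformatting CopyNode for Output Tensor 0 to heads.0.height.0",
    "heads.0.reg.1",
]


def layer_diff(before, after):
    """Return (removed, added) layer names, keeping the original engine order."""
    before_set, after_set = set(before), set(after)
    removed = [name for name in before if name not in after_set]
    added = [name for name in after if name not in before_set]
    return removed, added


removed, added = layer_diff(BEFORE, AFTER)
print("only in the original engine:")
for name in removed:
    print("  ", name)
print("only in the modified engine:")
for name in added:
    print("  ", name)
```

On the stand-in data this shows the four-branch merged layer being replaced by a three-branch merged layer, a standalone `height.0` layer, and a reformatting copy node, which mirrors the difference between the two dumps in this thread.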

zerollzeng commented 1 year ago

I don't quite understand the issue here. Could you provide detailed steps to reproduce it? Thanks.

FYI: I checked it with TRT 8.5.1.7 (docker 22.12).

[I] Accuracy Comparison | trt-runner-N0-03/28/23-13:32:44 vs. onnxrt-runner-N0-03/28/23-13:32:44
[I]     Comparing Output: 'rotc' (dtype=float32, shape=(1, 4, 160, 320)) with 'rotc' (dtype=float32, shape=(1, 4, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: rotc | Stats: mean=-0.014114, std-dev=0.53309, var=0.28419, median=-0.13501, min=-1.1774 at (0, 3, 80, 309), max=1.16 at (0, 2, 5, 247), avg-magnitude=0.43398
[I]             ---- Histogram ----
                Bin Range            |  Num Elems | Visualization
                (-1.18   , -0.944  ) |       1781 | #
                (-0.944  , -0.71   ) |      17061 | ##########
                (-0.71   , -0.476  ) |      20764 | ############
                (-0.476  , -0.242  ) |      27990 | #################
                (-0.242  , -0.00872) |      64901 | ########################################
                (-0.00872, 0.225   ) |      16286 | ##########
                (0.225   , 0.459   ) |       7238 | ####
                (0.459   , 0.693   ) |       9445 | #####
                (0.693   , 0.926   ) |      30803 | ##################
                (0.926   , 1.16    ) |       8531 | #####
[I]         onnxrt-runner-N0-03/28/23-13:32:44: rotc | Stats: mean=-0.014614, std-dev=0.53311, var=0.28421, median=-0.1359, min=-1.1762 at (0, 3, 80, 309), max=1.16 at (0, 2, 5, 247), avg-magnitude=0.43415
[I]             ---- Histogram ----
                Bin Range            |  Num Elems | Visualization
                (-1.18   , -0.944  ) |       1779 | #
                (-0.944  , -0.71   ) |      17057 | ##########
                (-0.71   , -0.476  ) |      20819 | ############
                (-0.476  , -0.242  ) |      28031 | #################
                (-0.242  , -0.00872) |      64988 | ########################################
                (-0.00872, 0.225   ) |      16179 | #########
                (0.225   , 0.459   ) |       7184 | ####
                (0.459   , 0.693   ) |       9418 | #####
                (0.693   , 0.926   ) |      30816 | ##################
                (0.926   , 1.16    ) |       8529 | #####
[I]         Error Metrics: rotc
[I]             Minimum Required Tolerance: elemwise error | [abs=0.34744] OR [rel=1632.9] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.0055609, std-dev=0.013131, var=0.00017243, median=0.0012105, min=0 at (0, 0, 6, 319), max=0.34744 at (0, 3, 144, 281), avg-magnitude=0.0055609
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0347) |     198457 | ########################################
                    (0.0347, 0.0695) |       4728 |
                    (0.0695, 0.104 ) |       1044 |
                    (0.104 , 0.139 ) |        351 |
                    (0.139 , 0.174 ) |        124 |
                    (0.174 , 0.208 ) |         56 |
                    (0.208 , 0.243 ) |         23 |
                    (0.243 , 0.278 ) |          8 |
                    (0.278 , 0.313 ) |          7 |
                    (0.313 , 0.347 ) |          2 |
[I]             Relative Difference | Stats: mean=0.079962, std-dev=4.2391, var=17.97, median=0.0040368, min=0 at (0, 0, 6, 319), max=1632.9 at (0, 0, 82, 241), avg-magnitude=0.079962
[I]                 ---- Histogram ----
                    Bin Range            |  Num Elems | Visualization
                    (0       , 163     ) |     204794 | ########################################
                    (163     , 327     ) |          3 |
                    (327     , 490     ) |          0 |
                    (490     , 653     ) |          1 |
                    (653     , 816     ) |          1 |
                    (816     , 980     ) |          0 |
                    (980     , 1.14e+03) |          0 |
                    (1.14e+03, 1.31e+03) |          0 |
                    (1.31e+03, 1.47e+03) |          0 |
                    (1.47e+03, 1.63e+03) |          1 |
[E]         FAILED | Output: 'rotc' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I]     Comparing Output: 'rots' (dtype=float32, shape=(1, 4, 160, 320)) with 'rots' (dtype=float32, shape=(1, 4, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: rots | Stats: mean=-0.051449, std-dev=0.20648, var=0.042635, median=-0.1175, min=-0.71027 at (0, 0, 159, 0), max=0.77942 at (0, 3, 97, 318), avg-magnitude=0.18564
[I]             ---- Histogram ----
                Bin Range        |  Num Elems | Visualization
                (-0.71 , -0.561) |         25 |
                (-0.561, -0.412) |        440 |
                (-0.412, -0.263) |      18955 | ########
                (-0.263, -0.114) |      84328 | ########################################
                (-0.114, 0.0346) |      43205 | ####################
                (0.0346, 0.184 ) |      19755 | #########
                (0.184 , 0.333 ) |      27342 | ############
                (0.333 , 0.481 ) |       8838 | ####
                (0.481 , 0.63  ) |       1611 |
                (0.63  , 0.779 ) |        301 |
[I]         onnxrt-runner-N0-03/28/23-13:32:44: rots | Stats: mean=-0.050523, std-dev=0.20639, var=0.042598, median=-0.11646, min=-0.71027 at (0, 0, 159, 0), max=0.77942 at (0, 3, 97, 318), avg-magnitude=0.18524
[I]             ---- Histogram ----
                Bin Range        |  Num Elems | Visualization
                (-0.71 , -0.561) |         27 |
                (-0.561, -0.412) |        425 |
                (-0.412, -0.263) |      18727 | ########
                (-0.263, -0.114) |      84114 | ########################################
                (-0.114, 0.0346) |      43274 | ####################
                (0.0346, 0.184 ) |      20094 | #########
                (0.184 , 0.333 ) |      27318 | ############
                (0.333 , 0.481 ) |       8910 | ####
                (0.481 , 0.63  ) |       1606 |
                (0.63  , 0.779 ) |        305 |
[I]         Error Metrics: rots
[I]             Minimum Required Tolerance: elemwise error | [abs=0.1114] OR [rel=7694.1] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.0045772, std-dev=0.0075856, var=5.7541e-05, median=0.0019067, min=0 at (0, 0, 3, 85), max=0.1114 at (0, 3, 63, 204), avg-magnitude=0.0045772
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0111) |     178777 | ########################################
                    (0.0111, 0.0223) |      18232 | ####
                    (0.0223, 0.0334) |       5240 | #
                    (0.0334, 0.0446) |       1662 |
                    (0.0446, 0.0557) |        539 |
                    (0.0557, 0.0668) |        193 |
                    (0.0668, 0.078 ) |         94 |
                    (0.078 , 0.0891) |         41 |
                    (0.0891, 0.1   ) |         15 |
                    (0.1   , 0.111 ) |          7 |
[I]             Relative Difference | Stats: mean=0.22255, std-dev=21.065, var=443.73, median=0.0094848, min=0 at (0, 0, 3, 85), max=7694.1 at (0, 0, 44, 80), avg-magnitude=0.22255
[I]                 ---- Histogram ----
                    Bin Range            |  Num Elems | Visualization
                    (0       , 769     ) |     204794 | ########################################
                    (769     , 1.54e+03) |          3 |
                    (1.54e+03, 2.31e+03) |          0 |
                    (2.31e+03, 3.08e+03) |          1 |
                    (3.08e+03, 3.85e+03) |          0 |
                    (3.85e+03, 4.62e+03) |          1 |
                    (4.62e+03, 5.39e+03) |          0 |
                    (5.39e+03, 6.16e+03) |          0 |
                    (6.16e+03, 6.92e+03) |          0 |
                    (6.92e+03, 7.69e+03) |          1 |
[E]         FAILED | Output: 'rots' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I]     Comparing Output: 'height' (dtype=float32, shape=(1, 4, 160, 320)) with 'height' (dtype=float32, shape=(1, 4, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: height | Stats: mean=2.1949, std-dev=0.5272, var=0.27794, median=2.0496, min=-0.5823 at (0, 3, 64, 319), max=3.5418 at (0, 1, 106, 128), avg-magnitude=2.1949
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (-0.582, -0.17) |          9 |
                (-0.17 , 0.243) |         41 |
                (0.243 , 0.655) |        198 |
                (0.655 , 1.07 ) |        514 |
                (1.07  , 1.48 ) |       6482 | ##
                (1.48  , 1.89 ) |      53925 | ########################
                (1.89  , 2.3  ) |      86509 | ########################################
                (2.3   , 2.72 ) |      12163 | #####
                (2.72  , 3.13 ) |      28930 | #############
                (3.13  , 3.54 ) |      16029 | #######
[I]         onnxrt-runner-N0-03/28/23-13:32:44: height | Stats: mean=2.1951, std-dev=0.52698, var=0.27771, median=2.05, min=-0.57648 at (0, 3, 64, 319), max=3.5391 at (0, 1, 104, 128), avg-magnitude=2.1952
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (-0.582, -0.17) |          9 |
                (-0.17 , 0.243) |         42 |
                (0.243 , 0.655) |        201 |
                (0.655 , 1.07 ) |        509 |
                (1.07  , 1.48 ) |       6471 | ##
                (1.48  , 1.89 ) |      53791 | ########################
                (1.89  , 2.3  ) |      86667 | ########################################
                (2.3   , 2.72 ) |      12138 | #####
                (2.72  , 3.13 ) |      28989 | #############
                (3.13  , 3.54 ) |      15983 | #######
[I]         Error Metrics: height
[I]             Minimum Required Tolerance: elemwise error | [abs=0.12059] OR [rel=2.005] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.0062766, std-dev=0.011127, var=0.00012381, median=2.3842e-07, min=0 at (0, 0, 0, 23), max=0.12059 at (0, 2, 120, 35), avg-magnitude=0.0062766
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0121) |     165640 | ########################################
                    (0.0121, 0.0241) |      23447 | #####
                    (0.0241, 0.0362) |       9484 | ##
                    (0.0362, 0.0482) |       3737 |
                    (0.0482, 0.0603) |       1529 |
                    (0.0603, 0.0724) |        613 |
                    (0.0724, 0.0844) |        232 |
                    (0.0844, 0.0965) |         87 |
                    (0.0965, 0.109 ) |         22 |
                    (0.109 , 0.121 ) |          9 |
[I]             Relative Difference | Stats: mean=0.0030295, std-dev=0.0074513, var=5.5522e-05, median=1.3162e-07, min=0 at (0, 0, 0, 23), max=2.005 at (0, 3, 81, 319), avg-magnitude=0.0030295
[I]                 ---- Histogram ----
                    Bin Range      |  Num Elems | Visualization
                    (0    , 0.2  ) |     204797 | ########################################
                    (0.2  , 0.401) |          1 |
                    (0.401, 0.601) |          0 |
                    (0.601, 0.802) |          0 |
                    (0.802, 1    ) |          1 |
                    (1    , 1.2  ) |          0 |
                    (1.2  , 1.4  ) |          0 |
                    (1.4  , 1.6  ) |          0 |
                    (1.6  , 1.8  ) |          0 |
                    (1.8  , 2    ) |          1 |
[E]         FAILED | Output: 'height' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I]     Comparing Output: 'dim' (dtype=float32, shape=(1, 12, 160, 320)) with 'dim' (dtype=float32, shape=(1, 12, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: dim | Stats: mean=2.3199, std-dev=1.9537, var=3.817, median=1.6584, min=0.52695 at (0, 10, 109, 2), max=13.937 at (0, 3, 158, 1), avg-magnitude=2.3199
[I]             ---- Histogram ----
                Bin Range     |  Num Elems | Visualization
                (0.527, 1.87) |     401609 | ########################################
                (1.87 , 3.21) |      99100 | #########
                (3.21 , 4.55) |      62366 | ######
                (4.55 , 5.89) |        130 |
                (5.89 , 7.23) |       5833 |
                (7.23 , 8.57) |      37408 | ###
                (8.57 , 9.91) |       7687 |
                (9.91 , 11.3) |        224 |
                (11.3 , 12.6) |         41 |
                (12.6 , 13.9) |          2 |
[I]         onnxrt-runner-N0-03/28/23-13:32:44: dim | Stats: mean=2.32, std-dev=1.9537, var=3.817, median=1.6585, min=0.52695 at (0, 10, 109, 2), max=13.937 at (0, 3, 158, 1), avg-magnitude=2.32
[I]             ---- Histogram ----
                Bin Range     |  Num Elems | Visualization
                (0.527, 1.87) |     401590 | ########################################
                (1.87 , 3.21) |      99117 | #########
                (3.21 , 4.55) |      62369 | ######
                (4.55 , 5.89) |        129 |
                (5.89 , 7.23) |       5851 |
                (7.23 , 8.57) |      37400 | ###
                (8.57 , 9.91) |       7676 |
                (9.91 , 11.3) |        224 |
                (11.3 , 12.6) |         42 |
                (12.6 , 13.9) |          2 |
[I]         Error Metrics: dim
[I]             Minimum Required Tolerance: elemwise error | [abs=0.48827] OR [rel=0.067446] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.0045923, std-dev=0.015912, var=0.00025319, median=2.3842e-07, min=0 at (0, 0, 0, 0), max=0.48827 at (0, 3, 62, 164), avg-magnitude=0.0045923
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0488) |     602791 | ########################################
                    (0.0488, 0.0977) |       7191 |
                    (0.0977, 0.146 ) |       2762 |
                    (0.146 , 0.195 ) |       1091 |
                    (0.195 , 0.244 ) |        393 |
                    (0.244 , 0.293 ) |        114 |
                    (0.293 , 0.342 ) |         47 |
                    (0.342 , 0.391 ) |          9 |
                    (0.391 , 0.439 ) |          1 |
                    (0.439 , 0.488 ) |          1 |
[I]             Relative Difference | Stats: mean=0.0017682, std-dev=0.0040768, var=1.662e-05, median=9.573e-08, min=0 at (0, 0, 0, 0), max=0.067446 at (0, 9, 140, 285), avg-magnitude=0.0017682
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00674) |     561101 | ########################################
                    (0.00674, 0.0135 ) |      35191 | ##
                    (0.0135 , 0.0202 ) |      13374 |
                    (0.0202 , 0.027  ) |       2743 |
                    (0.027  , 0.0337 ) |       1331 |
                    (0.0337 , 0.0405 ) |        492 |
                    (0.0405 , 0.0472 ) |         58 |
                    (0.0472 , 0.054  ) |         99 |
                    (0.054  , 0.0607 ) |          8 |
                    (0.0607 , 0.0674 ) |          3 |
[E]         FAILED | Output: 'dim' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I]     Comparing Output: 'reg' (dtype=float32, shape=(1, 8, 160, 320)) with 'reg' (dtype=float32, shape=(1, 8, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: reg | Stats: mean=0.49274, std-dev=0.057861, var=0.0033479, median=0.4946, min=0.088482 at (0, 1, 0, 244), max=1.0677 at (0, 7, 159, 264), avg-magnitude=0.49274
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (0.0885, 0.187) |        210 |
                (0.187 , 0.285) |       1091 |
                (0.285 , 0.384) |      11117 | #
                (0.384 , 0.482) |     151945 | ##########################
                (0.482 , 0.58 ) |     226910 | ########################################
                (0.58  , 0.679) |      16702 | ##
                (0.679 , 0.777) |        980 |
                (0.777 , 0.876) |        368 |
                (0.876 , 0.974) |        218 |
                (0.974 , 1.07 ) |         59 |
[I]         onnxrt-runner-N0-03/28/23-13:32:44: reg | Stats: mean=0.49275, std-dev=0.057859, var=0.0033477, median=0.49456, min=0.088482 at (0, 1, 0, 244), max=1.0723 at (0, 7, 159, 264), avg-magnitude=0.49275
[I]             ---- Histogram ----
                Bin Range       |  Num Elems | Visualization
                (0.0885, 0.187) |        209 |
                (0.187 , 0.285) |       1087 |
                (0.285 , 0.384) |      11103 | #
                (0.384 , 0.482) |     151954 | ##########################
                (0.482 , 0.58 ) |     226990 | ########################################
                (0.58  , 0.679) |      16626 | ##
                (0.679 , 0.777) |        986 |
                (0.777 , 0.876) |        368 |
                (0.876 , 0.974) |        220 |
                (0.974 , 1.07 ) |         57 |
[I]         Error Metrics: reg
[I]             Minimum Required Tolerance: elemwise error | [abs=0.036269] OR [rel=0.09048] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.0020523, std-dev=0.0033401, var=1.1156e-05, median=5.9605e-08, min=0 at (0, 0, 0, 22), max=0.036269 at (0, 0, 124, 151), avg-magnitude=0.0020523
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00363) |     319152 | ########################################
                    (0.00363, 0.00725) |      54361 | ######
                    (0.00725, 0.0109 ) |      23600 | ##
                    (0.0109 , 0.0145 ) |       8633 | #
                    (0.0145 , 0.0181 ) |       2807 |
                    (0.0181 , 0.0218 ) |        775 |
                    (0.0218 , 0.0254 ) |        206 |
                    (0.0254 , 0.029  ) |         47 |
                    (0.029  , 0.0326 ) |         15 |
                    (0.0326 , 0.0363 ) |          4 |
[I]             Relative Difference | Stats: mean=0.0042154, std-dev=0.0069295, var=4.8017e-05, median=1.168e-07, min=0 at (0, 0, 0, 22), max=0.09048 at (0, 7, 0, 12), avg-magnitude=0.0042154
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00905) |     334913 | ########################################
                    (0.00905, 0.0181 ) |      51390 | ######
                    (0.0181 , 0.0271 ) |      17074 | ##
                    (0.0271 , 0.0362 ) |       4737 |
                    (0.0362 , 0.0452 ) |       1123 |
                    (0.0452 , 0.0543 ) |        268 |
                    (0.0543 , 0.0633 ) |         70 |
                    (0.0633 , 0.0724 ) |         15 |
                    (0.0724 , 0.0814 ) |          7 |
                    (0.0814 , 0.0905 ) |          3 |
[E]         FAILED | Output: 'reg' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[I]     Comparing Output: 'heatmap' (dtype=float32, shape=(1, 4, 160, 320)) with 'heatmap' (dtype=float32, shape=(1, 4, 160, 320))
[I]         Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I]         trt-runner-N0-03/28/23-13:32:44: heatmap | Stats: mean=-6.8228, std-dev=0.96949, var=0.93991, median=-7.1304, min=-9.5214 at (0, 1, 43, 6), max=-3.2045 at (0, 0, 159, 319), avg-magnitude=6.8228
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-9.52, -8.89) |        208 |
                (-8.89, -8.26) |       4153 | ##
                (-8.26, -7.63) |      33442 | ################
                (-7.63, -6.99) |      80859 | ########################################
                (-6.99, -6.36) |      31314 | ###############
                (-6.36, -5.73) |       4345 | ##
                (-5.73, -5.1 ) |      42928 | #####################
                (-5.1 , -4.47) |       7232 | ###
                (-4.47, -3.84) |        309 |
                (-3.84, -3.2 ) |         10 |
[I]         onnxrt-runner-N0-03/28/23-13:32:44: heatmap | Stats: mean=-6.8223, std-dev=0.96948, var=0.93989, median=-7.1301, min=-9.5214 at (0, 1, 43, 6), max=-3.2045 at (0, 0, 159, 319), avg-magnitude=6.8223
[I]             ---- Histogram ----
                Bin Range      |  Num Elems | Visualization
                (-9.52, -8.89) |        219 |
                (-8.89, -8.26) |       4135 | ##
                (-8.26, -7.63) |      33401 | ################
                (-7.63, -6.99) |      80904 | ########################################
                (-6.99, -6.36) |      31309 | ###############
                (-6.36, -5.73) |       4355 | ##
                (-5.73, -5.1 ) |      42923 | #####################
                (-5.1 , -4.47) |       7236 | ###
                (-4.47, -3.84) |        308 |
                (-3.84, -3.2 ) |         10 |
[I]         Error Metrics: heatmap
[I]             Minimum Required Tolerance: elemwise error | [abs=0.27722] OR [rel=0.035072] (requirements may be lower if both abs/rel tolerances are set)
[I]             Absolute Difference | Stats: mean=0.01111, std-dev=0.019301, var=0.00037253, median=1.9073e-06, min=0 at (0, 0, 0, 25), max=0.27722 at (0, 1, 48, 93), avg-magnitude=0.01111
[I]                 ---- Histogram ----
                    Bin Range        |  Num Elems | Visualization
                    (0     , 0.0277) |     174344 | ########################################
                    (0.0277, 0.0554) |      21820 | #####
                    (0.0554, 0.0832) |       6306 | #
                    (0.0832, 0.111 ) |       1674 |
                    (0.111 , 0.139 ) |        487 |
                    (0.139 , 0.166 ) |        118 |
                    (0.166 , 0.194 ) |         38 |
                    (0.194 , 0.222 ) |          6 |
                    (0.222 , 0.249 ) |          5 |
                    (0.249 , 0.277 ) |          2 |
[I]             Relative Difference | Stats: mean=0.001635, std-dev=0.002817, var=7.9356e-06, median=2.7015e-07, min=0 at (0, 0, 0, 25), max=0.035072 at (0, 1, 48, 93), avg-magnitude=0.001635
[I]                 ---- Histogram ----
                    Bin Range          |  Num Elems | Visualization
                    (0      , 0.00351) |     168600 | ########################################
                    (0.00351, 0.00701) |      23784 | #####
                    (0.00701, 0.0105 ) |       8446 | ##
                    (0.0105 , 0.014  ) |       2729 |
                    (0.014  , 0.0175 ) |        857 |
                    (0.0175 , 0.021  ) |        277 |
                    (0.021  , 0.0246 ) |         82 |
                    (0.0246 , 0.0281 ) |         16 |
                    (0.0281 , 0.0316 ) |          6 |
                    (0.0316 , 0.0351 ) |          3 |
[E]         FAILED | Output: 'heatmap' | Difference exceeds tolerance (rel=1e-05, abs=1e-05)
[E]     FAILED | Mismatched outputs: ['rotc', 'rots', 'height', 'dim', 'reg', 'heatmap']
[E] Accuracy Summary | trt-runner-N0-03/28/23-13:32:44 vs. onnxrt-runner-N0-03/28/23-13:32:44 | Passed: 0/1 iterations | Pass Rate: 0.0%
[E] FAILED | Runtime: 54.954s | Command: /usr/local/bin/polygraphy run CenterPointRPN.onnx --trt --int8 --onnxrt
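
For readers of the log above: the largest relative errors (e.g. rel=1632.9 for 'rotc') come with small absolute errors, which is what happens when the reference value is close to zero. Below is a minimal pure-Python sketch of that effect. It assumes the relative difference is |test - reference| / |reference| and that an element passes if it meets either tolerance, which is how the "abs OR rel" wording in the log reads. The function names and sample values are hypothetical and are not Polygraphy's API.

```python
def elementwise_errors(test, reference):
    """Return (max_abs, max_rel) over paired values, with rel = |t - r| / |r|."""
    max_abs = 0.0
    max_rel = 0.0
    for t, r in zip(test, reference):
        abs_diff = abs(t - r)
        max_abs = max(max_abs, abs_diff)
        if r != 0.0:  # skip exact zeros to avoid division by zero
            max_rel = max(max_rel, abs_diff / abs(r))
    return max_abs, max_rel


def passes(test, reference, atol, rtol):
    """Assumed rule: an element passes if it meets the abs OR the rel tolerance."""
    for t, r in zip(test, reference):
        abs_diff = abs(t - r)
        rel_ok = r != 0.0 and abs_diff / abs(r) <= rtol
        if abs_diff > atol and not rel_ok:
            return False
    return True


# Hypothetical values: the second reference element is almost zero,
# so an absolute error of about 0.01 becomes a relative error of about 100.
reference = [0.5, 0.0001, -1.0]
test = [0.51, 0.0101, -0.99]

max_abs, max_rel = elementwise_errors(test, reference)
print("max abs:", max_abs, "max rel:", max_rel)
print("passes at 1e-5:", passes(test, reference, 1e-5, 1e-5))
print("passes at atol=0.02:", passes(test, reference, 0.02, 1e-5))
```

With INT8 quantization, absolute errors of the size shown in the log are far above the default 1e-05 tolerance. A comparison like this is usually more informative with a looser absolute tolerance (Polygraphy exposes `--atol` and `--rtol` for this) than with the relative maximum alone.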
zerollzeng commented 1 year ago

But I can see the model fail on TRT 8.6. I will take a closer look.

[03/28/2023-13:38:57] [V] [TRT] ConstWeightsFusion: Fusing pts_bbox_head.task_heads.3.heatmap.0.conv.weight + QuantizeLinear_860 with Conv_862 + Relu_863
[03/28/2023-13:38:57] [V] [TRT] Running: ConstWeightsFusion on pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874
[03/28/2023-13:38:57] [V] [TRT] ConstWeightsFusion: Fusing pts_bbox_head.task_heads.3.heatmap.1.weight + QuantizeLinear_874 with Conv_876
[03/28/2023-13:38:57] [V] [TRT] After dupe layer removal: 73 layers
[03/28/2023-13:38:57] [V] [TRT] After final dead-layer removal: 73 layers
[03/28/2023-13:38:57] [E] Error[2]: [standardBuilderUtils.cpp::canStride::90] Error Code 2: Internal Error (Assertion !layerImpls.empty() failed. Exp_885 has no RunnerBuilders)
[03/28/2023-13:38:57] [E] Engine could not be created from network
[03/28/2023-13:38:57] [E] Building engine failed
[03/28/2023-13:38:57] [E] Failed to create engine from model or file.
[03/28/2023-13:38:57] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8610] # trtexec --onnx=CenterPointRPN.onnx --int8 --verbose

Update: filed internal bug 4048138 for this.

zerollzeng commented 1 year ago

cc @ttyio

AdanWang commented 1 year ago

I uploaded some files which may help you to reproduce this issue. Please check. @zerollzeng https://drive.google.com/file/d/1xgDGNzcPLk76uyfI-OViZyxpwdrQhSAT/view?usp=share_link

AdanWang commented 1 year ago

I found that this problem doesn't happen in TensorRT 8.5. I think something might have been fixed in TRT 8.5. @zerollzeng

zerollzeng commented 1 year ago

Glad to know that :-)