quant_pre_process failed on NonMaxSuppression

Describe the issue

We are trying to quantize our proprietary RetinaNet-based model using TensorRT's model optimization library. It raised the following warning: "Please consider to run pre-processing before quantization." Hoping for a performance improvement, I tried running:

quant_pre_process(onnx_orig_path, onnx_preprocessed_path, verbose=True)

but faced the following error:

unsupported broadcast between NonMaxSuppression_1929_o0__d0 300
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/quantization/shape_inference.py", line 81, in quant_pre_process
    model = SymbolicShapeInference.infer_shapes(
  File "/usr/local/lib/python3.10/dist-packages/onnxruntime/tools/symbolic_shape_infer.py", line 2912, in infer_shapes
    raise Exception("Incomplete symbolic shape inference")
Exception: Incomplete symbolic shape inference
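
For context, the "unsupported broadcast" message appears when a symbolic dimension (here the NMS output count `NonMaxSuppression_1929_o0__d0`) meets a different concrete dimension (300) and the two cannot be reconciled. The following is a simplified toy model of that decision, not ONNX Runtime's actual code; the function name `broadcast_dim` and the merge rule (prefer the concrete integer) are assumptions for illustration. `quant_pre_process` does accept an `auto_merge` argument, but whether it resolves this particular model is untested.

```python
def broadcast_dim(dim1, dim2, auto_merge=False):
    """Toy broadcast of two dims, each either an int or a symbolic name (str).

    Returns the resulting dim, or None when the pair cannot be reconciled
    (analogous to "unsupported broadcast between <dim1> <dim2>").
    """
    if dim1 == dim2:
        return dim1
    if dim1 == 1:
        return dim2
    if dim2 == 1:
        return dim1
    symbolic = [d for d in (dim1, dim2) if isinstance(d, str)]
    if auto_merge and symbolic:
        # Assumption: when merging, a concrete integer wins over a symbol.
        concrete = [d for d in (dim1, dim2) if isinstance(d, int)]
        return concrete[0] if concrete else dim1
    return None


nms_dim = "NonMaxSuppression_1929_o0__d0"
without_merge = broadcast_dim(nms_dim, 300)
with_merge = broadcast_dim(nms_dim, 300, auto_merge=True)
```

Without merging, the pair is rejected; with merging, the symbol is treated as equal to 300, which is only valid if the graph really guarantees that count.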

Here is our ONNX implementation for NMS:

    @staticmethod
    def symbolic(g, boxes, scores, iou_threshold, max_count):
        assert isinstance(iou_threshold, float), "You have to pass iou_threshold as float type"
        assert isinstance(max_count, int), "You have to pass max_count as integer type"

        boxes = unsqueeze(g, boxes, 0)
        scores = unsqueeze(g, unsqueeze(g, scores, 0), 0)
        # this value is subtracted from zero to filter out zero values from padding
        epsilon_nms = 1e-5
        score_threshold = g.op('Constant', value_t=torch.tensor([0.0 - epsilon_nms], dtype=torch.float))

        iou_threshold = g.op("Constant", value_t=torch.tensor(iou_threshold))
        max_count = g.op('Constant', value_t=torch.tensor(max_count))

        nms_out = g.op('NonMaxSuppression', boxes, scores, max_count, iou_threshold, score_threshold)
        return squeeze(g, select(g, nms_out, 1, g.op('Constant', value_t=torch.tensor([2], dtype=torch.long))), 1)
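
To make the output shape concrete: per the ONNX operator definition, NonMaxSuppression returns rows of (batch_index, class_index, box_index), and the `select`/`squeeze` above keeps only column 2. The number of rows depends on the data, which is presumably why shape inference assigns it a symbolic dimension. Below is a simplified pure-Python sketch for a single batch and a single class; the function name `nms_box_indices` and the assumption that boxes are `[y1, x1, y2, x2]` with `y1 <= y2` and `x1 <= x2` are illustrative, and this is not the ONNX Runtime kernel.

```python
def nms_box_indices(boxes, scores, iou_threshold, max_count, score_threshold=-1e-5):
    """Greedy NMS for one batch and one class; returns the kept box indices."""

    def iou(a, b):
        inter_h = min(a[2], b[2]) - max(a[0], b[0])
        inter_w = min(a[3], b[3]) - max(a[1], b[1])
        if inter_h <= 0 or inter_w <= 0:
            return 0.0
        inter = inter_h * inter_w
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    # Keep boxes above the score threshold, highest score first.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    rows = []  # (batch_index, class_index, box_index), like the ONNX output
    for i in candidates:
        if len(rows) == max_count:
            break
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for _, _, j in rows):
            rows.append((0, 0, i))

    # Equivalent of selecting column 2 and squeezing it.
    return [box_index for _, _, box_index in rows]


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
kept = nms_box_indices(boxes, scores, iou_threshold=0.5, max_count=300)
```

Here the second box overlaps the first with IoU of about 0.68 and is suppressed, so only two of the three indices survive; a different input would yield a different length.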

Next, I tried running it without symbolic shape inference:

quant_pre_process(onnx_orig_path, onnx_preprocessed_path, verbose=True, skip_symbolic_shape=True)

and it passed, allowing me to quantize the model (using 'quantize_static'):

INFO:root:Model /data2/projects/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced/acp4/qaic/fp16/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced_preprocessed.onnx with opset_version 17 is loaded.
INFO:root:Model is cloned to /data2/projects/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced/acp4/qaic/fp16/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced_preprocessed_named.onnx after naming the nodes.
INFO:root:Quantization Mode: int8
INFO:root:Quantizable op types in the model: ['Conv', 'Gemm', 'Clip', 'MaxPool', 'Mul', 'MatMul', 'Add']
INFO:root:Building non-residual Add input map ...
INFO:root:Searching for hard-coded patterns like MHA, LayerNorm, etc. to avoid quantization.
INFO:root:Building KGEN/CASK targeted partitions ...
INFO:root:CASK fusible partitions: [['/backbone/backbone/encoder/conv1/Conv', '/backbone/backbone/encoder/relu/Relu'], ['/backbone/backbone/encoder/layer1/layer1.0/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.0/relu1/Relu'], ['/backbone/backbone/encoder/layer1/layer1.0/conv2/Conv', '/backbone/backbone/encoder/layer1/layer1.0/add/Add'], ['/backbone/backbone/encoder/layer1/layer1.1/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.1/relu1/Relu'], ['/backbone/backbone/encoder/layer1/layer1.1/conv2/Conv', '/backbone/backbone/encoder/layer1/layer1.1/add/Add'], ['/backbone/backbone/encoder/layer2/layer2.0/downsample/downsample.0/Conv'], ['/backbone/backbone/encoder/layer2/layer2.0/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.0/relu1/Relu'], ['/backbone/backbone/encoder/layer2/layer2.0/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.0/add/Add'], ['/backbone/backbone/encoder/layer2/layer2.1/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.1/relu1/Relu'], ['/backbone/backbone/encoder/layer2/layer2.1/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.1/add/Add'], ['/backbone/backbone/encoder/layer3/layer3.0/downsample/downsample.0/Conv'], ['/backbone/backbone/encoder/layer3/layer3.0/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.0/relu1/Relu'], ['/backbone/backbone/encoder/layer3/layer3.0/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.0/add/Add'], ['/backbone/backbone/encoder/layer3/layer3.1/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.1/relu1/Relu'], ['/backbone/backbone/encoder/layer3/layer3.1/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.1/add/Add'], ['/backbone/backbone/encoder/layer4/layer4.0/downsample/downsample.0/Conv'], ['/backbone/backbone/encoder/layer4/layer4.0/conv1/Conv', '/backbone/backbone/encoder/layer4/layer4.0/relu1/Relu'], ['/backbone/backbone/lateral4/Conv', '/backbone/backbone/add_5_4/Add'], ['/backbone/backbone/encoder/layer4/layer4.0/conv2/Conv', 
'/backbone/backbone/encoder/layer4/layer4.0/add/Add'], ['/backbone/backbone/encoder/layer4/layer4.1/conv1/Conv', '/backbone/backbone/encoder/layer4/layer4.1/relu1/Relu'], ['/backbone/backbone/encoder/layer4/layer4.1/conv2/Conv', '/backbone/backbone/encoder/layer4/layer4.1/add/Add'], ['/backbone/backbone/pyramid6/Conv', '/backbone/backbone/relu/Relu'], ['/backbone/backbone/lateral5/Conv'], ['/head_2d/cls_head_list.2/cls_head_list.2.0/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.1/Relu'], ['/backbone/backbone/smooth5/Conv'], ['/head_2d/box_head_list.2/box_head_list.2.0/Conv', '/head_2d/box_head_list.2/box_head_list.2.1/Relu'], ['/head_3d/roi_align_camera.0/extract_rois_params/conv_new.2/conv/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.2/act/Relu'], ['/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.0/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.1/Relu'], ['/backbone/backbone/pyramid7/Conv'], ['/head_2d/cls_head_list.1/cls_head_list.1.0/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.1/Relu'], ['/head_2d/box_head_list.1/box_head_list.1.0/Conv', '/head_2d/box_head_list.1/box_head_list.1.1/Relu'], ['/head_3d/roi_align_camera.0/extract_rois_params/conv_new.1/conv/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.1/act/Relu'], ['/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.0/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.1/Relu'], ['/head_2d/cls_head_list.3/cls_head_list.3.0/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.1/Relu'], ['/head_2d/cls_head_list.2/cls_head_list.2.2/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.3/Relu'], ['/backbone/backbone/smooth4/Conv'], ['/head_2d/box_head_list.3/box_head_list.3.0/Conv', '/head_2d/box_head_list.3/box_head_list.3.1/Relu'], ['/head_2d/box_head_list.2/box_head_list.2.2/Conv', '/head_2d/box_head_list.2/box_head_list.2.3/Relu'], 
['/head_3d/roi_align_camera.0/extract_rois_params/conv_new.3/conv/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.3/act/Relu'], ['/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.0/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.1/Relu'], ['/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.2/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.3/Relu'], ['/head_2d/cls_head_list.1/cls_head_list.1.2/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.3/Relu'], ['/head_2d/cls_head/cls_head.0/Conv', '/head_2d/cls_head/cls_head.1/Relu'], ['/head_2d/box_head_list.1/box_head_list.1.2/Conv', '/head_2d/box_head_list.1/box_head_list.1.3/Relu'], ['/head_2d/box_head/box_head.0/Conv', '/head_2d/box_head/box_head.1/Relu'], ['/head_3d/roi_align_camera.0/extract_rois_params/conv_new.0/conv/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.0/act/Relu'], ['/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.2/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.3/Relu'], ['/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.0/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.1/Relu'], ['/head_2d/cls_head_list.3/cls_head_list.3.2/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.3/Relu'], ['/head_2d/cls_head_list.2/cls_head_list.2.4/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.5/Relu'], ['/head_2d/box_head_list.3/box_head_list.3.2/Conv', '/head_2d/box_head_list.3/box_head_list.3.3/Relu'], ['/head_2d/box_head_list.2/box_head_list.2.4/Conv', '/head_2d/box_head_list.2/box_head_list.2.5/Relu'], ['/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.2/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.3/Relu'], 
['/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.4/Conv'], ['/head_2d/cls_head_list.1/cls_head_list.1.4/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.5/Relu'], ['/head_2d/cls_head/cls_head.2/Conv', '/head_2d/cls_head/cls_head.3/Relu'], ['/head_2d/box_head_list.1/box_head_list.1.4/Conv', '/head_2d/box_head_list.1/box_head_list.1.5/Relu'], ['/head_2d/box_head/box_head.2/Conv', '/head_2d/box_head/box_head.3/Relu'], ['/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.4/Conv'], ['/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.2/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.3/Relu'], ['/head_2d/cls_head_list.3/cls_head_list.3.4/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.5/Relu'], ['/head_2d/cls_head_list.2/cls_head_list.2.6/Conv'], ['/head_2d/box_head_list.3/box_head_list.3.4/Conv', '/head_2d/box_head_list.3/box_head_list.3.5/Relu'], ['/head_2d/box_head_list.2/box_head_list.2.6/Conv'], ['/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.4/Conv'], ['/head_2d/cls_head_list.1/cls_head_list.1.6/Conv'], ['/head_2d/cls_head/cls_head.4/Conv', '/head_2d/cls_head/cls_head.5/Relu'], ['/head_2d/box_head_list.1/box_head_list.1.6/Conv'], ['/head_2d/box_head/box_head.4/Conv', '/head_2d/box_head/box_head.5/Relu'], ['/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.4/Conv'], ['/head_2d/cls_head_list.3/cls_head_list.3.6/Conv'], ['/head_2d/box_head_list.3/box_head_list.3.6/Conv'], ['/head_2d/cls_head/cls_head.6/Conv'], ['/head_2d/box_head/box_head.6/Conv'], ['/interpret_2d/MatMul_2'], ['/interpret_2d/MatMul_1'], ['/interpret_2d/MatMul_3'], ['/interpret_2d/MatMul'], ['/head_3d/heads.1/layers/layers.0/fc/Gemm'], ['/head_3d/heads.0/layers/layers.0/fc/Gemm'], ['/head_3d/heads.1/layers/layers.1/fc/Gemm'], ['/head_3d/heads.0/layers/layers.1/fc/Gemm'], ['/head_3d/heads.1/layers/layers.2/fc/Gemm', '/head_3d/heads.1/layers/layers.3/Add'], 
['/head_3d/heads.0/layers/layers.2/fc/Gemm']]
INFO:root:KGEN partitions: [['/pp_on_model_module/add/Add'], ['/interpret_2d/Cast_15'], ['/interpret_2d/Mul_56'], ['/interpret_2d/Mul_54'], ['/interpret_2d/Mul_59'], ['/interpret_2d/Mul_57'], ['/interpret_2d/Cast_13'], ['/interpret_2d/Add_28'], ['/interpret_2d/Mul_34'], ['/interpret_2d/Mul_32'], ['/interpret_2d/Add_30'], ['/interpret_2d/Mul_37'], ['/interpret_2d/Mul_35'], ['/interpret_2d/Cast_17'], ['/interpret_2d/Mul_78'], ['/interpret_2d/Mul_76'], ['/interpret_2d/Add_29'], ['/interpret_2d/Cast_16', '/interpret_2d/Mul_60'], ['/interpret_2d/Add_14'], ['/interpret_2d/Mul_81'], ['/interpret_2d/Mul_79'], ['/interpret_2d/Add_31'], ['/interpret_2d/Add_16'], ['/interpret_2d/Cast_5'], ['/interpret_2d/Add_42'], ['/interpret_2d/Add_15'], ['/interpret_2d/Cast_14', '/interpret_2d/Mul_38'], ['/interpret_2d/Mul_10'], ['/interpret_2d/Mul_8'], ['/interpret_2d/Mul_28'], ['/interpret_2d/Mul_26'], ['/interpret_2d/Add_44'], ['/interpret_2d/Add_17'], ['/interpret_2d/Mul_13'], ['/interpret_2d/Mul_11'], ['/interpret_2d/Add_43'], ['/interpret_2d/Cast_18', '/interpret_2d/Mul_82'], ['/interpret_2d/Add_32'], ['/interpret_2d/Add'], ['/interpret_2d/Add_12'], ['/interpret_2d/Add_45'], ['/interpret_2d/Add_2'], ['/interpret_2d/Sigmoid'], ['/interpret_2d/Add_18'], ['/interpret_2d/Add_1'], ['/interpret_2d/Cast_6', '/interpret_2d/Mul_14'], ['/interpret_2d/Add_13'], ['/interpret_2d/Add_3'], ['/interpret_2d/Add_46'], ['/interpret_2d/Exp_2'], ['/interpret_2d/Exp_1'], ['/interpret_2d/Add_4'], ['/interpret_2d/Exp_3'], ['/interpret_2d/Exp'], ['/interpret_2d/Sub_6', '/interpret_2d/Add_35', '/interpret_2d/Mul_65'], ['/interpret_2d/nms/strategy/Cast'], ['/interpret_2d/Sub_4', '/interpret_2d/Add_21', '/interpret_2d/Mul_43'], ['/interpret_2d/Sub_8', '/interpret_2d/Add_49', '/interpret_2d/Mul_87'], ['/interpret_2d/Mul_67', '/interpret_2d/Mul_68', '/interpret_2d/Add_38'], ['/interpret_2d/Mul_66'], ['/interpret_2d/nms/strategy/GreaterOrEqual'], ['/interpret_2d/Add_36'], ['/interpret_2d/Mul_45', 
'/interpret_2d/Mul_46', '/interpret_2d/Add_24'], ['/interpret_2d/Mul_44'], ['/interpret_2d/Sub_2', '/interpret_2d/Add_7', '/interpret_2d/Mul_21'], ['/interpret_2d/Mul_89', '/interpret_2d/Mul_90', '/interpret_2d/Add_52'], ['/interpret_2d/Mul_88'], ['/interpret_2d/Add_37'], ['/interpret_2d/Add_22'], ['/interpret_2d/Add_50'], ['/interpret_2d/Add_39', '/interpret_2d/Min_5', '/interpret_2d/Max_5'], ['/interpret_2d/Sub_7', '/interpret_2d/Min_4', '/interpret_2d/Max_4'], ['/interpret_2d/Add_23'], ['/interpret_2d/Mul_23', '/interpret_2d/Mul_24', '/interpret_2d/Add_10'], ['/interpret_2d/Mul_22'], ['/interpret_2d/Mul_69'], ['/interpret_2d/Add_51'], ['/interpret_2d/Add_25', '/interpret_2d/Min_3', '/interpret_2d/Max_3'], ['/interpret_2d/Sub_5', '/interpret_2d/Min_2', '/interpret_2d/Max_2'], ['/interpret_2d/Add_8'], ['/interpret_2d/Mul_47'], ['/interpret_2d/Add_53', '/interpret_2d/Min_7', '/interpret_2d/Max_7'], ['/interpret_2d/Sub_9', '/interpret_2d/Min_6', '/interpret_2d/Max_6'], ['/interpret_2d/Add_9'], ['/interpret_2d/Mul_91'], ['/interpret_2d/Add_11', '/interpret_2d/Min_1', '/interpret_2d/Max_1'], ['/interpret_2d/Sub_3', '/interpret_2d/Min', '/interpret_2d/Max'], ['/interpret_2d/Mul_25'], ['/interpret_2d/Sigmoid_1'], ['/interpret_2d/class_3d_mask_creator/compare/Equal', '/interpret_2d/class_3d_mask_creator/sum/Cast'], ['/interpret_2d/Less'], ['/interpret_2d/Sub_11', '/interpret_2d/Add_59', '/interpret_2d/Sqrt_1', '/interpret_2d/Abs'], ['/interpret_2d/Sub_10', '/interpret_2d/Add_58', '/interpret_2d/Sqrt', '/interpret_2d/Abs_1'], ['/interpret_2d/Add_63', '/interpret_2d/Div_4', '/interpret_2d/Atan_1'], ['/interpret_2d/Add_62', '/interpret_2d/Div_3', '/interpret_2d/Atan'], ['/interpret_2d/Sub_16'], ['/interpret_2d/Sub_15'], ['/interpret_2d/Div_8'], ['/interpret_2d/Div_7'], ['/interpret_2d/Mul_99', '/interpret_2d/Div_1', '/interpret_2d/Log', '/interpret_2d/Div_2', '/interpret_2d/Add_60', '/interpret_2d/Floor'], ['/interpret_2d/Div_6', '/interpret_2d/Pow_1', 
'/interpret_2d/Add_65', '/interpret_2d/Sqrt_3'], ['/interpret_2d/Div_5', '/interpret_2d/Pow', '/interpret_2d/Add_64', '/interpret_2d/Sqrt_2'], ['/interpret_2d/Mul_103'], ['/interpret_2d/Mul_102'], ['/interpret_2d/Equal_37'], ['/interpret_2d/Equal_36'], ['/interpret_2d/Equal_35'], ['/interpret_2d/Equal_34'], ['/interpret_2d/Mul_115', '/interpret_2d/Add_70'], ['/interpret_2d/Mul_113', '/interpret_2d/Add_69'], ['/interpret_2d/Mul_111', '/interpret_2d/Add_68'], ['/interpret_2d/Mul_110', '/interpret_2d/Add_67'], ['/head_3d/heads.1/process_non_backbone_features.0/Mul'], ['/head_3d/roi_align_camera.0/roi_align_layers.3/Cast'], ['/head_3d/roi_align_camera.0/roi_align_layers.2/Cast'], ['/head_3d/roi_align_camera.0/roi_align_layers.1/Cast'], ['/head_3d/roi_align_camera.0/roi_align_layers.0/Cast']]
INFO:root:Classifying the partition nodes ...
INFO:root:Selected nodes: ['/interpret_2d/Mul_66', '/interpret_2d/Mul_44', '/interpret_2d/Mul_88', '/interpret_2d/Mul_22', '/backbone/backbone/encoder/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.0/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.0/conv2/Conv', '/backbone/backbone/encoder/layer1/layer1.1/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.1/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer2/layer2.0/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.0/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.1/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.1/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer3/layer3.0/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.0/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.1/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.1/conv2/Conv', '/backbone/backbone/encoder/layer4/layer4.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer4/layer4.0/conv1/Conv', '/backbone/backbone/lateral4/Conv', '/backbone/backbone/encoder/layer4/layer4.0/conv2/Conv', '/backbone/backbone/encoder/layer4/layer4.1/conv1/Conv', '/backbone/backbone/encoder/layer4/layer4.1/conv2/Conv', '/backbone/backbone/pyramid6/Conv', '/backbone/backbone/lateral5/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.0/Conv', '/backbone/backbone/smooth5/Conv', '/head_2d/box_head_list.2/box_head_list.2.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.2/conv/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.0/Conv', '/backbone/backbone/pyramid7/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.0/Conv', '/head_2d/box_head_list.1/box_head_list.1.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.1/conv/Conv', 
'/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.0/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.0/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.2/Conv', '/backbone/backbone/smooth4/Conv', '/head_2d/box_head_list.3/box_head_list.3.0/Conv', '/head_2d/box_head_list.2/box_head_list.2.2/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.3/conv/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.0/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.2/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.2/Conv', '/head_2d/cls_head/cls_head.0/Conv', '/head_2d/box_head_list.1/box_head_list.1.2/Conv', '/head_2d/box_head/box_head.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.0/conv/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.2/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.0/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.2/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.4/Conv', '/head_2d/box_head_list.3/box_head_list.3.2/Conv', '/head_2d/box_head_list.2/box_head_list.2.4/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.2/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.4/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.4/Conv', '/head_2d/cls_head/cls_head.2/Conv', '/head_2d/box_head_list.1/box_head_list.1.4/Conv', '/head_2d/box_head/box_head.2/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.4/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.2/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.4/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.6/Conv', '/head_2d/box_head_list.3/box_head_list.3.4/Conv', '/head_2d/box_head_list.2/box_head_list.2.6/Conv', 
'/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.4/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.6/Conv', '/head_2d/cls_head/cls_head.4/Conv', '/head_2d/box_head_list.1/box_head_list.1.6/Conv', '/head_2d/box_head/box_head.4/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.4/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.6/Conv', '/head_2d/box_head_list.3/box_head_list.3.6/Conv', '/head_2d/cls_head/cls_head.6/Conv', '/head_2d/box_head/box_head.6/Conv', '/interpret_2d/MatMul_2', '/interpret_2d/MatMul_1', '/interpret_2d/MatMul_3', '/interpret_2d/MatMul', '/head_3d/heads.1/layers/layers.0/fc/Gemm', '/head_3d/heads.0/layers/layers.0/fc/Gemm', '/head_3d/heads.1/layers/layers.1/fc/Gemm', '/head_3d/heads.0/layers/layers.1/fc/Gemm', '/head_3d/heads.1/layers/layers.2/fc/Gemm', '/head_3d/heads.0/layers/layers.2/fc/Gemm', '/backbone/backbone/encoder/layer1/layer1.0/add/Add', '/backbone/backbone/encoder/layer1/layer1.1/add/Add', '/backbone/backbone/encoder/layer2/layer2.0/add/Add', '/backbone/backbone/encoder/layer2/layer2.1/add/Add', '/backbone/backbone/encoder/layer3/layer3.0/add/Add', '/backbone/backbone/encoder/layer3/layer3.1/add/Add', '/backbone/backbone/add_5_4/Add', '/backbone/backbone/encoder/layer4/layer4.0/add/Add', '/backbone/backbone/encoder/layer4/layer4.1/add/Add', '/interpret_2d/Clip_5', '/interpret_2d/Clip_4', '/interpret_2d/Clip_3', '/interpret_2d/Clip_2', '/interpret_2d/Clip_7', '/interpret_2d/Clip_6', '/interpret_2d/Clip_1', '/interpret_2d/Clip', '/interpret_2d/Clip_9', '/interpret_2d/Clip_8', '/interpret_2d/Clip_10', '/backbone/backbone/encoder/maxpool/MaxPool']
INFO:root:Total number of nodes: 751
INFO:root:Skipped node count: 0
INFO:root:Skipped nodes: []
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md 
Collecting tensor data and making histogram ...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 294/294 [00:01<00:00, 172.33it/s]
Finding optimal threshold for each tensor using 'entropy' algorithm ...
Number of tensors : 294
Number of histogram bins : 128 (The number may increase depends on the data it collects)
Number of quantized bins : 128
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md 
INFO:root:Deleting QDQ nodes from marked inputs to make certain operations fusible ...
INFO:root:Quantized onnx model is saved as /data2/projects/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced/acp4/qaic/fp16/camera_detection_resnet18_q4_fisheye_uncertainty_fe_od_reduced.onnx
INFO:root:Quantized nodes: ['/backbone/backbone/encoder/conv1/Conv', '/backbone/backbone/encoder/maxpool/MaxPool', '/backbone/backbone/encoder/layer1/layer1.0/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.0/conv2/Conv', '/backbone/backbone/encoder/layer1/layer1.0/add/Add', '/backbone/backbone/encoder/layer1/layer1.1/conv1/Conv', '/backbone/backbone/encoder/layer1/layer1.1/conv2/Conv', '/backbone/backbone/encoder/layer1/layer1.1/add/Add', '/backbone/backbone/encoder/layer2/layer2.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer2/layer2.0/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.0/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.0/add/Add', '/backbone/backbone/encoder/layer2/layer2.1/conv1/Conv', '/backbone/backbone/encoder/layer2/layer2.1/conv2/Conv', '/backbone/backbone/encoder/layer2/layer2.1/add/Add', '/backbone/backbone/encoder/layer3/layer3.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer3/layer3.0/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.0/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.0/add/Add', '/backbone/backbone/encoder/layer3/layer3.1/conv1/Conv', '/backbone/backbone/encoder/layer3/layer3.1/conv2/Conv', '/backbone/backbone/encoder/layer3/layer3.1/add/Add', '/backbone/backbone/encoder/layer4/layer4.0/downsample/downsample.0/Conv', '/backbone/backbone/encoder/layer4/layer4.0/conv1/Conv', '/backbone/backbone/lateral4/Conv', '/backbone/backbone/encoder/layer4/layer4.0/conv2/Conv', '/backbone/backbone/encoder/layer4/layer4.0/add/Add', '/backbone/backbone/encoder/layer4/layer4.1/conv1/Conv', '/backbone/backbone/encoder/layer4/layer4.1/conv2/Conv', '/backbone/backbone/encoder/layer4/layer4.1/add/Add', '/backbone/backbone/pyramid6/Conv', '/backbone/backbone/lateral5/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.0/Conv', '/head_2d/box_head_list.2/box_head_list.2.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.2/conv/Conv', 
'/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.0/Conv', '/backbone/backbone/smooth5/Conv', '/backbone/backbone/interpolate_5_4/Resize', '/backbone/backbone/pyramid7/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.0/Conv', '/head_2d/box_head_list.1/box_head_list.1.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.1/conv/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.0/Conv', '/backbone/backbone/add_5_4/Add', '/head_2d/cls_head_list.3/cls_head_list.3.0/Conv', '/head_2d/box_head_list.3/box_head_list.3.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.3/conv/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.0/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.2/Conv', '/head_2d/box_head_list.2/box_head_list.2.2/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.2/Conv', '/backbone/backbone/smooth4/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.2/Conv', '/head_2d/box_head_list.1/box_head_list.1.2/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.2/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.2/Conv', '/head_2d/box_head_list.3/box_head_list.3.2/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.2/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.4/Conv', '/head_2d/box_head_list.2/box_head_list.2.4/Conv', '/head_2d/box_uncertainty_list.2/box_uncertainty_list.2.0/box_uncertainty_list.2.0.4/Conv', '/head_2d/cls_head/cls_head.0/Conv', '/head_2d/box_head/box_head.0/Conv', '/head_3d/roi_align_camera.0/extract_rois_params/conv_new.0/conv/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.0/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.4/Conv', '/head_2d/box_head_list.1/box_head_list.1.4/Conv', '/head_2d/box_uncertainty_list.1/box_uncertainty_list.1.0/box_uncertainty_list.1.0.4/Conv', 
'/head_2d/cls_head_list.3/cls_head_list.3.4/Conv', '/head_2d/box_head_list.3/box_head_list.3.4/Conv', '/head_2d/box_uncertainty_list.3/box_uncertainty_list.3.0/box_uncertainty_list.3.0.4/Conv', '/head_2d/cls_head_list.2/cls_head_list.2.6/Conv', '/head_2d/box_head_list.2/box_head_list.2.6/Conv', '/head_2d/cls_head/cls_head.2/Conv', '/head_2d/box_head/box_head.2/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.2/Conv', '/head_2d/cls_head_list.1/cls_head_list.1.6/Conv', '/head_2d/box_head_list.1/box_head_list.1.6/Conv', '/head_2d/cls_head_list.3/cls_head_list.3.6/Conv', '/head_2d/box_head_list.3/box_head_list.3.6/Conv', '/head_2d/cls_head/cls_head.4/Conv', '/head_2d/box_head/box_head.4/Conv', '/head_2d/box_uncertainty/box_uncertainty.0/box_uncertainty.0.4/Conv', '/head_2d/cls_head/cls_head.6/Conv', '/head_2d/box_head/box_head.6/Conv', '/interpret_2d/Mul_67', '/interpret_2d/Mul_66', '/interpret_2d/Mul_65', '/interpret_2d/Concat_60', '/interpret_2d/Mul_45', '/interpret_2d/Mul_44', '/interpret_2d/Mul_43', '/interpret_2d/Mul_89', '/interpret_2d/Mul_88', '/interpret_2d/Mul_87', '/interpret_2d/Concat_48', '/interpret_2d/Concat_72', '/interpret_2d/MatMul_2', '/interpret_2d/Mul_23', '/interpret_2d/Mul_22', '/interpret_2d/Mul_21', '/interpret_2d/Concat_30', '/interpret_2d/MatMul_1', '/interpret_2d/MatMul_3', '/interpret_2d/MatMul', '/head_3d/heads.1/layers/layers.0/fc/Gemm', '/head_3d/heads.0/layers/layers.0/fc/Gemm', '/head_3d/heads.1/layers/layers.1/fc/Gemm', '/head_3d/heads.0/layers/layers.1/fc/Gemm', '/head_3d/heads.1/layers/layers.2/fc/Gemm', '/head_3d/heads.0/layers/layers.2/fc/Gemm']
INFO:root:Total number of quantized nodes: 111
INFO:root:Quantized node types: {'Conv', 'MatMul', 'Gemm', 'Resize', 'MaxPool', 'Mul', 'Concat', 'Add'}

I am still getting the warning.
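
The two preprocessing attempts described above can be folded into one helper that first tries the full pipeline and retries with `skip_symbolic_shape=True` only when symbolic shape inference is what failed. This is a minimal sketch; the helper name `preprocess_with_fallback` is hypothetical, and the function to call is passed in so the helper itself has no dependency on onnxruntime.

```python
def preprocess_with_fallback(pre_process, src_path, dst_path, **kwargs):
    """Run pre_process; retry without symbolic shape inference if that step fails.

    `pre_process` is expected to behave like onnxruntime's quant_pre_process.
    Returns True if symbolic shape inference was used, False if it was skipped.
    """
    try:
        pre_process(src_path, dst_path, **kwargs)
        return True
    except Exception as exc:
        # Only fall back for the specific failure seen in the traceback.
        if "symbolic shape inference" not in str(exc):
            raise
        retry_kwargs = dict(kwargs, skip_symbolic_shape=True)
        pre_process(src_path, dst_path, **retry_kwargs)
        return False


# Intended usage (requires onnxruntime, so it is left commented out here):
# from onnxruntime.quantization.shape_inference import quant_pre_process
# used_symbolic = preprocess_with_fallback(
#     quant_pre_process, onnx_orig_path, onnx_preprocessed_path, verbose=True
# )
```

The return value makes it easy to log which path was taken, since a model preprocessed without symbolic shape inference may carry less shape information into quantization.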

However, when I try to build the TensorRT engine, I get the following error:

[07/24/2024-11:17:48] [E] [TRT] ModelImporter.cpp:828: While parsing node number 177 [ScatterND -> "/interpret_2d/nms/strategy/ScatterND_output_0"]:
[07/24/2024-11:17:48] [E] [TRT] ModelImporter.cpp:831: --- Begin node ---
input: "/interpret_2d/nms/strategy/Constant_17_output_0"
input: "/interpret_2d/nms/strategy/Constant_19_output_0"
input: "/interpret_2d/nms/strategy/Reshape_3_output_0"
output: "/interpret_2d/nms/strategy/ScatterND_output_0"
name: "/interpret_2d/nms/strategy/ScatterND"
op_type: "ScatterND"
attribute {
  name: "reduction"
  s: "none"
  type: STRING
}

[07/24/2024-11:17:48] [E] [TRT] ModelImporter.cpp:832: --- End node ---
[07/24/2024-11:17:48] [E] [TRT] ModelImporter.cpp:836: ERROR: onnxOpImporters.cpp:5119 In function importScatterND:
[9] Assertion failed: !attrs.count("reduction"): Attribute reduction is not supported.
[07/24/2024-11:17:48] [E] Failed to parse onnx file
[07/24/2024-11:17:48] [I] Finished parsing network model. Parse time: 0.274777
[07/24/2024-11:17:48] [E] Parsing model failed
[07/24/2024-11:17:48] [E] Failed to create engine from model or file.
[07/24/2024-11:17:48] [E] Engine set up failed
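
One possible workaround for the parser assertion, untested on this model: in the ONNX operator definition, `reduction` on ScatterND defaults to "none", so dropping the attribute when its value is "none" should not change the node's semantics. The sketch below assumes the graph object exposes `node`, `op_type`, and `attribute` the way an ONNX `GraphProto` does; the function name is hypothetical, nested subgraphs are not visited, and the stand-in objects at the bottom exist only to make the example runnable without the `onnx` package.

```python
from types import SimpleNamespace


def strip_default_scatternd_reduction(graph):
    """Remove reduction="none" from top-level ScatterND nodes; return how many were removed."""
    removed = 0
    for node in graph.node:
        if node.op_type != "ScatterND":
            continue
        for attr in list(node.attribute):  # copy, since we mutate while scanning
            if attr.name == "reduction" and attr.s in (b"none", "none"):
                node.attribute.remove(attr)
                removed += 1
    return removed


# Stand-in graph mimicking the node from the error log (not a real ONNX model).
scatter = SimpleNamespace(
    op_type="ScatterND",
    attribute=[SimpleNamespace(name="reduction", s=b"none")],
)
conv = SimpleNamespace(op_type="Conv", attribute=[SimpleNamespace(name="group", s=b"")])
demo_graph = SimpleNamespace(node=[scatter, conv])
removed_count = strip_default_scatternd_reduction(demo_graph)

# With the onnx package installed, the same function would be applied as:
# import onnx
# model = onnx.load("model.onnx")               # hypothetical path
# strip_default_scatternd_reduction(model.graph)
# onnx.save(model, "model_no_reduction.onnx")   # hypothetical path
```

Nodes whose `reduction` is anything other than "none" are left untouched, because removing the attribute there would change the result.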

Any help would be appreciated. Thanks.

To reproduce

Detailed in the description.

Urgency

This is blocking me from quantizing the model as per the ONNX Runtime recommendations.

Platform

Linux

OS Version

Ubuntu 22.04

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu 1.18.1

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

CUDA 11.8
