[Bug] When converting to TensorRT via partitioning a CascadeRCNN

octavflorescu commented 1 year ago

Checklist

[X] 1. I have searched related issues but cannot get the expected help.
[X] 2. I have read the FAQ documentation but cannot get the expected help.
[X] 3. The bug has not been fixed in the latest version.

Describe the bug

I am trying to convert a CascadeRCNN to TensorRT via partitioning (I want to extract the FPN output embeddings at predict time). I am using the deploy.py script. Both partitions are converted to ONNX, but when converting the second partition to TensorRT, I encounter this error:

(parseGraph): INVALID_GRAPH: Assertion failed: toposort(graph.node), &topoOrder) && "Failed to sort the model topologically."
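
For context, "sorted topologically" means every node only reads tensors that are graph inputs, initializers, or outputs of some earlier node. The sketch below illustrates that property in plain Python. It is not TensorRT's or ONNX's implementation; the node dictionaries and the `check_graph_order` helper are hypothetical stand-ins for the `name`, `input`, and `output` fields of ONNX nodes. A partition cut in the wrong place would typically show up in the `missing` list, because a consumer is kept while its producer is dropped.

```python
from typing import Dict, List, Set


def check_graph_order(nodes: List[Dict], known: Set[str]) -> Dict[str, List[str]]:
    """Classify nodes whose inputs are not available at the point of use.

    nodes: list of {"name": str, "inputs": [str], "outputs": [str]} in file order
           (hypothetical representation of ONNX nodes).
    known: tensor names available up front (graph inputs and initializers).
    """
    produced_anywhere = {out for node in nodes for out in node["outputs"]}
    available = set(known)
    out_of_order: List[str] = []
    missing: List[str] = []

    for node in nodes:
        for tensor in node["inputs"]:
            if tensor in available:
                continue
            if tensor in produced_anywhere:
                # Producer exists but appears later: a reordering could fix this.
                if node["name"] not in out_of_order:
                    out_of_order.append(node["name"])
            else:
                # No producer at all: the graph cannot be sorted.
                if node["name"] not in missing:
                    missing.append(node["name"])
        available.update(node["outputs"])

    return {"out_of_order": out_of_order, "missing": missing}


# Toy graph: "relu" is listed before the "conv" node that feeds it.
toy_nodes = [
    {"name": "relu", "inputs": ["conv_out"], "outputs": ["relu_out"]},
    {"name": "conv", "inputs": ["input", "weight"], "outputs": ["conv_out"]},
]
report = check_graph_order(toy_nodes, {"input", "weight"})
```

With a real model, the same check could be fed from the nodes of a loaded ONNX graph; that mapping is left out here to keep the sketch dependency-free.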

The partitioning config:

_base_ = ['../_base_/base_tensorrt-fp16_static-1920x1920.py']

onnx_config = dict(
    dynamic_axes={
        'input': {
            0: 'batch',
        },
        'dets': {
            0: 'batch',
            1: 'num_dets',
        },
        'labels': {
            0: 'batch',
            1: 'num_dets',
        },
        'bbox_feats': {
            0: 'batch'
        },
        'cls_score': {
            0: 'batch'
        },
        'bbox_pred': {
            0: 'batch'
        },
    }, )

partition_config = dict(
    type='two_stage', # the partition policy name
    apply_marks=True, # should always be set to True
    partition_cfg=[
        dict(
            save_file='backbone2fpn.onnx', # filename to save the partitioned onnx model
            start=['detector_forward:input'], # [mark_name:input/output, ...]
            end=['extract_feat:output'],  # [mark_name:input/output, ...]
            output_names=['feat'] # output names
        ),
        dict(
            save_file='fpn2end.onnx', # filename to save the partitioned onnx model
            start=['roi_extractor:output'],
            end=['bbox_head_forward:output'],
            # start=['detector_forward:input'], # [mark_name:input/output, ...]
            # end=['multiclass_nms:output'],  # [mark_name:input/output, ...]
            # output_names=['dets', 'labels'], # output names
            output_names=['cls', 'bbox']
        ),
    ])
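
The `start` and `end` entries above follow the `mark_name:input/output` pattern noted in the config comments. As a small sanity check before running the conversion, the pattern can be validated with a helper like the one below. This is only a sketch: `parse_mark` is a hypothetical function written for illustration, and MMDeploy's own parsing may differ.

```python
from typing import Tuple


def parse_mark(spec: str) -> Tuple[str, str]:
    """Split a 'mark_name:input' or 'mark_name:output' string into (name, kind).

    Hypothetical helper; it only checks the textual pattern, not whether the
    mark actually exists in the traced model.
    """
    name, sep, kind = spec.rpartition(":")
    if not sep or not name or kind not in ("input", "output"):
        raise ValueError(
            f"expected 'mark_name:input' or 'mark_name:output', got {spec!r}")
    return name, kind


# The boundaries used in the second partition of the config above.
second_partition = dict(
    start=["roi_extractor:output"],
    end=["bbox_head_forward:output"],
)
parsed = {
    key: [parse_mark(spec) for spec in specs]
    for key, specs in second_partition.items()
}
```

A typo such as `roi_extractor.output` would raise a `ValueError` here instead of surfacing later as a harder-to-read export failure.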

Reproduction

python tools/deploy.py [...]/libraries/mmdeploy/configs/mmdet/detection/detection_tensorrt_fpn_partitioned_static-1920x1920.py [...]/models/detector/cascade_rcnn_resnext50_32x4d_fpn_ga_gn_fp16_x2_2x1/model.py [...]/models/detector/cascade_rcnn_resnext50_32x4d_fpn_ga_gn_fp16_x2_2x1/checkpoints/epoch_4.pth [...]/gt_images/100.png --work-dir [...]/base_test --device cuda --dump-info

Environment

2023-02-06 11:45:32,239 - mmdeploy - INFO - **********Environmental information**********
fatal: not a git repository (or any of the parent directories): .git
2023-02-06 11:45:32,874 - mmdeploy - INFO - sys.platform: linux
2023-02-06 11:45:32,874 - mmdeploy - INFO - Python: 3.8.13 (default, Dec 16 2022, 08:32:30) [GCC 7.5.0]
2023-02-06 11:45:32,875 - mmdeploy - INFO - CUDA available: True
2023-02-06 11:45:32,875 - mmdeploy - INFO - GPU 0: Tesla V100-PCIE-16GB
2023-02-06 11:45:32,875 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-02-06 11:45:32,875 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.109
2023-02-06 11:45:32,875 - mmdeploy - INFO - GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
2023-02-06 11:45:32,875 - mmdeploy - INFO - PyTorch: 1.12.0+cu113
2023-02-06 11:45:32,875 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.3.2  (built against CUDA 11.5)
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

2023-02-06 11:45:32,875 - mmdeploy - INFO - TorchVision: 0.13.0+cu113
2023-02-06 11:45:32,875 - mmdeploy - INFO - OpenCV: 4.7.0
2023-02-06 11:45:32,875 - mmdeploy - INFO - MMCV: 1.7.0
2023-02-06 11:45:32,875 - mmdeploy - INFO - MMCV Compiler: GCC 9.3
2023-02-06 11:45:32,875 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2023-02-06 11:45:32,875 - mmdeploy - INFO - MMDeploy: 0.12.0+
2023-02-06 11:45:32,875 - mmdeploy - INFO -

2023-02-06 11:45:32,875 - mmdeploy - INFO - **********Backend information**********
Traceback (most recent call last):
  File "tools/check_env.py", line 71, in <module>
    check_backend()
  File "tools/check_env.py", line 28, in check_backend
    logger.info(f'onnxruntime: {ort_version}\tops_is_avaliable : '
AttributeError: module 'mmdeploy.apis.onnxruntime' has no attribute 'is_custom_ops_available'

root@5c0c4432940d:/home/python_modules/libraries/mmdeploy# echo $PATH
/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/cuda/bin:/bin:/usr/local/cuda/bin:/bin
root@5c0c4432940d:/home/python_modules/libraries/mmdeploy# echo $LD_LIBRARY_PATH
/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64



Error traceback

_No response_

AllentDan commented 1 year ago

Hi, @octavflorescu. It seems that you used a customized model config and deployment config. Could you please share your partitioned onnx model? And if possible, share the content of /models/detector/cascade_rcnn_resnext50_32x4d_fpn_ga_gn_fp16_x2_2x1/model.py.

octavflorescu commented 1 year ago

Hi @AllentDan,
here is the model config:

fp16 = dict(loss_scale=512.0)
norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
model = dict(
    type='CascadeRCNN',
    pretrained='open-mmlab://resnext50_32x4d',
    backbone=dict(
        type='ResNeXt',
        depth=50,
        groups=32,
        base_width=4,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        style='pytorch',
        plugins=[
            dict(
                cfg=dict(
                    type='GeneralizedAttention',
                    spatial_range=-1,
                    num_heads=8,
                    attention_type='0010',
                    kv_stride=2),
                stages=(False, False, True, True),
                position='after_conv2')
        ],
        dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, True, True, True)),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5,
        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='CascadeRoIHead',
        num_stages=3,
        stage_loss_weights=[1, 0.5, 0.25],
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=69,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
                norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=69,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.025, 0.025, 0.05, 0.05]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
                norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=69,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.0165, 0.0165, 0.033, 0.033]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
                norm_cfg=dict(type='GN', num_groups=32, requires_grad=True))
        ]),
    test_cfg = dict(
        rpn=dict(
            nms_pre=1000,
            max_per_img=1000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.05,
            nms=dict(type='nms', iou_threshold=0.5),
            max_per_img=100)
    )
)

This is the deployment config:

_base_ = ['./base_static.py', '../../_base_/backends/tensorrt-fp16.py']

backend_config = dict(
    common_config=dict(max_workspace_size=1 << 35),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 1920, 1920],
                    opt_shape=[1, 3, 1920, 1920],
                    max_shape=[1, 3, 1920, 1920]))),
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 256, 7, 7], # I also tried adding another dim for batch: [1, 1, 256, 7, 7]
                    opt_shape=[10, 256, 7, 7],
                    max_shape=[1000, 256, 7, 7])))
    ])

onnx_config = dict(
    input_shape=(1920, 1920),
    dynamic_axes={
        'input': {
            0: 'batch',
            # 2: 'height',
            # 3: 'width'
        },
        'dets': {
            0: 'batch',
            1: 'num_dets',
        },
        'labels': {
            0: 'batch',
            1: 'num_dets',
        },
        'bbox_feats': {
            0: 'batch'
        },
        'cls_score': {
            0: 'batch'
        },
        'bbox_pred': {
            0: 'batch'
        },
    }, )

partition_config = dict(
    type='two_stage', # the partition policy name
    apply_marks=True, # should always be set to True
    partition_cfg=[
        dict(
            save_file='backbone2fpn.onnx', # filename to save the partitioned onnx model
            start=['detector_forward:input'], # [mark_name:input/output, ...]
            end=['extract_feat:output'],  # [mark_name:input/output, ...]
            output_names=['feat'] # output names
        ),
        dict(
            save_file='fpn2end.onnx', # filename to save the partitioned onnx model
            start=['roi_extractor:output'],
            end=['bbox_head_forward:output'],
            output_names=['cls', 'bbox']
        ),
    ])

If I set the onnx_config to dynamic (although the rest of the model is static), I get the error above. If I don't configure it as dynamic, it never reaches the backend conversion; it stops at the ONNX conversion:

  File "/usr/local/lib/python3.8/site-packages/onnx/utils.py", line 15, in __init__
    self.model = onnx.shape_inference.infer_shapes(model)
  File "/usr/local/lib/python3.8/site-packages/onnx/shape_inference.py", line 34, in infer_shapes
    inferred_model_str = C.infer_shapes(model_str, check_type, strict_mode, data_prop)
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Div, node name: Div_1284): [ShapeInferenceError] Inferred shape and existing shape differ in rank: (3) vs (2)
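
As a side check on the `backend_config` above: a TensorRT optimization profile needs `min_shape`, `opt_shape`, and `max_shape` to have the same number of dimensions, with min <= opt <= max on every axis. So adding a batch dimension to `min_shape` alone (the `[1, 1, 256, 7, 7]` attempt) would give a profile with mixed ranks. The sketch below checks these two rules in plain Python. `validate_profile` is a hypothetical helper, and it does not diagnose the ONNX shape inference error itself.

```python
from typing import Dict, List


def validate_profile(name: str, profile: Dict[str, List[int]]) -> List[str]:
    """Return a list of human-readable problems for one input's shape profile."""
    keys = ("min_shape", "opt_shape", "max_shape")
    shapes = [profile[key] for key in keys]
    problems: List[str] = []

    # Rule 1: all three shapes must have the same rank.
    ranks = [len(shape) for shape in shapes]
    if len(set(ranks)) != 1:
        problems.append(f"{name}: rank mismatch {dict(zip(keys, ranks))}")
        return problems  # per-axis comparison is meaningless with mixed ranks

    # Rule 2: min <= opt <= max on every axis.
    for axis, (lo, opt, hi) in enumerate(zip(*shapes)):
        if not lo <= opt <= hi:
            problems.append(
                f"{name}: axis {axis} violates min <= opt <= max ({lo}, {opt}, {hi})")
    return problems


# Profile of the second partition, copied from the deployment config above.
roi_profile = dict(
    min_shape=[1, 256, 7, 7],
    opt_shape=[10, 256, 7, 7],
    max_shape=[1000, 256, 7, 7],
)
ok_problems = validate_profile("input", roi_profile)

# Variant with the extra batch dimension added to min_shape only.
mixed_profile = dict(roi_profile, min_shape=[1, 1, 256, 7, 7])
mixed_problems = validate_profile("input", mixed_profile)
```

Here `ok_problems` is empty, while `mixed_problems` reports the rank mismatch.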

AllentDan commented 1 year ago

Well, I can convert the partitioned models to TensorRT successfully.

And here is my env:

2023-02-14 14:04:42,771 - mmdeploy - INFO - sys.platform: linux
2023-02-14 14:04:42,771 - mmdeploy - INFO - Python: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]
2023-02-14 14:04:42,771 - mmdeploy - INFO - CUDA available: True
2023-02-14 14:04:42,771 - mmdeploy - INFO - GPU 0: NVIDIA GeForce GTX 1660 SUPER
2023-02-14 14:04:42,771 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-02-14 14:04:42,771 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
2023-02-14 14:04:42,771 - mmdeploy - INFO - GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
2023-02-14 14:04:42,771 - mmdeploy - INFO - PyTorch: 1.10.2
2023-02-14 14:04:42,771 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2023-02-14 14:04:42,771 - mmdeploy - INFO - TorchVision: 0.11.0a0
2023-02-14 14:04:42,771 - mmdeploy - INFO - OpenCV: 4.5.4
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV: 1.5.0
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMDeploy: 0.12.0+5fdf003
2023-02-14 14:04:42,771 - mmdeploy - INFO - 

2023-02-14 14:04:42,771 - mmdeploy - INFO - **********Backend information**********
2023-02-14 14:04:42,816 - mmdeploy - INFO - tensorrt:   8.4.1.5
2023-02-14 14:04:42,816 - mmdeploy - INFO - tensorrt custom ops:        Available
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime:        None
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime-gpu:    1.8.1
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime custom ops:     Available
2023-02-14 14:04:42,872 - mmdeploy - INFO - pplnn:      None
2023-02-14 14:04:42,875 - mmdeploy - INFO - ncnn:       None
2023-02-14 14:04:42,877 - mmdeploy - INFO - snpe:       None
2023-02-14 14:04:42,878 - mmdeploy - INFO - openvino:   None
2023-02-14 14:04:42,880 - mmdeploy - INFO - torchscript:        1.10.2
2023-02-14 14:04:42,880 - mmdeploy - INFO - torchscript custom ops:     Available
2023-02-14 14:04:42,959 - mmdeploy - INFO - rknn-toolkit:       None
2023-02-14 14:04:42,959 - mmdeploy - INFO - rknn2-toolkit:      None
2023-02-14 14:04:42,960 - mmdeploy - INFO - ascend:     None
2023-02-14 14:04:42,962 - mmdeploy - INFO - coreml:     None
2023-02-14 14:04:42,963 - mmdeploy - INFO - tvm:        None
2023-02-14 14:04:42,963 - mmdeploy - INFO - 

2023-02-14 14:04:42,963 - mmdeploy - INFO - **********Codebase information**********
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmdet:      2.19.0
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmseg:      None
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmcls:      0.19.0
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmocr:      0.4.1
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmedit:     None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmdet3d:    None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmpose:     None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmrotate:   None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmaction:   None

octavflorescu commented 1 year ago

Hi, I have come back to this...
If I try to run the model conversion, it ends here:

  File "/usr/local/lib/python3.8/site-packages/mmdeploy/codebase/mmdet/models/detectors/two_stage.py", line 58, in two_stage_detector__simple_test
    proposals, _ = self.rpn_head.simple_test_rpn(x, img_metas)
ValueError: not enough values to unpack (expected 2, got 1)
2023-07-04 18:15:24,536 - mmdeploy - ERROR - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

How does it work for you? I see that your mmdet is 2.19 while mmdeploy is 0.12 (some revision). In mmdet 2.19, mmdet/models/detectors/two_stage.py calls simple_test_rpn as follows, which is different from mmdeploy's implementation:

proposal_list = self.rpn_head.simple_test_rpn(x, img_metas)

Do I have to build the mmdeploy library manually and link it against my mmdet? Isn't there an mmdeploy build already available for mmdet 2.25, or at least 2.19?
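
The mismatch described above can be reproduced in isolation. In the sketch below, the two stand-in functions are hypothetical and only mimic the two calling conventions quoted in this comment: one returns a single proposal list, the other returns a pair. Unpacking the first with `proposals, _ = ...` gives the same `ValueError` as in the traceback.

```python
def rpn_returns_list():
    """Stand-in for an RPN head that returns only a proposal list (one image)."""
    return [[0.0, 0.0, 10.0, 10.0]]


def rpn_returns_pair():
    """Stand-in for an RPN head that returns (proposals, scores)."""
    return [[0.0, 0.0, 10.0, 10.0]], [0.9]


def try_unpack(rpn_fn) -> str:
    """Unpack the result the way the rewritten two-stage detector does."""
    try:
        proposals, _ = rpn_fn()
    except ValueError as exc:
        return str(exc)
    return "ok"


list_result = try_unpack(rpn_returns_list)
pair_result = try_unpack(rpn_returns_pair)
```

`pair_result` is `"ok"`, while `list_result` holds the unpacking error message, which is consistent with the rewriter and the installed mmdet disagreeing on the return convention.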

AllentDan commented 1 year ago

Maybe you can try the prebuilt package v0.5.0 or v0.6.0. We have provided prebuilt packages since v0.5.0.

ZeroRegister commented 6 months ago

I have a similar problem. I want to convert my RTMPose model to ONNX, but the conversion fails with "ERROR - not enough values to unpack (expected 2, got 1)". This is the command:

!python tools/deploy.py \
        configs/mmpose/pose-detection_onnxruntime-fp16_static.py \
        ConvertFolder/rtmpose-s-Ear.py \
        ConvertFolder/rtm_pose.pth \
        ConvertFolder/DSC_5384.jpg \
        --work-dir ConvertFolder/mmpose2onnx_rtmpose2 \
        --dump-info

error:

05/14 21:03:05 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:05 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:07 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
05/14 21:03:09 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:09 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: ConvertFolder/rtm_pose.pth
05/14 21:03:10 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
05/14 21:03:10 - mmengine - INFO - Export PyTorch model to ONNX: ConvertFolder/mmpose2onnx_rtmpose2\end2end.onnx.
05/14 21:03:10 - mmengine - WARNING - Can not find torch._C._jit_pass_onnx_autograd_function_process, function rewrite will not be applied
05/14 21:03:10 - mmengine - WARNING - Can not find models.yolox_pose_head.YOLOXPoseHead.predict, function rewrite will not be applied
05/14 21:03:10 - mmengine - WARNING - Can not find models.yolox_pose_head.YOLOXPoseHead.predict_by_feat, function rewrite will not be applied
05/14 21:03:12 - mmengine - INFO - Execute onnx optimize passes.
05/14 21:03:12 - mmengine - WARNING - Can not optimize model, please build torchscipt extension.
More details: https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/experimental/onnx_optimizer.md
05/14 21:03:12 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
05/14 21:03:13 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
05/14 21:03:13 - mmengine - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
05/14 21:03:13 - mmengine - INFO - visualize onnxruntime model start.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - INFO - Successfully loaded onnxruntime custom ops from d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\lib\mmdeploy_onnxruntime_ops.dll
Traceback (most recent call last):
  File "d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\utils\utils.py", line 41, in target_wrapper
    result = target(*args, **kwargs)
...
  File "d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\codebase\mmpose\deploy\pose_detection_model.py", line 108, in forward
    batch_pred_x, batch_pred_y = batch_outputs
ValueError: not enough values to unpack (expected 2, got 1)
05/14 21:03:17 - mmengine - ERROR - tools/deploy.py - create_process - 82 - visualize onnxruntime model failed.
d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\torch\autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
2024-05-14:21:03:17 - root - ERROR - not enough values to unpack (expected 2, got 1)
