Open octavflorescu opened 1 year ago
Hi, @octavflorescu. It seems that you used a customized model config and deployment config. Could you please share your partitioned ONNX model and, if possible, the content of /models/detector/cascade_rcnn_resnext50_32x4d_fpn_ga_gn_fp16_x2_2x1/model.py?
Hi @AllentDan,
here is the model config:
fp16 = dict(loss_scale=512.0)
norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
model = dict(
type='CascadeRCNN',
pretrained='open-mmlab://resnext50_32x4d',
backbone=dict(
type='ResNeXt',
depth=50,
groups=32,
base_width=4,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
style='pytorch',
plugins=[
dict(
cfg=dict(
type='GeneralizedAttention',
spatial_range=-1,
num_heads=8,
attention_type='0010',
kv_stride=2),
stages=(False, False, True, True),
position='after_conv2')
],
dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
stage_with_dcn=(False, True, True, True)),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5,
norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(
type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
roi_head=dict(
type='CascadeRoIHead',
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=69,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.05, 0.05, 0.1, 0.1]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=69,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.025, 0.025, 0.05, 0.05]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)),
dict(
type='Shared2FCBBoxHead',
in_channels=256,
fc_out_channels=1024,
roi_feat_size=7,
num_classes=69,
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.0165, 0.0165, 0.033, 0.033]),
reg_class_agnostic=True,
loss_cls=dict(
type='CrossEntropyLoss',
use_sigmoid=False,
loss_weight=1.0),
loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0),
norm_cfg=dict(type='GN', num_groups=32, requires_grad=True))
]),
test_cfg = dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100)
)
)
This is the deployment config:
_base_ = ['./base_static.py', '../../_base_/backends/tensorrt-fp16.py']
backend_config = dict(
common_config=dict(max_workspace_size=1 << 35),
model_inputs=[
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 3, 1920, 1920],
opt_shape=[1, 3, 1920, 1920],
max_shape=[1, 3, 1920, 1920]))),
dict(
input_shapes=dict(
input=dict(
min_shape=[1, 256, 7, 7],  # I also tried adding another dim for batch: [1, 1, 256, 7, 7]
opt_shape=[10, 256, 7, 7],
max_shape=[1000, 256, 7, 7])))
])
onnx_config = dict(
input_shape=(1920, 1920),
dynamic_axes={
'input': {
0: 'batch',
# 2: 'height',
# 3: 'width'
},
'dets': {
0: 'batch',
1: 'num_dets',
},
'labels': {
0: 'batch',
1: 'num_dets',
},
'bbox_feats': {
0: 'batch'
},
'cls_score': {
0: 'batch'
},
'bbox_pred': {
0: 'batch'
},
}, )
partition_config = dict(
type='two_stage', # the partition policy name
apply_marks=True, # should always be set to True
partition_cfg=[
dict(
save_file='backbone2fpn.onnx', # filename to save the partitioned onnx model
start=['detector_forward:input'], # [mark_name:input/output, ...]
end=['extract_feat:output'], # [mark_name:input/output, ...]
output_names=['feat'] # output names
),
dict(
save_file='fpn2end.onnx', # filename to save the partitioned onnx model
start=['roi_extractor:output'],
end=['bbox_head_forward:output'],
output_names=['cls', 'bbox']
),
])
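For reference, a TensorRT optimization profile needs min_shape, opt_shape, and max_shape to have the same rank and to satisfy min <= opt <= max on every axis. Below is a minimal sketch of that check in plain Python. `check_profile` is a hypothetical helper, not part of mmdeploy, and the mixed-rank profile is only an illustrative variant (extra dim on min_shape alone), not necessarily the exact config that was tried.

```python
from typing import Dict, List, Sequence


def check_profile(name: str, profile: Dict[str, Sequence[int]]) -> List[str]:
    """Return a list of problems found in one min/opt/max shape profile."""
    problems = []
    shapes = [profile["min_shape"], profile["opt_shape"], profile["max_shape"]]
    ranks = {len(shape) for shape in shapes}
    if len(ranks) != 1:
        # Axes cannot be compared when the ranks differ, so stop here.
        problems.append(f"{name}: min/opt/max ranks differ: {sorted(ranks)}")
        return problems
    for axis, (lo, opt, hi) in enumerate(zip(*shapes)):
        if not lo <= opt <= hi:
            problems.append(
                f"{name}: axis {axis} violates min <= opt <= max ({lo}, {opt}, {hi})")
    return problems


# Values copied from the second model_inputs entry of the config above.
roi_profile = dict(
    min_shape=[1, 256, 7, 7],
    opt_shape=[10, 256, 7, 7],
    max_shape=[1000, 256, 7, 7],
)
# Hypothetical variant: an extra leading dim added to min_shape only.
mixed_rank_profile = dict(roi_profile, min_shape=[1, 1, 256, 7, 7])

print(check_profile("input", roi_profile))
print(check_profile("input", mixed_rank_profile))
```

The first call reports no problems; the second reports a rank mismatch. If an extra dimension is added, it would have to be added to all three shapes, and the ONNX input itself would need that rank as well.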
If I configure onnx_config as dynamic (although the rest of the model is static), I get the error from the issue description. If I don't configure it as dynamic, the process never reaches deployment and stops during the ONNX conversion:
File "/usr/local/lib/python3.8/site-packages/onnx/utils.py", line 15, in __init__
self.model = onnx.shape_inference.infer_shapes(model)
File "/usr/local/lib/python3.8/site-packages/onnx/shape_inference.py", line 34, in infer_shapes
inferred_model_str = C.infer_shapes(model_str, check_type, strict_mode, data_prop)
onnx.onnx_cpp2py_export.shape_inference.InferenceError: [ShapeInferenceError] (op_type:Div, node name: Div_1284): [ShapeInferenceError] Inferred shape and existing shape differ in rank: (3) vs (2)
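To narrow down an error like this, it can help to look at the node named in the message (Div_1284 here) and at the rank recorded for its output tensor. The following is a rough sketch, not a fix: `find_nodes`, `declared_rank`, and `load_graph` are hypothetical helpers, the stand-in graph is made up for illustration, and loading a real file assumes the `onnx` package is installed.

```python
from types import SimpleNamespace


def find_nodes(graph, name=None, op_type=None):
    """Return nodes of an ONNX-like graph matching a node name and/or op type."""
    return [
        node for node in graph.node
        if (name is None or node.name == name)
        and (op_type is None or node.op_type == op_type)
    ]


def declared_rank(graph, tensor_name):
    """Return the rank stored in value_info for a tensor, or None if absent."""
    for info in graph.value_info:
        if info.name == tensor_name:
            return len(info.type.tensor_type.shape.dim)
    return None


def load_graph(path):
    """Load the graph of a real ONNX file (requires the onnx package)."""
    import onnx  # imported lazily so the helpers above work without it
    return onnx.load(path).graph


# Made-up stand-in graph so the helpers can be tried without an ONNX file.
demo_graph = SimpleNamespace(
    node=[SimpleNamespace(name="Div_1284", op_type="Div",
                          input=["a", "b"], output=["c"])],
    value_info=[SimpleNamespace(
        name="c",
        type=SimpleNamespace(tensor_type=SimpleNamespace(
            shape=SimpleNamespace(dim=[None, None]))))],
)

for node in find_nodes(demo_graph, name="Div_1284"):
    for out in node.output:
        print(node.name, node.op_type, out, declared_rank(demo_graph, out))
```

On a real model, `load_graph("path/to/exported.onnx")` (a placeholder path) would replace `demo_graph`. A stored rank that disagrees with what the node's inputs imply would match the "differ in rank" wording of the error.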
Well, I can convert the partitioned models to TensorRT successfully.
And here is my env:
2023-02-14 14:04:42,771 - mmdeploy - INFO - sys.platform: linux
2023-02-14 14:04:42,771 - mmdeploy - INFO - Python: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]
2023-02-14 14:04:42,771 - mmdeploy - INFO - CUDA available: True
2023-02-14 14:04:42,771 - mmdeploy - INFO - GPU 0: NVIDIA GeForce GTX 1660 SUPER
2023-02-14 14:04:42,771 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-02-14 14:04:42,771 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
2023-02-14 14:04:42,771 - mmdeploy - INFO - GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
2023-02-14 14:04:42,771 - mmdeploy - INFO - PyTorch: 1.10.2
2023-02-14 14:04:42,771 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.3
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.2
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
2023-02-14 14:04:42,771 - mmdeploy - INFO - TorchVision: 0.11.0a0
2023-02-14 14:04:42,771 - mmdeploy - INFO - OpenCV: 4.5.4
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV: 1.5.0
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV Compiler: GCC 7.5
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2023-02-14 14:04:42,771 - mmdeploy - INFO - MMDeploy: 0.12.0+5fdf003
2023-02-14 14:04:42,771 - mmdeploy - INFO -
2023-02-14 14:04:42,771 - mmdeploy - INFO - **********Backend information**********
2023-02-14 14:04:42,816 - mmdeploy - INFO - tensorrt: 8.4.1.5
2023-02-14 14:04:42,816 - mmdeploy - INFO - tensorrt custom ops: Available
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime: None
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime-gpu: 1.8.1
2023-02-14 14:04:42,871 - mmdeploy - INFO - ONNXRuntime custom ops: Available
2023-02-14 14:04:42,872 - mmdeploy - INFO - pplnn: None
2023-02-14 14:04:42,875 - mmdeploy - INFO - ncnn: None
2023-02-14 14:04:42,877 - mmdeploy - INFO - snpe: None
2023-02-14 14:04:42,878 - mmdeploy - INFO - openvino: None
2023-02-14 14:04:42,880 - mmdeploy - INFO - torchscript: 1.10.2
2023-02-14 14:04:42,880 - mmdeploy - INFO - torchscript custom ops: Available
2023-02-14 14:04:42,959 - mmdeploy - INFO - rknn-toolkit: None
2023-02-14 14:04:42,959 - mmdeploy - INFO - rknn2-toolkit: None
2023-02-14 14:04:42,960 - mmdeploy - INFO - ascend: None
2023-02-14 14:04:42,962 - mmdeploy - INFO - coreml: None
2023-02-14 14:04:42,963 - mmdeploy - INFO - tvm: None
2023-02-14 14:04:42,963 - mmdeploy - INFO -
2023-02-14 14:04:42,963 - mmdeploy - INFO - **********Codebase information**********
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmdet: 2.19.0
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmseg: None
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmcls: 0.19.0
2023-02-14 14:04:43,623 - mmdeploy - INFO - mmocr: 0.4.1
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmedit: None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmdet3d: None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmpose: None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmrotate: None
2023-02-14 14:04:43,624 - mmdeploy - INFO - mmaction: None
Hi, I have come back to this...
If I try to run the model conversion, it ends here:
File "/usr/local/lib/python3.8/site-packages/mmdeploy/codebase/mmdet/models/detectors/two_stage.py", line 58, in two_stage_detector__simple_test
proposals, _ = self.rpn_head.simple_test_rpn(x, img_metas)
ValueError: not enough values to unpack (expected 2, got 1)
2023-07-04 18:15:24,536 - mmdeploy - ERROR - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
How does it work for you? I see that mmdet is 2.19 while mmdeploy is 0.12-somerev. In mmdet 2.19 (mmdet/models/detectors/two_stage.py), simple_test_rpn is called as follows, which differs from mmdeploy's implementation:
proposal_list = self.rpn_head.simple_test_rpn(x, img_metas)
Do I have to build the mmdeploy library manually and link it against my mmdet? Isn't there a prebuilt mmdeploy for mmdet 2.25, or at least 2.19?
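The mismatch between the two call styles can be reproduced in isolation. This is a minimal sketch with stand-in functions only: `simple_test_rpn_list_style` is hypothetical and merely assumes that the RPN head returns a list with one entry per image, which is what the mmdet 2.19 call site quoted above suggests.

```python
def simple_test_rpn_list_style(x, img_metas):
    """Hypothetical stand-in: returns one proposal entry per image, as a list."""
    return [f"proposals_for_{meta['filename']}" for meta in img_metas]


def call_like_mmdet(x, img_metas):
    # The caller keeps the whole list, as in the mmdet 2.19 line quoted above.
    proposal_list = simple_test_rpn_list_style(x, img_metas)
    return proposal_list


def call_like_mmdeploy(x, img_metas):
    # The caller unpacks two values, as in the mmdeploy traceback above.
    proposals, _ = simple_test_rpn_list_style(x, img_metas)
    return proposals


img_metas = [{"filename": "demo.jpg"}]  # a single-image batch

print(call_like_mmdet(None, img_metas))
try:
    call_like_mmdeploy(None, img_metas)
except ValueError as err:
    print(err)  # not enough values to unpack (expected 2, got 1)
```

With a one-image batch the list has length one, so unpacking it into two names fails with the same message as in the traceback. That would be consistent with the installed mmdet and mmdeploy versions expecting different return conventions.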
Maybe you can try the prebuilt package v0.5.0 or v0.6.0. We have provided prebuilt packages since v0.5.0.
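Before switching packages, it may be worth listing which versions are actually installed side by side. The sketch below uses only the standard library; `installed_version` and `version_tuple` are hypothetical helpers, the distribution names in the loop are assumptions about how the packages were installed, and it does not encode any official compatibility table.

```python
from importlib import metadata
from itertools import takewhile


def installed_version(package):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Parse the leading numeric part, e.g. '0.12.0+5fdf003' -> (0, 12, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumed distribution names; a missing one is simply reported as None.
for package in ("mmdeploy", "mmdet", "mmcv-full", "mmcv"):
    version = installed_version(package)
    parsed = version_tuple(version) if version else None
    print(f"{package}: {version} {parsed}")
```

The parsed tuples can then be compared against whatever version range the mmdeploy release notes state for mmdet and mmcv.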
I have a similar problem. I want to convert my RTMPose model to ONNX, but the conversion fails with "ERROR - not enough values to unpack (expected 2, got 1)". This is the command:
!python tools/deploy.py \
configs/mmpose/pose-detection_onnxruntime-fp16_static.py \
ConvertFolder/rtmpose-s-Ear.py \
ConvertFolder/rtm_pose.pth \
ConvertFolder/DSC_5384.jpg \
--work-dir ConvertFolder/mmpose2onnx_rtmpose2 \
--dump-info
error:
05/14 21:03:05 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:05 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:07 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
05/14 21:03:09 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:09 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: ConvertFolder/rtm_pose.pth
05/14 21:03:10 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
05/14 21:03:10 - mmengine - INFO - Export PyTorch model to ONNX: ConvertFolder/mmpose2onnx_rtmpose2\end2end.onnx.
05/14 21:03:10 - mmengine - WARNING - Can not find torch._C._jit_pass_onnx_autograd_function_process, function rewrite will not be applied
05/14 21:03:10 - mmengine - WARNING - Can not find models.yolox_pose_head.YOLOXPoseHead.predict, function rewrite will not be applied
05/14 21:03:10 - mmengine - WARNING - Can not find models.yolox_pose_head.YOLOXPoseHead.predict_by_feat, function rewrite will not be applied
05/14 21:03:12 - mmengine - INFO - Execute onnx optimize passes.
05/14 21:03:12 - mmengine - WARNING - Can not optimize model, please build torchscipt extension.
More details: https://github.com/open-mmlab/mmdeploy/tree/main/docs/en/experimental/onnx_optimizer.md
05/14 21:03:12 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
05/14 21:03:13 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
05/14 21:03:13 - mmengine - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
05/14 21:03:13 - mmengine - INFO - visualize onnxruntime model start.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
05/14 21:03:16 - mmengine - INFO - Successfully loaded onnxruntime custom ops from d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\lib\mmdeploy_onnxruntime_ops.dll
Traceback (most recent call last):
File "d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\utils\utils.py", line 41, in target_wrapper
result = target(*args, **kwargs)
...
File "d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\mmdeploy\codebase\mmpose\deploy\pose_detection_model.py", line 108, in forward
batch_pred_x, batch_pred_y = batch_outputs
ValueError: not enough values to unpack (expected 2, got 1)
05/14 21:03:17 - mmengine - ERROR - tools/deploy.py - create_process - 82 - visualize onnxruntime model failed.
d:\Applications\Programming\miniconda3\envs\openmmlab\lib\site-packages\torch\autocast_mode.py:141: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
2024-05-14:21:03:17 - root - ERROR - not enough values to unpack (expected 2, got 1)
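The traceback shows the backend wrapper unpacking two values (batch_pred_x and batch_pred_y) while only one output arrived. One way to confirm this is to count the outputs of the exported model and to fail with a clearer message. This is only a sketch: `split_simcc_outputs` and `onnx_output_names` are hypothetical helpers, the names "simcc_x" and "simcc_y" are illustrative labels, and reading a real file assumes the `onnx` package is installed.

```python
def split_simcc_outputs(batch_outputs, expected_names=("simcc_x", "simcc_y")):
    """Unpack backend outputs, raising a descriptive error on a count mismatch."""
    if len(batch_outputs) != len(expected_names):
        raise ValueError(
            f"backend returned {len(batch_outputs)} output(s), "
            f"but {len(expected_names)} are needed: {list(expected_names)}")
    return tuple(batch_outputs)


def onnx_output_names(path):
    """List the graph output names of an ONNX file (requires the onnx package)."""
    import onnx  # imported lazily so the helper above works without it
    return [out.name for out in onnx.load(path).graph.output]


# Stand-in values instead of real tensors.
pred_x, pred_y = split_simcc_outputs(["x_logits", "y_logits"])
print(pred_x, pred_y)

try:
    split_simcc_outputs(["only_one_output"])
except ValueError as err:
    print(err)
```

If `onnx_output_names` on the exported end2end.onnx lists two outputs while the wrapper still receives one, the deploy config's output names would be the next thing to compare against the model.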
Checklist
Describe the bug
I am trying to convert a CascadeRCNN to TensorRT via partitioning (I want to extract the FPN output embeddings at predict time). I am using the deploy.py script. Both partitions are converted to ONNX, but when converting the second partition to TensorRT, I encounter this error:
The partitioning config:
Reproduction
Environment