open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Error when converting rtmpose model to onnx #2309

Closed: mozhijing closed this issue 1 year ago

mozhijing commented 1 year ago

Describe the bug

RuntimeError: Given groups=1, weight of size [17, 1280, 7, 7], expected input[1, 1310, 12, 9] to have 1280 channels, but got 1310 channels instead
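For readers unfamiliar with this error, here is a minimal sketch of the rule it reports. It is plain Python and does not call PyTorch. The helper name `check_conv2d_channels` is hypothetical, and the shapes are copied from the message above. A 2D convolution weight has shape `[out_channels, in_channels / groups, kH, kW]`, so the input's channel dimension must equal `weight_shape[1] * groups`.

```python
def check_conv2d_channels(weight_shape, input_shape, groups=1):
    """Raise ValueError if an NCHW input cannot be convolved with the weight.

    weight_shape: (out_channels, in_channels // groups, kH, kW)
    input_shape:  (batch, channels, height, width)
    Hypothetical helper for illustration; it only mirrors the channel rule.
    """
    expected = weight_shape[1] * groups
    actual = input_shape[1]
    if actual != expected:
        raise ValueError(
            f"expected input{list(input_shape)} to have {expected} channels, "
            f"but got {actual} channels instead"
        )
    return expected


# Shapes copied from the error message in this issue.
weight_shape = (17, 1280, 7, 7)
input_shape = (1, 1310, 12, 9)

try:
    check_conv2d_channels(weight_shape, input_shape)
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
```

In other words, the head's final convolution was built for 1280 input channels, while the tensor reaching it during tracing has 1310.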

Reproduction

python tools/deploy.py configs/mmpose/pose-detection_simcc_onnxruntime_dynamic.py rtmpose-x_8xb256-700e_coco-384x288.py rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.pth demo/resources/human-pose.jpg --work-dir mmdeploy_model/rtm-x --device cpu --dump-info

Environment

E:\mmdeploy-main>python tools/check_env.py
07/28 11:42:15 - mmengine - INFO -

07/28 11:42:15 - mmengine - INFO - **********Environmental information**********
07/28 11:42:25 - mmengine - INFO - sys.platform: win32
07/28 11:42:25 - mmengine - INFO - Python: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
07/28 11:42:25 - mmengine - INFO - CUDA available: False
07/28 11:42:25 - mmengine - INFO - numpy_random_seed: 2147483648
07/28 11:42:25 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.36.32537 for x64
07/28 11:42:25 - mmengine - INFO - GCC: n/a
07/28 11:42:25 - mmengine - INFO - PyTorch: 1.13.1+cpu
07/28 11:42:25 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=0, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

07/28 11:42:25 - mmengine - INFO - TorchVision: 0.14.1+cpu
07/28 11:42:25 - mmengine - INFO - OpenCV: 4.7.0
07/28 11:42:25 - mmengine - INFO - MMEngine: 0.7.3
07/28 11:42:25 - mmengine - INFO - MMCV: 2.0.1
07/28 11:42:25 - mmengine - INFO - MMCV Compiler: MSVC 192930148
07/28 11:42:25 - mmengine - INFO - MMCV CUDA Compiler: not available
07/28 11:42:25 - mmengine - INFO - MMDeploy: 1.2.0+
07/28 11:42:25 - mmengine - INFO -

07/28 11:42:25 - mmengine - INFO - **********Backend information**********
07/28 11:42:25 - mmengine - INFO - tensorrt:    None
07/28 11:42:25 - mmengine - INFO - ONNXRuntime: 1.14.1
07/28 11:42:25 - mmengine - INFO - ONNXRuntime-gpu:     None
07/28 11:42:25 - mmengine - INFO - ONNXRuntime custom ops:      NotAvailable
07/28 11:42:25 - mmengine - INFO - pplnn:       None
07/28 11:42:25 - mmengine - INFO - ncnn:        None
07/28 11:42:25 - mmengine - INFO - snpe:        None
07/28 11:42:25 - mmengine - INFO - openvino:    None
07/28 11:42:25 - mmengine - INFO - torchscript: 1.13.1
07/28 11:42:25 - mmengine - INFO - torchscript custom ops:      NotAvailable
07/28 11:42:26 - mmengine - INFO - rknn-toolkit:        None
07/28 11:42:26 - mmengine - INFO - rknn-toolkit2:       None
07/28 11:42:26 - mmengine - INFO - ascend:      None
07/28 11:42:26 - mmengine - INFO - coreml:      None
07/28 11:42:26 - mmengine - INFO - tvm: None
07/28 11:42:26 - mmengine - INFO - vacc:        None
07/28 11:42:26 - mmengine - INFO -

07/28 11:42:26 - mmengine - INFO - **********Codebase information**********
07/28 11:42:26 - mmengine - INFO - mmdet:       3.0.0
07/28 11:42:26 - mmengine - INFO - mmseg:       None
07/28 11:42:26 - mmengine - INFO - mmpretrain:  None
07/28 11:42:26 - mmengine - INFO - mmocr:       None
07/28 11:42:26 - mmengine - INFO - mmagic:      None
07/28 11:42:26 - mmengine - INFO - mmdet3d:     None
07/28 11:42:26 - mmengine - INFO - mmpose:      1.0.0
07/28 11:42:26 - mmengine - INFO - mmrotate:    None
07/28 11:42:26 - mmengine - INFO - mmaction:    None
07/28 11:42:26 - mmengine - INFO - mmrazor:     None
07/28 11:42:26 - mmengine - INFO - mmyolo:      None

Error traceback

Process Process-2:
Traceback (most recent call last):
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\pytorch2onnx.py", line 111, in torch2onnx
    optimize=optimize)
  File "e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\onnx\export.py", line 141, in export
    verbose=verbose)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 519, in export
    export_modules_as_functions=export_modules_as_functions,
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 1539, in _export
    dynamic_axes=dynamic_axes,
  File "e:\mmdeploy-main\mmdeploy\apis\onnx\optimizer.py", line 27, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 1111, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 987, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\onnx\utils.py", line 896, in _trace_and_get_graph_from_model
    _return_inputs_states=True,
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_trace.py", line 1184, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_trace.py", line 132, in forward
    self._force_outplace,
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\jit\_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\apis\onnx\export.py", line 123, in wrapper
    return forward(*arg, **kwargs)
  File "e:\mmdeploy-main\mmdeploy\codebase\mmpose\models\pose_estimators\base.py", line 21, in base_pose_estimator__forward
    return self._forward(inputs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\mmpose\models\pose_estimators\base.py", line 172, in _forward
    x = self.head.forward(x)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\mmpose\models\heads\coord_cls_heads\rtmcc_head.py", line 149, in forward
    feats = self.final_layer(feats)  # -> B, K, H, W
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 1182, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\mzj\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 460, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [17, 1280, 7, 7], expected input[1, 1310, 12, 9] to have 1280 channels, but got 1310 channels instead
07/28 11:34:53 - mmengine - ERROR - e:\mmdeploy-main\mmdeploy\apis\core\pipeline_manager.py - pop_mp_output - 81 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

RunningLeon commented 1 year ago

Hi, please first double-check that your model config and ckpt match.
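One way to check this is to compare parameter shapes between the model built from the config and the checkpoint. The sketch below is a minimal example and makes several assumptions. The helper `find_shape_mismatches` is hypothetical, and the two shape dictionaries are made-up values, not read from the real files. In practice the shapes could be collected with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}` on the model side. On the checkpoint side the same could be applied to the dictionary returned by `torch.load(path, map_location="cpu")`, where the weights are often stored under a `state_dict` key.

```python
def find_shape_mismatches(model_shapes, ckpt_shapes):
    """Compare two {parameter_name: shape} mappings.

    Returns (missing, unexpected, mismatched):
      missing    - names the model expects but the checkpoint lacks
      unexpected - names that only the checkpoint has
      mismatched - {name: (model_shape, ckpt_shape)} where shapes differ
    """
    missing = sorted(set(model_shapes) - set(ckpt_shapes))
    unexpected = sorted(set(ckpt_shapes) - set(model_shapes))
    mismatched = {
        name: (tuple(model_shapes[name]), tuple(ckpt_shapes[name]))
        for name in sorted(set(model_shapes) & set(ckpt_shapes))
        if tuple(model_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, unexpected, mismatched


# Hypothetical names and shapes, for illustration only.
model_shapes = {
    "head.final_layer.weight": (17, 1280, 7, 7),
    "head.final_layer.bias": (17,),
}
ckpt_shapes = {
    "head.final_layer.weight": (17, 1024, 7, 7),
    "head.final_layer.bias": (17,),
}

missing, unexpected, mismatched = find_shape_mismatches(model_shapes, ckpt_shapes)
for name, (want, got) in mismatched.items():
    print(f"{name}: model expects {want}, checkpoint has {got}")
```

Any entry in `missing`, `unexpected`, or `mismatched` suggests the checkpoint was trained with a different config.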

mozhijing commented 1 year ago

The model I'm using is from the rtmpose documentation: https://github.com/open-mmlab/mmpose/tree/main/projects/rtmpose#-model-zoo-
model config: https://github.com/open-mmlab/mmpose/blob/main/projects/rtmpose/rtmpose/body_2d_keypoint/rtmpose-x_8xb256-700e_coco-384x288.py
model ckpt: https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.pth

RunningLeon commented 1 year ago

@mozhijing Could you try other models and configs? This ckpt does not match the model config. @Tau-J Hi, could you double-check it?

model config: https://github.com/open-mmlab/mmpose/blob/main/projects/rtmpose/rtmpose/body_2d_keypoint/rtmpose-x_8xb256-700e_coco-384x288.py
model ckpt: https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.pth
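As a quick first look before loading any weights, the two file names can be compared tag by tag. This is only a naming heuristic, sketched under the assumption that the tags are separated by `-` and `_`. The helper `name_tokens` is hypothetical and not part of mmdeploy or mmpose. Differing tags do not prove a mismatch; they only point at what to inspect.

```python
import re
from pathlib import PurePosixPath


def name_tokens(filename):
    """Split a file name (without its extension) into tags on '-' and '_'."""
    stem = PurePosixPath(filename).stem
    return set(re.split(r"[-_]", stem))


# File names taken from the links in this thread.
config_name = "rtmpose-x_8xb256-700e_coco-384x288.py"
ckpt_name = "rtmpose-x_simcc-body7_pt-body7_700e-384x288-71d7b7e9_20230629.pth"

config_tokens = name_tokens(config_name)
ckpt_tokens = name_tokens(ckpt_name)

shared = sorted(config_tokens & ckpt_tokens)
config_only = sorted(config_tokens - ckpt_tokens)
ckpt_only = sorted(ckpt_tokens - config_tokens)

print("shared:", shared)
print("only in config:", config_only)
print("only in ckpt:", ckpt_only)
```

Here the model size (`x`) and input resolution (`384x288`) appear in both names, while the dataset tags differ (`coco` in the config, `body7` in the checkpoint).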

Tau-J commented 1 year ago

@RunningLeon @mozhijing Thanks for your feedback. This was a typo. I'll fix it in https://github.com/open-mmlab/mmpose/pull/2585

github-actions[bot] commented 1 year ago

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.