open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.
https://mmpose.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Visualize torchscript model failed while deploying RTMPose model to torchscript #2506

Closed andyroro closed 1 year ago

andyroro commented 1 year ago

Prerequisite

Environment

Result: python -c "from mmpose.utils import collect_env; print(collect_env())"

('sys.platform', 'linux'),
('Python', '3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]'),
('CUDA available', True),
('numpy_random_seed', 2147483648),
('GPU 0', 'NVIDIA GeForce RTX 4090'),
('CUDA_HOME', '/usr/local/cuda-12.0'),
('NVCC', 'Cuda compilation tools, release 12.0, V12.0.140'),
('GCC', 'x86_64-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0'),
('PyTorch', '2.0.1+cu117'),
('PyTorch compiling details', 'PyTorch built with:\n - GCC 9.3\n - C++ Version: 201703\n - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications\n - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)\n - OpenMP 201511 (a.k.a. OpenMP 4.5)\n - LAPACK is enabled (usually provided by MKL)\n - NNPACK is enabled\n - CPU capability usage: AVX2\n - CUDA Runtime 11.7\n - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86\n - CuDNN 8.8 (built against CUDA 12.0)\n - Built with CuDNN 8.5\n - Magma 2.6.1\n - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, \n'),
('TorchVision', '0.15.2+cu117'),
('OpenCV', '4.7.0'),
('MMEngine', '0.7.4'),
('MMPose', '1.0.0+')

Result: pip list | grep mm

mmcv 2.0.0 /home/andyc/00_lib/src/mmcv
mmdeploy 1.2.0 /home/andyc/00_lib/src/mmdeploy
mmdet 3.0.0 /home/andyc/00_lib/src/mmdetection
mmengine 0.7.4 /home/andyc/00_lib/src/mmengine
mmpose 1.0.0 /home/andyc/00_lib/src/mmpose

Reproduces the problem - code sample

Nope

Reproduces the problem - command or script

rtmpose det torchscript

python3 ./mmdeploy/tools/deploy.py \
./mmdeploy/configs/mmdet/detection/detection_torchscript.py \
./mmpose/projects/rtmpose/rtmdet/person/rtmdet_nano_320-8xb32_coco-person.py \
https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_nano_8xb32-100e_coco-obj365-person-05d8511e.pth \
./mmdeploy/demo/resources/human-pose.jpg \
--work-dir ~/03_model/rtmpose-det \
--device cuda \
--show \
--dump-info

rtmpose pose torchscript

python3 ./mmdeploy/tools/deploy.py \
./mmdeploy/configs/mmpose/pose-detection_torchscript.py \
./mmpose/projects/rtmpose/rtmpose/body_2d_keypoint/rtmpose-m_8xb256-420e_coco-256x192.py \
https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-m_simcc-aic-coco_pt-aic-coco_420e-256x192-63eb25f7_20230126.pth \
./mmdeploy/demo/resources/human-pose.jpg \
--work-dir ~/03_model/rtmpose-pose \
--device cuda:0 \
--show \
--dump-info

Reproduces the problem - error message

...
07/04 00:18:27 - mmengine - INFO - Export PyTorch model to torchscript.
07/04 00:18:29 - mmengine - INFO - Save PyTorch model: /home/andyc/03_model/rtmpose-pose/end2end.pt.
07/04 00:18:29 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2torchscript.torch2torchscript
07/04 00:18:30 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
07/04 00:18:30 - mmengine - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
07/04 00:18:30 - mmengine - INFO - visualize torchscript model start.
07/04 00:18:31 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
07/04 00:18:31 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
07/04 00:18:31 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
/home/andyc/00_lib/src/mmpose/mmpose/datasets/datasets/utils.py:102: UserWarning: The metainfo config file "configs/_base_/datasets/coco.py" does not exist. A matched config file "/home/andyc/00_lib/src/mmpose/mmpose/.mim/configs/_base_/datasets/coco.py" will be used instead.
  warnings.warn(
2023-07-04:00:18:34 - root - ERROR - not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "/home/andyc/00_lib/src/mmdeploy/mmdeploy/utils/utils.py", line 41, in target_wrapper
    result = target(*args, **kwargs)
  File "/home/andyc/00_lib/src/mmdeploy/mmdeploy/apis/visualize.py", line 72, in visualize_model
    result = model.test_step(model_inputs)[0]
  File "/home/andyc/00_lib/src/mmengine/mmengine/model/base_model/base_model.py", line 145, in test_step
    return self._run_forward(data, mode='predict')  # type: ignore
  File "/home/andyc/00_lib/src/mmengine/mmengine/model/base_model/base_model.py", line 340, in _run_forward
    results = self(data, mode=mode)
  File "/home/andyc/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/andyc/00_lib/src/mmdeploy/mmdeploy/codebase/mmpose/deploy/pose_detection_model.py", line 104, in forward
    batch_pred_x, batch_pred_y = batch_outputs
ValueError: not enough values to unpack (expected 2, got 1)
07/04 00:18:35 - mmengine - ERROR - ./mmdeploy/tools/deploy.py - create_process - 82 - visualize torchscript model failed.
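
The traceback ends at a two-way tuple unpack, which suggests the backend handed back a single output where the SimCC head produces two (x-axis and y-axis predictions). Below is a minimal sketch of one way that can happen, assuming the backend wrapper pairs model outputs with the output names declared in the deploy config. `pair_outputs` and the name lists are hypothetical stand-ins for illustration, not mmdeploy code.

```python
def pair_outputs(output_names, outputs):
    """Pair declared output names with model outputs, as zip() does.

    zip() stops at the shorter sequence, so any output that has no
    declared name is silently dropped.
    """
    return dict(zip(output_names, outputs))


# Stand-ins for the two SimCC outputs (x-axis and y-axis predictions).
simcc_outputs = ["pred_x", "pred_y"]

# Only one declared name: the second output is lost.
single = list(pair_outputs(["output"], simcc_outputs).values())
try:
    batch_pred_x, batch_pred_y = single
except ValueError as err:
    # Same kind of failure as in the traceback above.
    print(err)

# Two declared names: both outputs survive and the unpack succeeds.
both = list(pair_outputs(["simcc_x", "simcc_y"], simcc_outputs).values())
batch_pred_x, batch_pred_y = both
```

If this is what happens, the exported model itself may be fine and only the declared output names need to match the model.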

Additional information

While deploying the RTMPose model to TorchScript with mmdeploy, an error occurs after the model has been saved successfully. RTMDet works fine; only RTMPose fails.
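
Since the export step finishes and only the visualization step fails, it may help to check how many outputs the saved model actually returns. Here is a rough sketch, assuming PyTorch is installed, that the model accepts a single NCHW tensor, and that the input is 256x192 as in the RTMPose config name. `count_outputs` and `inspect_torchscript` are hypothetical helper names, and loading a model traced on CUDA onto the CPU may not work for every model.

```python
def count_outputs(result):
    """Return how many top-level outputs a model call produced."""
    if isinstance(result, (tuple, list)):
        return len(result)
    return 1


def inspect_torchscript(path, input_shape=(1, 3, 256, 192)):
    """Load a TorchScript file and report its number of outputs.

    PyTorch is imported lazily so count_outputs stays usable without it.
    The default NCHW input shape is an assumption, not read from the model.
    """
    import torch

    model = torch.jit.load(path, map_location="cpu")
    model.eval()
    with torch.no_grad():
        result = model(torch.rand(*input_shape))
    return count_outputs(result)
```

For example, `inspect_torchscript("/home/andyc/03_model/rtmpose-pose/end2end.pt")` uses the path from the log above. A result of 2 would point at the deploy config rather than the export.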

Tau-J commented 1 year ago

@andyroro Sorry for the late reply. I think the deploy config needs to be modified to export RTMPose to TorchScript, because pose-detection_torchscript.py is not written for SimCC-based models. You could open an issue in mmdeploy asking for an RTMPose config, or modify the config yourself.