open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Conflict while using mmdeploy.apis.utils.build_task_processor to build rtmdet_processor and rtmpose_processor simultaneously in one script #2495

Open Geniukx opened 11 months ago

Geniukx commented 11 months ago

Describe the bug

I am writing a script for a two-stage hand pose estimation task that uses rtmdet for detection and rtmpose for pose estimation. When I initialize both rtmdet_processor and rtmpose_processor in the same script, an error occurs at rtmdet_processor.create_input(frame, rtmdet.input_shape). The error does not appear if I comment out the code that initializes rtmpose_processor. Each stage also works correctly when run on its own, so I suspect a conflict between the two processors, possibly inside the initialization function build_task_processor.

Reproduction

import cv2
import torch
from mmdeploy.apis.utils import build_task_processor
from mmdeploy.utils import get_input_shape, load_config
from utils.utils import Config  # the reporter's own helper class (not shown)

# Placeholder: class index of "hand" in the detector's label set (assumption).
HAND_LABEL = 0

rtmdet = Config(
    deploy_config_path=r'path_to_deploy_config',
    model_config_path=r'path_to_model_config',
    onnx=[r'path_to_onnx_file'],
)

rtmpose = Config(
    deploy_config_path=r'path_to_deploy_config',
    model_config_path=r'path_to_model_config',
    onnx=[r'path_to_onnx_file'],
)

# Stage 1: build the detection processor and its ONNX Runtime backend model.
rtmdet_deploy_cfg, rtmdet_model_cfg = load_config(rtmdet.deploy_config_path, rtmdet.model_config_path)
rtmdet_processor = build_task_processor(rtmdet_model_cfg, rtmdet_deploy_cfg, 'cuda')
rtmdet_model = rtmdet_processor.build_backend_model(rtmdet.onnx)
rtmdet.update(
    input_shape=get_input_shape(rtmdet_deploy_cfg)
)

# Stage 2: build the pose processor. Commenting this block out avoids the error.
rtmpose_deploy_cfg, rtmpose_model_cfg = load_config(rtmpose.deploy_config_path, rtmpose.model_config_path)
rtmpose_processor = build_task_processor(rtmpose_model_cfg, rtmpose_deploy_cfg, 'cuda')
rtmpose_model = rtmpose_processor.build_backend_model(rtmpose.onnx)
rtmpose.update(
    input_shape=get_input_shape(rtmpose_deploy_cfg)
)

# Load the image as an array so that it can be cropped below.
frame = cv2.imread(r'path_to_test_image')

# The KeyError is raised here when both processors have been built.
model_inputs, _ = rtmdet_processor.create_input(frame, rtmdet.input_shape)

with torch.no_grad():
    det_result = rtmdet_model.test_step(model_inputs)

# Keep only confident detections and move them to the CPU for NumPy.
pred = det_result[0].pred_instances
keep = pred.scores > 0.8
bboxes = pred.bboxes[keep].cpu().numpy().astype(int)
labels = pred.labels[keep].cpu().numpy()

for bbox, label in zip(bboxes, labels):
    if label == HAND_LABEL:
        # Crop the detected hand (bbox is x1, y1, x2, y2) and estimate its pose.
        roi = frame[bbox[1]:bbox[3], bbox[0]:bbox[2]]
        model_inputs, _ = rtmpose_processor.create_input(roi, rtmpose.input_shape)
        with torch.no_grad():
            est_result = rtmpose_model.test_step(model_inputs)

Environment

10/16 19:26:17 - mmengine - INFO - **********Environmental information**********
10/16 19:26:21 - mmengine - INFO - sys.platform: win32
10/16 19:26:21 - mmengine - INFO - Python: 3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)]
10/16 19:26:21 - mmengine - INFO - CUDA available: True
10/16 19:26:21 - mmengine - INFO - numpy_random_seed: 2147483648
10/16 19:26:21 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 2070
10/16 19:26:21 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7
10/16 19:26:21 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.99
10/16 19:26:21 - mmengine - INFO - GCC: n/a
10/16 19:26:21 - mmengine - INFO - PyTorch: 2.0.1
10/16 19:26:21 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 193431937
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.5
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/cb/pytorch_1000000000000/work/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

10/16 19:26:21 - mmengine - INFO - TorchVision: 0.15.2
10/16 19:26:21 - mmengine - INFO - OpenCV: 4.8.1
10/16 19:26:21 - mmengine - INFO - MMEngine: 0.8.5
10/16 19:26:21 - mmengine - INFO - MMCV: 2.0.1
10/16 19:26:21 - mmengine - INFO - MMCV Compiler: MSVC 192930148
10/16 19:26:21 - mmengine - INFO - MMCV CUDA Compiler: 11.7
10/16 19:26:21 - mmengine - INFO - MMDeploy: 1.3.0+c4dc10d
10/16 19:26:21 - mmengine - INFO -

10/16 19:26:21 - mmengine - INFO - **********Backend information**********
10/16 19:26:22 - mmengine - INFO - tensorrt:    None
10/16 19:26:22 - mmengine - INFO - ONNXRuntime: None
10/16 19:26:22 - mmengine - INFO - ONNXRuntime-gpu:     1.16.1
10/16 19:26:22 - mmengine - INFO - ONNXRuntime custom ops:      Available
10/16 19:26:22 - mmengine - INFO - pplnn:       None
10/16 19:26:22 - mmengine - INFO - ncnn:        None
10/16 19:26:22 - mmengine - INFO - snpe:        None
10/16 19:26:22 - mmengine - INFO - openvino:    None
10/16 19:26:22 - mmengine - INFO - torchscript: 2.0.1
10/16 19:26:22 - mmengine - INFO - torchscript custom ops:      NotAvailable
10/16 19:26:22 - mmengine - INFO - rknn-toolkit:        None
10/16 19:26:22 - mmengine - INFO - rknn-toolkit2:       None
10/16 19:26:22 - mmengine - INFO - ascend:      None
10/16 19:26:22 - mmengine - INFO - coreml:      None
10/16 19:26:22 - mmengine - INFO - tvm: None
10/16 19:26:22 - mmengine - INFO - vacc:        None
10/16 19:26:22 - mmengine - INFO -

10/16 19:26:22 - mmengine - INFO - **********Codebase information**********
10/16 19:26:22 - mmengine - INFO - mmdet:       3.1.0
10/16 19:26:22 - mmengine - INFO - mmseg:       None
10/16 19:26:22 - mmengine - INFO - mmpretrain:  1.0.2
10/16 19:26:22 - mmengine - INFO - mmocr:       None
10/16 19:26:22 - mmengine - INFO - mmagic:      None
10/16 19:26:22 - mmengine - INFO - mmdet3d:     None
10/16 19:26:22 - mmengine - INFO - mmpose:      1.1.0
10/16 19:26:22 - mmengine - INFO - mmrotate:    None
10/16 19:26:22 - mmengine - INFO - mmaction:    None
10/16 19:26:22 - mmengine - INFO - mmrazor:     None
10/16 19:26:22 - mmengine - INFO - mmyolo:      None

Error traceback

10/16 19:27:12 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
10/16 19:27:12 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "mmdet_tasks" registry tree. As a workaround, the current "mmdet_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
10/16 19:27:12 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "backend_detectors" registry tree. As a workaround, the current "backend_detectors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
10/16 19:27:12 - mmengine - INFO - Successfully loaded onnxruntime custom ops from D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmdeploy\lib\mmdeploy_onnxruntime_ops.dll
2023-10-16 19:27:13.0775910 [W:onnxruntime:, session_state.cc:1162 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-10-16 19:27:13.0862843 [W:onnxruntime:, session_state.cc:1164 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
10/16 19:27:15 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/16 19:27:15 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "mmpose_tasks" registry tree. As a workaround, the current "mmpose_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/16 19:27:15 - mmengine - WARNING - Failed to search registry with scope "mmpose" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmpose" is a correct scope, or whether the registry is initialized.
10/16 19:27:15 - mmengine - INFO - Successfully loaded onnxruntime custom ops from D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmdeploy\lib\mmdeploy_onnxruntime_ops.dll
2023-10-16 19:27:16.0366841 [W:onnxruntime:, session_state.cc:1162 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-10-16 19:27:16.0449353 [W:onnxruntime:, session_state.cc:1164 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Traceback (most recent call last):
  File "e:/Internship/hand_pose_estimation/codes/infer.py", line 33, in <module>
    model_inputs, _ = rtmdet_processor.create_input(frame, rtmdet.input_shape)
  File "D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmdeploy\codebase\mmdet\deploy\object_detection.py", line 206, in create_input
    test_pipeline = Compose(pipeline)
  File "D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmcv\transforms\wrappers.py", line 66, in __init__
    transform = TRANSFORMS.build(transform)
  File "D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmengine\registry\registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "D:\Development\MiniConda\envs\mmlab\lib\site-packages\mmengine\registry\build_functions.py", line 100, in build_from_cfg
    raise KeyError(
KeyError: 'PackDetInputs is not in the mmpose::transform registry. Please check whether the value of `PackDetInputs` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'
kdmxen commented 1 month ago

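I ran into a similar problem. I have two models, mmpose and mmseg, that use different default_scope values. Loading mmpose works fine, but when loading mmseg, PackSegInputs cannot be found. All I can say is that this design is really brain-dead...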
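
A possible workaround, assuming the cause is that the default scope stays at whichever codebase was initialized last: run each processor call inside mmengine's `DefaultScope.overwrite_default_scope` context manager, so that the pipeline is built against the intended registry. `run_in_scope` below is a hypothetical helper, not part of mmdeploy, and this has not been verified against MMDeploy 1.3.0:

```python
def run_in_scope(scope_name, fn, *args, **kwargs):
    """Call fn(*args, **kwargs) with mmengine's default scope set to scope_name.

    run_in_scope is a hypothetical helper name, not an mmdeploy or mmengine API.
    """
    # Imported lazily so the helper can be defined before mmengine is loaded.
    from mmengine.registry import DefaultScope

    # The context manager restores the previous default scope on exit.
    with DefaultScope.overwrite_default_scope(scope_name):
        return fn(*args, **kwargs)
```

For the reproduction script in this issue, that would mean calling run_in_scope('mmdet', rtmdet_processor.create_input, frame, rtmdet.input_shape) for the detector and passing 'mmpose' as the scope for the pose-estimation calls. Other calls that build modules from configs may need the same treatment.
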