open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

How to convert a Segmenter model? #2760

Open · crmauceri opened this issue 4 months ago

crmauceri commented 4 months ago

Describe the bug

I am trying to convert a Segmenter model to TensorRT. I am following the get_started tutorial (https://github.com/open-mmlab/mmdeploy/blob/main/docs/en/get_started.md) and have already successfully converted the Faster R-CNN model as shown there. The same process fails when I try to convert the Segmenter model.
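For context, the Faster R-CNN conversion that worked was along the lines of the tutorial command below (paraphrased from my memory of get_started.md, so the exact config and checkpoint names may differ):

python mmdeploy/tools/deploy.py \
  mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
  mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py \
  checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
  mmdetection/demo/demo.jpg \
  --work-dir mmdeploy_model/faster-rcnn --device cuda --dump-info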

The deploy script emits a few warnings but finishes with "All process success."

When I try to run the converted model using "Inference by Model Converter", I get an error:

[05/07/2024-15:00:40] [TRT] [E] 3: [executionContext.cpp::getBindingDimensions::973] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::getBindingDimensions::973, condition: bindingIndex >= 0 && bindingIndex < mEngine.getNbBindings()
)
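To sanity-check the engine itself, here is a minimal sketch of how I would list the bindings it actually exposes, assuming the standard tensorrt Python API (the engine and plugin-library paths are from my setup):

import ctypes
import tensorrt as trt

# Load mmdeploy's custom TensorRT ops first; the engine may not
# deserialize without them (path copied from the log further down).
ctypes.CDLL('/home/cmauceri/miniconda3/envs/mmdeploy/lib/python3.8/'
            'site-packages/mmdeploy/lib/libmmdeploy_tensorrt_ops.so')

logger = trt.Logger(trt.Logger.WARNING)
with open('mmdeploy_model/segmenter/end2end.engine', 'rb') as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())

# Print every binding: index, direction, name, and shape.
for i in range(engine.num_bindings):
    kind = 'input' if engine.binding_is_input(i) else 'output'
    print(i, kind, engine.get_binding_name(i), engine.get_binding_shape(i))

If the output binding is not literally named "output", that would be consistent with the bindingIndex error; the deploy.json / pipeline.json files written by --dump-info should also record the names the wrapper expects.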

Reproduction

To convert the Segmenter model, I downloaded the model and config from the mmsegmentation repo.

I used one of the static deploy configs as suggested on the supported models page.
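For reference, my understanding is that this static config pins the input shape to 1x3x512x512, roughly along these lines (a sketch from memory of the mmdeploy config layout, not a verbatim copy of the file):

_base_ = ['./segmentation_static.py', '../_base_/backends/tensorrt.py']

onnx_config = dict(input_shape=[512, 512])
backend_config = dict(
    common_config=dict(max_workspace_size=1 << 30),
    model_inputs=[
        dict(
            input_shapes=dict(
                input=dict(
                    min_shape=[1, 3, 512, 512],
                    opt_shape=[1, 3, 512, 512],
                    max_shape=[1, 3, 512, 512])))
    ])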

python mmdeploy/tools/deploy.py mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x512.py ../mmsegmentation/configs/segmenter/segmenter_vit-t_mask_8xb1-160k_ade20k-512x512.py checkpoints/segmenter_vit-t_mask_8x1_512x512_160k_ade20k_20220105_151706-ffcf7509.pth  mmdetection/demo/demo.jpg --work-dir mmdeploy_model/segmenter --device cuda --dump-info

I've attached the full output. There are a few warnings, but the deploy script finishes with:

Loads checkpoint by local backend from path: checkpoints/segmenter_vit-t_mask_8x1_512x512_160k_ade20k_20220105_151706-ffcf7509.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.patch_embed.projection.bias

05/07 14:51:10 - mmengine - INFO - visualize pytorch model success.
05/07 14:51:10 - mmengine - INFO - All process success.

I wasn't sure whether the state-dict mismatch would be a problem, but since the last message is "All process success", I decided to try the next step. In Python, I run:

from mmdeploy.apis import inference_model
result = inference_model(
  model_cfg='mmsegmentation/configs/segmenter/segmenter_vit-t_mask_8xb1-160k_ade20k-512x512.py',
  deploy_cfg='mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x512.py',
  backend_files=['mmdeploy_model/faster-rcnn/end2end.engine'],
  img='mmdetection/demo/demo.jpg',
  device='cuda:0')

The output is:

  05/07 15:00:36 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
05/07 15:00:36 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "mmseg_tasks" registry tree. As a workaround, the current "mmseg_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
05/07 15:00:36 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "backend_segmentors" registry tree. As a workaround, the current "backend_segmentors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
05/07 15:00:36 - mmengine - INFO - Successfully loaded tensorrt plugins from /home/cmauceri/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
05/07 15:00:36 - mmengine - INFO - Successfully loaded tensorrt plugins from /home/cmauceri/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmdeploy/lib/libmmdeploy_tensorrt_ops.so
[05/07/2024-15:00:37] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.6.5 but loaded cuBLAS/cuBLAS LT 111.0.3
[05/07/2024-15:00:37] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.6.5 but loaded cuBLAS/cuBLAS LT 111.0.3
[05/07/2024-15:00:40] [TRT] [E] 3: Cannot find binding of given name: output
[05/07/2024-15:00:40] [TRT] [E] 1: Unexpected exception vector::_M_range_check: __n (which is 18446744073709551615) >= this->size() (which is 3)
[05/07/2024-15:00:40] [TRT] [E] 3: Get binding data type failed.
[05/07/2024-15:00:40] [TRT] [E] 3: [executionContext.cpp::getBindingDimensions::973] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::getBindingDimensions::973, condition: bindingIndex >= 0 && bindingIndex < mEngine.getNbBindings()
)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 2
      1 from mmdeploy.apis import inference_model
----> 2 result = inference_model(
      3   model_cfg='mmsegmentation/configs/segmenter/segmenter_vit-t_mask_8xb1-160k_ade20k-512x512.py',
      4   deploy_cfg='mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x512.py',
      5   backend_files=['mmdeploy_model/faster-rcnn/end2end.engine'],
      6   img='mmdetection/demo/demo.jpg',
      7   device='cuda:0')

File ~/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmdeploy/apis/inference.py:52, in inference_model(model_cfg, deploy_cfg, backend_files, img, device)
     49 model_inputs, _ = task_processor.create_input(img, input_shape)
     51 with torch.no_grad():
---> 52     result = model.test_step(model_inputs)
     54 return result

File ~/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py:145, in BaseModel.test_step(self, data)
    136 """``BaseModel`` implements ``test_step`` the same as ``val_step``.
    137 
    138 Args:
   (...)
    142     list: The predictions of given data.
    143 """
    144 data = self.data_preprocessor(data, False)
--> 145 return self._run_forward(data, mode='predict')

File ~/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py:361, in BaseModel._run_forward(self, data, mode)
    351 """Unpacks data for :meth:`forward`
    352 
    353 Args:
   (...)
    358     dict or list: Results of training or testing mode.
    359 """
    360 if isinstance(data, dict):
--> 361     results = self(**data, mode=mode)
    362 elif isinstance(data, (list, tuple)):
    363     results = self(*data, mode=mode)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmdeploy/codebase/mmseg/deploy/segmentation_model.py:86, in End2EndModel.forward(self, inputs, data_samples, mode, **kwargs)
     83     get_root_logger().warning(f'expect input device {self.device}'
     84                               f' but get {inputs.device}.')
     85 inputs = inputs.to(self.device)
---> 86 batch_outputs = self.wrapper({self.input_name:
     87                               inputs})[self.output_names[0]]
     88 return self.pack_result(batch_outputs, data_samples)

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1102, in Module._call_impl(self, *input, **kwargs)
   1098 # If we don't have any hooks, we want to skip the rest of the logic in
   1099 # this function, and just call forward.
   1100 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102     return forward_call(*input, **kwargs)
   1103 # Do not call functions when jit is used
   1104 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmdeploy/backend/tensorrt/wrapper.py:167, in TRTWrapper.forward(self, inputs)
    165 idx = self.engine.get_binding_index(output_name)
    166 dtype = torch_dtype_from_trt(self.engine.get_binding_dtype(idx))
--> 167 shape = tuple(self.context.get_binding_shape(idx))
    169 device = torch_device_from_trt(self.engine.get_location(idx))
    170 output = torch.empty(size=shape, dtype=dtype, device=device)

ValueError: __len__() should return >= 0
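
Reading the TensorRT errors together: get_binding_index('output') is documented to return -1 when the name is not found, and -1 reinterpreted as an unsigned 64-bit index is exactly the huge value in the range-check message above. A tiny illustration of that arithmetic (hypothetical, not mmdeploy code):

import ctypes

# TensorRT's get_binding_index() returns -1 for an unknown binding name;
# code that then treats it as an unsigned 64-bit index underflows.
idx = -1
print(ctypes.c_uint64(idx).value)  # 18446744073709551615, as in the error

So it looks like the engine I loaded has no output named "output". Re-reading my own snippet, I also notice that backend_files points at mmdeploy_model/faster-rcnn/end2end.engine rather than the segmenter work dir (mmdeploy_model/segmenter), which could be the mistake here, since a detection engine's outputs are named differently.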

Environment

05/07 15:24:59 - mmengine - INFO - 

05/07 15:24:59 - mmengine - INFO - **********Environmental information**********
05/07 15:24:59 - mmengine - INFO - sys.platform: linux
05/07 15:24:59 - mmengine - INFO - Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
05/07 15:24:59 - mmengine - INFO - CUDA available: True
05/07 15:24:59 - mmengine - INFO - MUSA available: False
05/07 15:24:59 - mmengine - INFO - numpy_random_seed: 2147483648
05/07 15:24:59 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3090
05/07 15:24:59 - mmengine - INFO - CUDA_HOME: /usr/local/cuda-11.7
05/07 15:24:59 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.99
05/07 15:24:59 - mmengine - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
05/07 15:24:59 - mmengine - INFO - PyTorch: 1.10.0+cu113
05/07 15:24:59 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

05/07 15:24:59 - mmengine - INFO - TorchVision: 0.11.1+cu113
05/07 15:24:59 - mmengine - INFO - OpenCV: 4.9.0
05/07 15:24:59 - mmengine - INFO - MMEngine: 0.10.4
05/07 15:24:59 - mmengine - INFO - MMCV: 2.0.1
05/07 15:24:59 - mmengine - INFO - MMCV Compiler: GCC 9.3
05/07 15:24:59 - mmengine - INFO - MMCV CUDA Compiler: 11.3
05/07 15:24:59 - mmengine - INFO - MMDeploy: 1.3.1+bc75c9d
05/07 15:24:59 - mmengine - INFO - 

05/07 15:24:59 - mmengine - INFO - **********Backend information**********
05/07 15:24:59 - mmengine - INFO - tensorrt:    8.2.3.0
05/07 15:24:59 - mmengine - INFO - tensorrt custom ops: Available
05/07 15:24:59 - mmengine - INFO - ONNXRuntime: None
05/07 15:24:59 - mmengine - INFO - ONNXRuntime-gpu: 1.8.1
05/07 15:24:59 - mmengine - INFO - ONNXRuntime custom ops:  Available
05/07 15:24:59 - mmengine - INFO - pplnn:   None
05/07 15:24:59 - mmengine - INFO - ncnn:    None
05/07 15:24:59 - mmengine - INFO - snpe:    None
05/07 15:24:59 - mmengine - INFO - openvino:    None
05/07 15:24:59 - mmengine - INFO - torchscript: 1.10.0+cu113
05/07 15:24:59 - mmengine - INFO - torchscript custom ops:  NotAvailable
05/07 15:24:59 - mmengine - INFO - rknn-toolkit:    None
05/07 15:24:59 - mmengine - INFO - rknn-toolkit2:   None
05/07 15:24:59 - mmengine - INFO - ascend:  None
05/07 15:24:59 - mmengine - INFO - coreml:  None
05/07 15:24:59 - mmengine - INFO - tvm: None
05/07 15:24:59 - mmengine - INFO - vacc:    None
05/07 15:24:59 - mmengine - INFO - 

05/07 15:24:59 - mmengine - INFO - **********Codebase information**********
05/07 15:24:59 - mmengine - INFO - mmdet:   3.0.0
05/07 15:24:59 - mmengine - INFO - mmseg:   1.2.2
05/07 15:24:59 - mmengine - INFO - mmpretrain:  None
05/07 15:24:59 - mmengine - INFO - mmocr:   None
05/07 15:24:59 - mmengine - INFO - mmagic:  None
05/07 15:24:59 - mmengine - INFO - mmdet3d: None
05/07 15:24:59 - mmengine - INFO - mmpose:  None
05/07 15:24:59 - mmengine - INFO - mmrotate:    None
05/07 15:24:59 - mmengine - INFO - mmaction:    None
05/07 15:24:59 - mmengine - INFO - mmrazor: None
05/07 15:24:59 - mmengine - INFO - mmyolo:  None

Error traceback

(Identical to the ValueError traceback shown above.)
crmauceri commented 4 months ago

I'm probably doing something wrong with the configuration. Any guidance would be appreciated. Thank you.