[Bug] AssertionError: Failed to create TensorRT engine

Checklist

[X] 1. I have searched related issues but cannot get the expected help.
[X] 2. I have read the FAQ documentation but cannot get the expected help.
[X] 3. The bug has not been fixed in the latest version.

Describe the bug

(mmdeploy_flgpu) I:\AILab>python mmdeploy/tools/deploy.py ^
More? MMDEPLOYGPU\MMDeploy\configs\mmocr\text-recognition\text-recognition_tensorrt_dynamic-1x32x32-1x32x640.py ^
More? mmocr\configs\textrecog\sar\sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real.py ^
More? 文字识别模型部署\sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real_20220915_185451-1fd6b1fc.pth ^
More? mmocr\demo\demo_text_det.jpg ^
More? --work-dir mmdeploy_model/ocr/sar-trt ^
More? --device cuda ^
More? --dump-info
09/28 10:10:21 - mmengine - WARNING - Failed to search registry with scope "mmocr" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmocr" is a correct scope, or whether the registry is initialized.
09/28 10:10:21 - mmengine - WARNING - Failed to search registry with scope "mmocr" in the "mmocr_tasks" registry tree. As a workaround, the current "mmocr_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmocr" is a correct scope, or whether the registry is initialized.
09/28 10:10:23 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
09/28 10:10:24 - mmengine - WARNING - Failed to search registry with scope "mmocr" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmocr" is a correct scope, or whether the registry is initialized.
09/28 10:10:24 - mmengine - WARNING - Failed to search registry with scope "mmocr" in the "mmocr_tasks" registry tree. As a workaround, the current "mmocr_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmocr" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: 文字识别模型部署\sar_resnet31_sequential-decoder_5e_st-sub_mj-sub_sa_real_20220915_185451-1fd6b1fc.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: data_preprocessor.mean, data_preprocessor.std

09/28 10:10:25 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
09/28 10:10:25 - mmengine - INFO - Export PyTorch model to ONNX: mmdeploy_model/ocr/sar-trt\end2end.onnx.
09/28 10:10:26 - mmengine - WARNING - Can not find torch.nn.functional.scaled_dot_product_attention, function rewrite will not be applied
09/28 10:10:26 - mmengine - WARNING - Can not find mmdet.models.utils.transformer.PatchMerging.forward, function rewrite will not be applied
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_encoder.py:37: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert len(data_samples) == feat.size(0)
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_encoder.py:45: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  h_feat = int(feat.size(2))
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_encoder.py:57: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  valid_step = torch.tensor(T * valid_ratio).ceil().long() - 1
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_encoder.py:57: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  valid_step = torch.tensor(T * valid_ratio).ceil().long() - 1
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_decoder.py:119: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert c == 1
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_decoder.py:126: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  valid_width = torch.tensor(w * valid_ratio).ceil().long()
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\codebase\mmocr\models\text_recognition\sar_decoder.py:126: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  valid_width = torch.tensor(w * valid_ratio).ceil().long()
i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\pytorch\functions\tensor_setitem.py:38: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  stop = stop if stop >= 0 else self_shape[i] + stop
D:\miniconda3\envs\mmdeploy_flgpu\lib\site-packages\torch\onnx\symbolic_opset9.py:4315: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
  warnings.warn(
D:\miniconda3\envs\mmdeploy_flgpu\lib\site-packages\torch\onnx\_internal\jit_utils.py:258: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
  _C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
D:\miniconda3\envs\mmdeploy_flgpu\lib\site-packages\torch\onnx\utils.py:687: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
  _C._jit_pass_onnx_graph_shape_type_inference(
D:\miniconda3\envs\mmdeploy_flgpu\lib\site-packages\torch\onnx\utils.py:1178: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:1888.)
  _C._jit_pass_onnx_graph_shape_type_inference(
09/28 10:10:35 - mmengine - INFO - Execute onnx optimize passes.
09/28 10:10:39 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
09/28 10:10:40 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in subprocess
09/28 10:10:40 - mmengine - INFO - Successfully loaded tensorrt plugins from i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\lib\mmdeploy_tensorrt_ops.dll
[09/28/2023-10:10:41] [TRT] [I] [MemUsageChange] Init CUDA: CPU +495, GPU +0, now: CPU 12001, GPU 1269 (MiB)
[09/28/2023-10:10:42] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +387, GPU +104, now: CPU 12575, GPU 1373 (MiB)
[libprotobuf WARNING E:\Perforce\rboissel_devdt_windows\sw\gpgpu\MachineLearning\DIT\dev\nvmake\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING E:\Perforce\rboissel_devdt_windows\sw\gpgpu\MachineLearning\DIT\dev\nvmake\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 734290051
[09/28/2023-10:10:42] [TRT] [I] ----------------------------------------------------------------
[09/28/2023-10:10:42] [TRT] [I] Input filename: mmdeploy_model/ocr/sar-trt\end2end.onnx
[09/28/2023-10:10:42] [TRT] [I] ONNX IR version: 0.0.6
[09/28/2023-10:10:42] [TRT] [I] Opset version: 11
[09/28/2023-10:10:42] [TRT] [I] Producer name: pytorch
[09/28/2023-10:10:42] [TRT] [I] Producer version: 1.13.0
[09/28/2023-10:10:42] [TRT] [I] Domain:
[09/28/2023-10:10:42] [TRT] [I] Model version: 0
[09/28/2023-10:10:42] [TRT] [I] Doc string:
[09/28/2023-10:10:42] [TRT] [I] ----------------------------------------------------------------
[libprotobuf WARNING E:\Perforce\rboissel_devdt_windows\sw\gpgpu\MachineLearning\DIT\dev\nvmake\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:604] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING E:\Perforce\rboissel_devdt_windows\sw\gpgpu\MachineLearning\DIT\dev\nvmake\externals\protobuf\3.0.0\src\google\protobuf\io\coded_stream.cc:81] The total number of bytes read was 734290051
[09/28/2023-10:10:42] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[09/28/2023-10:10:42] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[09/28/2023-10:10:42] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:43] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:44] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:44] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:44] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:44] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:44] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:45] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:45] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:45] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:45] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:46] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:46] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:46] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:47] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:47] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:47] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:48] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:48] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:48] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:49] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[09/28/2023-10:10:50] [TRT] [E] 4: [network.cpp::nvinfer1::Network::validate::3008] Error Code 4: Internal Error (input: for dimension number 1 in profile 0 does not match network definition (got min=1, opt=1, max=1), expected min=opt=max=3).)
[09/28/2023-10:10:50] [TRT] [E] 2: [builder.cpp::nvinfer1::builder::Builder::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed.
)
Process Process-3:
Traceback (most recent call last):
  File "D:\miniconda3\envs\mmdeploy_flgpu\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\miniconda3\envs\mmdeploy_flgpu\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\apis\utils\utils.py", line 98, in to_backend
    return backend_mgr.to_backend(
  File "i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\backend\tensorrt\backend_manager.py", line 127, in to_backend
    onnx2tensorrt(
  File "i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\backend\tensorrt\onnx2tensorrt.py", line 79, in onnx2tensorrt
    from_onnx(
  File "i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\backend\tensorrt\utils.py", line 248, in from_onnx
    assert engine is not None, 'Failed to create TensorRT engine'
AssertionError: Failed to create TensorRT engine
09/28 10:10:50 - mmengine - ERROR - i:\ailab\mmdeploygpu\mmdeploy\mmdeploy\apis\core\pipeline_manager.py - pop_mp_output - 80 - mmdeploy.apis.utils.utils.to_backend with Call id: 1 failed. exit.
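
Note on the TensorRT error in the log above: the [TRT] [E] 4 message reports that dimension 1 of the tensor named "input" is 1 in optimization profile 0 (min=opt=max=1), while the network definition expects 3. Assuming an NCHW layout, dimension 1 is the channel count, so the shape profile coming from the deploy config (text-recognition_tensorrt_dynamic-1x32x32-1x32x640.py) appears to describe 1-channel images while the exported SAR model takes 3-channel images. The Python sketch below only illustrates that comparison. The helper names are hypothetical, the heights and widths are read off the config file name, the opt_shape width is an assumption, and whether a 3-channel profile is enough for the engine build to succeed has not been verified here.

```python
# Illustrative shape profile in NCHW order. Dimension 1 (channels) is 1 for
# min/opt/max, as reported by the TensorRT error. Heights and widths come from
# the deploy config's file name; the opt_shape width is an assumption.
profile = {
    "min_shape": [1, 1, 32, 32],
    "opt_shape": [1, 1, 32, 64],  # assumed value, not taken from the log
    "max_shape": [1, 1, 32, 640],
}


def mismatched_dims(profile, dim, expected):
    """Return the profile entries whose size at `dim` differs from `expected`."""
    return {
        name: shape[dim]
        for name, shape in profile.items()
        if shape[dim] != expected
    }


def with_channels(profile, channels):
    """Return a copy of the profile with dimension 1 set to `channels`."""
    return {
        name: [shape[0], channels, *shape[2:]]
        for name, shape in profile.items()
    }


# The network definition in the log expects 3 at dimension 1.
print(mismatched_dims(profile, dim=1, expected=3))

# A 3-channel variant of the same profile no longer mismatches at dimension 1.
fixed = with_channels(profile, 3)
print(mismatched_dims(fixed, dim=1, expected=3))
```

If this reading is right, the shapes under the TensorRT backend config's input profile would need a 3 in the channel position, or a deploy config intended for 3-channel recognizers would need to be used instead.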

Reproduction

Environment

09/28 10:14:36 - mmengine - INFO - **********Environmental information**********
09/28 10:14:40 - mmengine - INFO - sys.platform: win32
09/28 10:14:40 - mmengine - INFO - Python: 3.8.13 (default, Oct 19 2022, 22:38:03) [MSC v.1916 64 bit (AMD64)]
09/28 10:14:40 - mmengine - INFO - CUDA available: True
09/28 10:14:40 - mmengine - INFO - numpy_random_seed: 2147483648
09/28 10:14:40 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 4070 Laptop GPU
09/28 10:14:40 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6
09/28 10:14:40 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.55
09/28 10:14:40 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.37.32824 for x64
09/28 10:14:40 - mmengine - INFO - GCC: n/a
09/28 10:14:40 - mmengine - INFO - PyTorch: 1.13.0+cu116
09/28 10:14:40 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.6
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.3.2  (built against CUDA 11.5)
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.6, CUDNN_VERSION=8.3.2, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,    

09/28 10:14:40 - mmengine - INFO - TorchVision: 0.14.0+cu116
09/28 10:14:40 - mmengine - INFO - OpenCV: 4.8.0
09/28 10:14:40 - mmengine - INFO - MMEngine: 0.8.4
09/28 10:14:40 - mmengine - INFO - MMCV: 2.0.1
09/28 10:14:40 - mmengine - INFO - MMCV Compiler: MSVC 192930148
09/28 10:14:40 - mmengine - INFO - MMCV CUDA Compiler: 11.6
09/28 10:14:40 - mmengine - INFO - MMDeploy: 1.3.0+
09/28 10:14:40 - mmengine - INFO -

09/28 10:14:40 - mmengine - INFO - **********Backend information**********
09/28 10:14:40 - mmengine - INFO - tensorrt:    8.4.3.1
09/28 10:14:40 - mmengine - INFO - tensorrt custom ops: Available
09/28 10:14:40 - mmengine - INFO - ONNXRuntime: 1.8.1
09/28 10:14:40 - mmengine - INFO - ONNXRuntime-gpu:     1.16.0
09/28 10:14:40 - mmengine - INFO - ONNXRuntime custom ops:      NotAvailable
09/28 10:14:40 - mmengine - INFO - pplnn:       None
09/28 10:14:40 - mmengine - INFO - ncnn:        None
09/28 10:14:40 - mmengine - INFO - snpe:        None
09/28 10:14:40 - mmengine - INFO - openvino:    None
09/28 10:14:40 - mmengine - INFO - torchscript: 1.13.0+cu116
09/28 10:14:40 - mmengine - INFO - torchscript custom ops:      NotAvailable
09/28 10:14:40 - mmengine - INFO - rknn-toolkit:        None
09/28 10:14:40 - mmengine - INFO - rknn-toolkit2:       None
09/28 10:14:40 - mmengine - INFO - ascend:      None
09/28 10:14:40 - mmengine - INFO - coreml:      None
09/28 10:14:40 - mmengine - INFO - tvm: None
09/28 10:14:40 - mmengine - INFO - vacc:        None
09/28 10:14:40 - mmengine - INFO -

09/28 10:14:40 - mmengine - INFO - **********Codebase information**********
09/28 10:14:40 - mmengine - INFO - mmdet:       3.0.0
09/28 10:14:40 - mmengine - INFO - mmseg:       None
09/28 10:14:40 - mmengine - INFO - mmpretrain:  None
09/28 10:14:40 - mmengine - INFO - mmocr:       1.0.1
09/28 10:14:40 - mmengine - INFO - mmagic:      None
09/28 10:14:40 - mmengine - INFO - mmdet3d:     None
09/28 10:14:40 - mmengine - INFO - mmpose:      None
09/28 10:14:40 - mmengine - INFO - mmrotate:    1.0.0rc1
09/28 10:14:40 - mmengine - INFO - mmaction:    None
09/28 10:14:40 - mmengine - INFO - mmrazor:     None
09/28 10:14:40 - mmengine - INFO - mmyolo:      None

Error traceback

No response
