open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Error when running deploy.py to convert from ONNX to a TensorRT engine #1289

Closed: kc-w closed this issue 10 months ago

kc-w commented 1 year ago

Checklist

Describe the bug

2022-11-02 15:04:39,839 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
e:\projecttest\mmsegmentation\mmseg\models\decode_heads\decode_head.py:94: UserWarning: For binary segmentation, we suggest using out_channels = 1 to define the output channels of segmentor, and use threshold to convert seg_logist into a prediction applying a threshold
  warnings.warn('For binary segmentation, we suggest using'
e:\projecttest\mmsegmentation\mmseg\models\losses\cross_entropy_loss.py:235: UserWarning: Default avg_non_ignore is False, if you would like to ignore the certain label and average loss over non-ignore labels, which is the same with PyTorch official cross_entropy, set avg_non_ignore=True.
  warnings.warn(
load checkpoint from local path: E:/projectTest/mmsegmentation/result/iter_200.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: auxiliary_head.conv_seg.weight, auxiliary_head.conv_seg.bias, auxiliary_head.convs.0.conv.weight, auxiliary_head.convs.0.bn.weight, auxiliary_head.convs.0.bn.bias, auxiliary_head.convs.0.bn.running_mean, auxiliary_head.convs.0.bn.running_var, auxiliary_head.convs.0.bn.num_batches_tracked, auxiliary_head.convs.1.conv.weight, auxiliary_head.convs.1.bn.weight, auxiliary_head.convs.1.bn.bias, auxiliary_head.convs.1.bn.running_mean, auxiliary_head.convs.1.bn.running_var, auxiliary_head.convs.1.bn.num_batches_tracked, decode_head.conv_seg.weight, decode_head.conv_seg.bias, decode_head.psp_modules.0.1.conv.weight, decode_head.psp_modules.0.1.bn.weight, decode_head.psp_modules.0.1.bn.bias, decode_head.psp_modules.0.1.bn.running_mean, decode_head.psp_modules.0.1.bn.running_var, decode_head.psp_modules.0.1.bn.num_batches_tracked, decode_head.psp_modules.1.1.conv.weight, decode_head.psp_modules.1.1.bn.weight, decode_head.psp_modules.1.1.bn.bias, decode_head.psp_modules.1.1.bn.running_mean, decode_head.psp_modules.1.1.bn.running_var, decode_head.psp_modules.1.1.bn.num_batches_tracked, decode_head.psp_modules.2.1.conv.weight, decode_head.psp_modules.2.1.bn.weight, decode_head.psp_modules.2.1.bn.bias, decode_head.psp_modules.2.1.bn.running_mean, decode_head.psp_modules.2.1.bn.running_var, decode_head.psp_modules.2.1.bn.num_batches_tracked, decode_head.psp_modules.3.1.conv.weight, decode_head.psp_modules.3.1.bn.weight, decode_head.psp_modules.3.1.bn.bias, decode_head.psp_modules.3.1.bn.running_mean, decode_head.psp_modules.3.1.bn.running_var, decode_head.psp_modules.3.1.bn.num_batches_tracked, decode_head.bottleneck.conv.weight, decode_head.bottleneck.bn.weight, decode_head.bottleneck.bn.bias, decode_head.bottleneck.bn.running_mean, decode_head.bottleneck.bn.running_var, decode_head.bottleneck.bn.num_batches_tracked

missing keys in source state_dict: neck.lateral_convs.0.conv.weight, neck.lateral_convs.0.conv.bias, neck.lateral_convs.1.conv.weight, neck.lateral_convs.1.conv.bias, neck.lateral_convs.2.conv.weight, neck.lateral_convs.2.conv.bias, neck.lateral_convs.3.conv.weight, neck.lateral_convs.3.conv.bias, neck.fpn_convs.0.conv.weight, neck.fpn_convs.0.conv.bias, neck.fpn_convs.1.conv.weight, neck.fpn_convs.1.conv.bias, neck.fpn_convs.2.conv.weight, neck.fpn_convs.2.conv.bias, neck.fpn_convs.3.conv.weight, neck.fpn_convs.3.conv.bias, decode_head.0.conv_seg.weight, decode_head.0.conv_seg.bias, decode_head.0.scale_heads.0.0.conv.weight, decode_head.0.scale_heads.0.0.bn.weight, decode_head.0.scale_heads.0.0.bn.bias, decode_head.0.scale_heads.0.0.bn.running_mean, decode_head.0.scale_heads.0.0.bn.running_var, decode_head.0.scale_heads.1.0.conv.weight, decode_head.0.scale_heads.1.0.bn.weight, decode_head.0.scale_heads.1.0.bn.bias, decode_head.0.scale_heads.1.0.bn.running_mean, decode_head.0.scale_heads.1.0.bn.running_var, decode_head.0.scale_heads.2.0.conv.weight, decode_head.0.scale_heads.2.0.bn.weight, decode_head.0.scale_heads.2.0.bn.bias, decode_head.0.scale_heads.2.0.bn.running_mean, decode_head.0.scale_heads.2.0.bn.running_var, decode_head.0.scale_heads.2.2.conv.weight, decode_head.0.scale_heads.2.2.bn.weight, decode_head.0.scale_heads.2.2.bn.bias, decode_head.0.scale_heads.2.2.bn.running_mean, decode_head.0.scale_heads.2.2.bn.running_var, decode_head.0.scale_heads.3.0.conv.weight, decode_head.0.scale_heads.3.0.bn.weight, decode_head.0.scale_heads.3.0.bn.bias, decode_head.0.scale_heads.3.0.bn.running_mean, decode_head.0.scale_heads.3.0.bn.running_var, decode_head.0.scale_heads.3.2.conv.weight, decode_head.0.scale_heads.3.2.bn.weight, decode_head.0.scale_heads.3.2.bn.bias, decode_head.0.scale_heads.3.2.bn.running_mean, decode_head.0.scale_heads.3.2.bn.running_var, decode_head.0.scale_heads.3.4.conv.weight, decode_head.0.scale_heads.3.4.bn.weight, decode_head.0.scale_heads.3.4.bn.bias, decode_head.0.scale_heads.3.4.bn.running_mean, decode_head.0.scale_heads.3.4.bn.running_var, decode_head.1.fcs.0.conv.weight, decode_head.1.fcs.0.conv.bias, decode_head.1.fcs.1.conv.weight, decode_head.1.fcs.1.conv.bias, decode_head.1.fcs.2.conv.weight, decode_head.1.fcs.2.conv.bias, decode_head.1.fc_seg.weight, decode_head.1.fc_seg.bias

2022-11-02 15:04:50,086 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
2022-11-02 15:04:50,087 - mmdeploy - INFO - Export PyTorch model to ONNX: E:\projectTest\mmdeploy\tools\end2end.onnx.
2022-11-02 15:04:50,199 - mmdeploy - WARNING - Can not find torch._C._jit_pass_onnx_deduplicate_initializers, function rewrite will not be applied
e:\projecttest\mmsegmentation\mmseg\ops\wrappers.py:48: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  size = [int(t * self.scale_factor) for t in x.shape[-2:]]
2022-11-02 15:05:02,761 - mmdeploy - WARNING - cfg.subdivision_num_points would be changed from 8196 to 3840 due to restriction in TensorRT TopK layer
E:\projectTest\mmdeploy\mmdeploy\pytorch\functions\topk.py:57: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if k > size:
e:\projecttest\mmsegmentation\mmseg\models\decode_heads\point_head.py:352: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  num_points = min(height * width, num_points)
E:\projectTest\mmdeploy\mmdeploy\pytorch\functions\mod.py:19: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  return input - (input // other) * other
e:\projecttest\mmsegmentation\mmseg\models\decode_heads\point_head.py:362: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  point_coords[:, :, 1] = h_step / 2.0 + (point_indices //
WARNING: The shape inference of mmdeploy::grid_sampler type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (this warning is repeated 12 times)
2022-11-02 15:05:10,629 - mmdeploy - INFO - Execute onnx optimize passes.
2022-11-02 15:05:10,629 - mmdeploy - WARNING - Can not optimize model, please build torchscipt extension. More details: https://github.com/open-mmlab/mmdeploy/blob/master/docs/en/experimental/onnx_optimizer.md
2022-11-02 15:05:12,160 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
2022-11-02 15:05:25,873 - mmdeploy - INFO - Start pipeline mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt in subprocess
2022-11-02 15:05:26,173 - mmdeploy - WARNING - Could not load the library of tensorrt plugins. Because the file does not exist:
[11/02/2022-15:05:27] [TRT] [I] [MemUsageChange] Init CUDA: CPU +368, GPU +0, now: CPU 13145, GPU 1150 (MiB)
[11/02/2022-15:05:29] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +363, GPU +104, now: CPU 13676, GPU 1254 (MiB)
[11/02/2022-15:05:30] [TRT] [W] onnx2trt_utils.cpp:369: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[11/02/2022-15:05:30] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
[11/02/2022-15:05:30] [TRT] [W] onnx2trt_utils.cpp:395: One or more weights outside the range of INT32 was clamped
Process Process-3:
[11/02/2022-15:05:30] [TRT] [I] No importer registered for op: grid_sampler. Attempting to import as plugin.
[11/02/2022-15:05:30] [TRT] [I] Searching for plugin: grid_sampler, plugin_version: 1, plugin_namespace:
Traceback (most recent call last):
  File "D:\Python\Python39\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\Python\Python39\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "E:\projectTest\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "E:\projectTest\mmdeploy\mmdeploy\backend\tensorrt\onnx2tensorrt.py", line 79, in onnx2tensorrt
    from_onnx(
  File "E:\projectTest\mmdeploy\mmdeploy\backend\tensorrt\utils.py", line 166, in from_onnx
    raise RuntimeError(f'Failed to parse onnx, {error_msgs}')
RuntimeError: Failed to parse onnx, In node 510 (importFallbackPluginImporter): UNSUPPORTED_NODE: Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"

2022-11-02 15:05:32,096 - mmdeploy - ERROR - mmdeploy.backend.tensorrt.onnx2tensorrt.onnx2tensorrt with Call id: 1 failed. exit.
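
The decisive failure is the final RuntimeError: TensorRT has no plugin creator registered for grid_sampler, which matches the earlier warning that the mmdeploy TensorRT plugin library could not be loaded because the file does not exist. A quick way to confirm what TensorRT can actually see is to list the registered plugin creators from Python; the following is a minimal diagnostic sketch using only the standard TensorRT Python API (nothing mmdeploy-specific is assumed):

import tensorrt as trt

# List every plugin creator TensorRT currently knows about and check whether
# the custom grid_sampler op is among them. On a setup that hits the error
# above, this is expected to print False until the mmdeploy TensorRT custom
# ops library has been built and loaded.
logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, '')
creators = trt.get_plugin_registry().plugin_creator_list
print('grid_sampler registered:', any(c.name == 'grid_sampler' for c in creators))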

Reproduction

parser.add_argument('--deploy_cfg',
                    default='E:/projectTest/mmdeploy/configs/mmseg/segmentation_tensorrt-fp16_dynamic-512x1024-2048x2048.py',
                    help='deploy config path')
parser.add_argument('--model_cfg', default='E:/projectTest/mmsegmentation/configs/pspnet/MyPsp.py',
                    help='model config path')
parser.add_argument('--checkpoint', default='E:/projectTest/mmsegmentation/result/iter_200.pth',
                    help='model checkpoint path')
parser.add_argument('--img', default='D:/images/mask/img_dir/train/Image_20220626151509704.jpg',
                    help='image used to convert the model')

Environment

2022-11-02 15:25:04,337 - mmdeploy - INFO - 

2022-11-02 15:25:04,337 - mmdeploy - INFO - **********Environmental information**********
2022-11-02 15:25:12,131 - mmdeploy - INFO - sys.platform: win32
2022-11-02 15:25:12,131 - mmdeploy - INFO - Python: 3.9.12 (tags/v3.9.12:b28265d, Mar 23 2022, 23:52:46) [MSC v.1929 64 bit (AMD64)]
2022-11-02 15:25:12,131 - mmdeploy - INFO - CUDA available: True
2022-11-02 15:25:12,131 - mmdeploy - INFO - GPU 0: NVIDIA GeForce RTX 3060
2022-11-02 15:25:12,131 - mmdeploy - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
2022-11-02 15:25:12,131 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.109
2022-11-02 15:25:12,131 - mmdeploy - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30146 for x64
2022-11-02 15:25:12,131 - mmdeploy - INFO - GCC: n/a
2022-11-02 15:25:12,131 - mmdeploy - INFO - PyTorch: 1.11.0+cu113
2022-11-02 15:25:12,131 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.5.2 (Git Hash a9302535553c73243c632ad3c4c80beec3d19a1e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.11.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, 

2022-11-02 15:25:12,131 - mmdeploy - INFO - TorchVision: 0.12.0+cu113
2022-11-02 15:25:12,131 - mmdeploy - INFO - OpenCV: 4.5.4
2022-11-02 15:25:12,131 - mmdeploy - INFO - MMCV: 1.6.2
2022-11-02 15:25:12,131 - mmdeploy - INFO - MMCV Compiler: MSVC 192930146
2022-11-02 15:25:12,131 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2022-11-02 15:25:12,131 - mmdeploy - INFO - MMDeploy: 0.10.0+c4d428f
2022-11-02 15:25:12,132 - mmdeploy - INFO - 

2022-11-02 15:25:12,132 - mmdeploy - INFO - **********Backend information**********
2022-11-02 15:25:12,689 - mmdeploy - INFO - onnxruntime: 1.13.1 ops_is_avaliable : False
2022-11-02 15:25:12,726 - mmdeploy - INFO - tensorrt: 8.4.1.5   ops_is_avaliable : False
2022-11-02 15:25:12,850 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
2022-11-02 15:25:12,854 - mmdeploy - INFO - pplnn_is_avaliable: False
2022-11-02 15:25:12,944 - mmdeploy - INFO - openvino_is_avaliable: True
2022-11-02 15:25:13,000 - mmdeploy - INFO - snpe_is_available: False
2022-11-02 15:25:13,005 - mmdeploy - INFO - ascend_is_available: False
2022-11-02 15:25:13,029 - mmdeploy - INFO - coreml_is_available: False
2022-11-02 15:25:13,029 - mmdeploy - INFO - 

2022-11-02 15:25:13,029 - mmdeploy - INFO - **********Codebase information**********
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmdet:  None
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmseg:  0.29.0
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmcls:  None
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmocr:  None
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmedit: 0.16.0
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmdet3d:    None
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmpose: None
2022-11-02 15:25:13,108 - mmdeploy - INFO - mmrotate:   None

Error traceback

No response

mm-assistant[bot] commented 1 year ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

RunningLeon commented 1 year ago

@kc-w Hi, your custom PSPNet model uses the custom TensorRT plugin grid_sampler. You should build the TensorRT custom ops when installing mmdeploy, according to here.
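
As a rough illustration of this suggestion: once the TensorRT custom ops have been built, loading the resulting library should make grid_sampler visible in the TensorRT plugin registry. The sketch below assumes a Windows build output named mmdeploy_tensorrt_ops.dll under the mmdeploy build directory; that path and file name are hypothetical and depend on how mmdeploy was actually built.

import ctypes
import tensorrt as trt

# Hypothetical path to the built custom ops library; adjust to the real build output.
# Loading the shared library registers its plugin creators with TensorRT.
ctypes.CDLL(r'E:\projectTest\mmdeploy\build\bin\Release\mmdeploy_tensorrt_ops.dll')

# After the library is loaded, grid_sampler should appear in the plugin registry,
# so the onnx2tensorrt stage can resolve the custom node.
names = [c.name for c in trt.get_plugin_registry().plugin_creator_list]
print('grid_sampler' in names)

Once that check passes, re-running tools/deploy.py with the same configs should get past the onnx2tensorrt stage, since the log shows mmdeploy tries to load this plugin library automatically when it exists.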

github-actions[bot] commented 10 months ago

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.