open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] I successfully converted a Solov2 model to onnx format, but when I was using this onnx model for inference in C++, I encountered an error #2653

Open laishenghui opened 10 months ago

laishenghui commented 10 months ago

Describe the bug

I trained a SOLOv2 model and converted it to ONNX. When I run inference with this ONNX model in C++ (on Windows), an error occurs. The C++ solution itself works correctly: I have used it to run inference with other ONNX models without problems.

Training log: 20231228_100516.log
Conversion log: log.txt

C++ inference error message:

[2024-01-24 09:44:05.870] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\AIEdge_onnxruntime_dynamic"
[2024-01-24 09:44:39.426] [mmdeploy] [error] [instance_segmentation.cpp:78] OpenCV(4.8.0) C:\GHA-OCV-1_work\ci-gha-workflow\ci-gha-workflow\opencv\modules\core\src\matrix.cpp:767: error: (-215:Assertion failed) 0 <= _rowRange.start && _rowRange.start <= _rowRange.end && _rowRange.end <= m.rows in function 'cv::Mat::Mat'

OpenCV: terminate handler is called! The last OpenCV error is: OpenCV(4.8.0) Error: Assertion failed (0 <= _rowRange.start && _rowRange.start <= _rowRange.end && _rowRange.end <= m.rows) in cv::Mat::Mat, file C:\GHA-OCV-1_work\ci-gha-workflow\ci-gha-workflow\opencv\modules\core\src\matrix.cpp, line 767
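
For readers unfamiliar with this assertion: it is a bounds check on a row range passed to the `cv::Mat` constructor. The range must start at or after row 0, must not end before it starts, and must not run past the last row of the source matrix. The sketch below restates that condition in plain Python; the function name `row_range_is_valid` is hypothetical and is not part of OpenCV or MMDeploy.

```python
def row_range_is_valid(start: int, end: int, rows: int) -> bool:
    """Restate the failed condition:
    0 <= start && start <= end && end <= rows
    """
    return 0 <= start <= end <= rows


# For a matrix with 480 rows, the half-open range [100, 200) is fine.
print(row_range_is_valid(100, 200, 480))

# A range that runs past the last row fails the check.
print(row_range_is_valid(400, 500, 480))

# So does a range that starts at a negative row index.
print(row_range_is_valid(-5, 50, 480))
```

Any crop whose row indices fail this check triggers the assertion, so the values fed into the mask crop in the post-processing step are the first thing worth inspecting.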

Reproduction

I have not changed any code.

Environment

D:\Anaconda3\envs\openmmlab\python.exe D:\OpenMMLab\MMDeploy\mmdeploy-main\tools\check_env.py 
01/24 09:57:26 - mmengine - INFO - 

01/24 09:57:26 - mmengine - INFO - **********Environmental information**********
01/24 10:01:09 - mmengine - INFO - sys.platform: win32
01/24 10:01:09 - mmengine - INFO - Python: 3.8.18 (default, Sep 11 2023, 13:39:12) [MSC v.1916 64 bit (AMD64)]
01/24 10:01:09 - mmengine - INFO - CUDA available: True
01/24 10:01:09 - mmengine - INFO - numpy_random_seed: 2147483648
01/24 10:01:09 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3060
01/24 10:01:09 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7
01/24 10:01:09 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.64
01/24 10:01:09 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.38.33130 for x64
01/24 10:01:09 - mmengine - INFO - GCC: n/a
01/24 10:01:09 - mmengine - INFO - PyTorch: 1.13.1+cu117
01/24 10:01:09 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.5
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF, 

01/24 10:01:09 - mmengine - INFO - TorchVision: 0.14.1+cu117
01/24 10:01:09 - mmengine - INFO - OpenCV: 4.8.1
01/24 10:01:09 - mmengine - INFO - MMEngine: 0.10.1
01/24 10:01:09 - mmengine - INFO - MMCV: 2.0.0
01/24 10:01:09 - mmengine - INFO - MMCV Compiler: MSVC 192829924
01/24 10:01:09 - mmengine - INFO - MMCV CUDA Compiler: 11.7
01/24 10:01:09 - mmengine - INFO - MMDeploy: 1.3.1+
01/24 10:01:09 - mmengine - INFO - 

01/24 10:01:09 - mmengine - INFO - **********Backend information**********
01/24 10:01:11 - mmengine - INFO - tensorrt:    None
01/24 10:01:19 - mmengine - INFO - ONNXRuntime: None
01/24 10:01:19 - mmengine - INFO - ONNXRuntime-gpu: 1.15.1
01/24 10:01:19 - mmengine - INFO - ONNXRuntime custom ops:  Available
01/24 10:01:19 - mmengine - INFO - pplnn:   None
01/24 10:01:21 - mmengine - INFO - ncnn:    None
01/24 10:01:22 - mmengine - INFO - snpe:    None
01/24 10:01:22 - mmengine - INFO - openvino:    None
01/24 10:01:23 - mmengine - INFO - torchscript: 1.13.1+cu117
01/24 10:01:23 - mmengine - INFO - torchscript custom ops:  NotAvailable
01/24 10:01:23 - mmengine - INFO - rknn-toolkit:    None
01/24 10:01:23 - mmengine - INFO - rknn-toolkit2:   None
01/24 10:01:24 - mmengine - INFO - ascend:  None
01/24 10:01:24 - mmengine - INFO - coreml:  None
01/24 10:01:24 - mmengine - INFO - tvm: None
01/24 10:01:25 - mmengine - INFO - vacc:    None
01/24 10:01:25 - mmengine - INFO - 

01/24 10:01:25 - mmengine - INFO - **********Codebase information**********
01/24 10:01:26 - mmengine - INFO - mmdet:   3.0.0
01/24 10:01:26 - mmengine - INFO - mmseg:   None
01/24 10:01:26 - mmengine - INFO - mmpretrain:  None
01/24 10:01:26 - mmengine - INFO - mmocr:   None
01/24 10:01:26 - mmengine - INFO - mmagic:  None
01/24 10:01:26 - mmengine - INFO - mmdet3d: None
01/24 10:01:26 - mmengine - INFO - mmpose:  None
01/24 10:01:26 - mmengine - INFO - mmrotate:    None
01/24 10:01:26 - mmengine - INFO - mmaction:    None
01/24 10:01:26 - mmengine - INFO - mmrazor: 1.0.0
01/24 10:01:26 - mmengine - INFO - mmyolo:  None

Process finished with exit code 0

Error traceback

'object_detection.exe' (Win32): Loaded 'D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\object_detection.exe'. Symbols loaded.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\ntdll.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\kernel32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\KernelBase.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\ucrtbase.dll'.
'object_detection.exe' (Win32): Loaded 'D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\mmdeploy.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'D:\OpenCV\4.8.0\opencv\build\x64\vc16\bin\opencv_world480.dll'. Symbols loaded.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\gdi32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\win32u.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\gdi32full.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\msvcp_win.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\user32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\ole32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\rpcrt4.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\combase.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\ws2_32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\oleaut32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\comdlg32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\msvcrt.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\SHCore.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\shlwapi.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\shell32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\advapi32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\sechost.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\bcrypt.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\msvcp140.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\vcruntime140_1.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudart64_110.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\nvcuda.dll'.
'object_detection.exe' (Win32): Loaded 'D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\onnxruntime.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\WinSxS\amd64_microsoft.windows.common-controls_6595b64144ccf1df_5.82.19041.3636_none_7931fb75243f97c8\comctl32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\mfplat.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\cfgmgr32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\mf.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\mfreadwrite.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\concrt140.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\cryptbase.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\mfcore.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\crypt32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\powrprof.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\ksuser.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\kernel.appcore.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\imm32.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\bcryptprimitives.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\RTWorkQ.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\umpdc.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\DriverStore\FileRepository\nv_dispig.inf_amd64_7e5fd280efaa5445\nvcuda64.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\version.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\msasn1.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\cryptnet.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\drvstore.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\devobj.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\wldp.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\nvapi64.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\setupapi.dll'.
The thread 0x116c has exited with code 0 (0x0).
'object_detection.exe' (Win32): Loaded 'D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\onnxruntime_providers_shared.dll'.
'object_detection.exe' (Win32): Loaded 'D:\OpenMMLab\MMDeploy\ONNXRuntimeDemo_gpu\bin\Release\onnxruntime_providers_cuda.dll'.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudnn64_8.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cufft64_10.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cublas64_11.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cublasLt64_11.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Unloaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cublasLt64_11.dll'
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cublasLt64_11.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudnn_ops_infer64_8.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudnn_cnn_infer64_8.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Windows\System32\zlibwapi.dll'. Module was built without symbols.
'object_detection.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin\cudnn_ops_train64_8.dll'. Module was built without symbols.
Exception thrown at 0x00007FFCE2CCCF19 in object_detection.exe: Microsoft C++ exception: ipp::IwException at memory location 0x0000003CDFB2C070.
Exception thrown at 0x00007FFCE2CCCF19 in object_detection.exe: Microsoft C++ exception: cv::Exception at memory location 0x0000003CDFB2D340.
Exception thrown at 0x00007FFCE2CCCF19 in object_detection.exe: Microsoft C++ exception: [rethrow] at memory location 0x0000000000000000.
Exception thrown at 0x00007FFCE2CCCF19 in object_detection.exe: Microsoft C++ exception: cv::Exception at memory location 0x0000003CDFB2D340.
Exception thrown at 0x00007FFCE2CCCF19 in object_detection.exe: Microsoft C++ exception: system_error2::status_error<mmdeploy::StatusDomain> at memory location 0x0000003CDFB2EF60.
Unhandled exception at 0x00007FFCE29D286E (ucrtbase.dll) in object_detection.exe: Fatal program exit requested.

devidlatkin commented 9 months ago

Same

raintowing commented 8 months ago

A solution from #1240: for SOLO, you need to manually set export_postprocess_mask = True in configs/mmdet/_base_/base_instance-seg_static.py. It magically solved my SOLOv2 ONNX Runtime inference problem, which was the same as yours.
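
As a rough illustration of that change, here is a minimal sketch of what the edited config might look like. It assumes the file in question is `configs/mmdet/_base_/base_instance-seg_static.py` and that it keeps the usual MMDeploy layout of an `onnx_config` dict plus a `codebase_config` dict; the output names shown are an assumption, so check them against your own checkout rather than copying them.

```python
# Sketch of the edited deploy config (assumed layout, not copied from the repo).
# The real file also inherits from a base config through `_base_`; that line is
# left out here so the snippet runs on its own.

# Assumed ONNX output names for instance segmentation models.
onnx_config = dict(output_names=['dets', 'labels', 'masks'])

codebase_config = dict(
    post_processing=dict(
        # The change suggested in the comment above: set this flag to True
        # so mask post-processing is exported with the model.
        export_postprocess_mask=True,
    ))

print(codebase_config['post_processing']['export_postprocess_mask'])
```

After editing the config, the model has to be converted again, because the flag takes effect at export time and the old ONNX file and SDK metadata do not pick it up.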

laishenghui commented 5 months ago

I switched to using YOLOv6, and due to company regulations, I am unable to upload my project

kingstarcraft commented 5 months ago

> I switched to using YOLOv6, and due to company regulations, I am unable to upload my project

Thank you. I have just solved this problem using @raintowing's solution together with detector.cxx. I found that object_detection.cpp does not work as well as detector.cxx and object_detection.py.

kingstarcraft commented 5 months ago

I may have found a bug: the box locations are incorrect. All of the box coordinates are zero.
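
One way to confirm this symptom programmatically is a small check over the returned detections. The sketch below assumes each detection is a row of the form `[left, top, right, bottom, score]`, which is an assumption about the output layout rather than something stated in this thread. The helper name `all_boxes_zero` is hypothetical.

```python
from typing import Sequence


def all_boxes_zero(bboxes: Sequence[Sequence[float]]) -> bool:
    """Return True if there is at least one box and every box has
    all four coordinates equal to zero (the score column is ignored)."""
    if len(bboxes) == 0:
        return False
    return all(all(v == 0 for v in box[:4]) for box in bboxes)


# Detections that show the reported symptom: scores present, coordinates zero.
broken = [[0, 0, 0, 0, 0.91], [0, 0, 0, 0, 0.47]]
# Detections that look healthy.
healthy = [[12.0, 30.5, 200.0, 180.0, 0.91]]

print(all_boxes_zero(broken))
print(all_boxes_zero(healthy))
```

Running the same check on the output of each demo program makes it easier to tell whether the zeros come from the exported model or from one particular demo.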