open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0
2.79k stars 637 forks

[Bug] mismatched data type FLOAT vs HALF #2337

Closed mmvxc closed 1 year ago

mmvxc commented 1 year ago

Checklist

Describe the bug

loading mmdeploy_trt_net.dll ...
loading mmdeploy_ort_net.dll ...
[2023-08-08 17:18:12.721] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "rtmdet-ort/rtmdet-m"
[2023-08-08 17:18:14.069] [mmdeploy] [error] [tensor.cpp:137] mismatched data type FLOAT vs HALF

Process finished with exit code -1073740791 (0xC0000409)

Reproduction

I want to use onnxruntime fp16.

1. I converted RTMPose-m to ONNX.
   deploy_cfg: https://github.com/open-mmlab/mmdeploy/blob/main/configs/mmdet/detection/detection_onnxruntime-fp16_dynamic.py
   model_cfg: https://github.com/open-mmlab/mmpose/blob/main/projects/rtmpose/rtmdet/person/rtmdet_m_640-8xb32_coco-person.py
   checkpoint: https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-l_simcc-aic-coco_pt-aic-coco_420e-256x192-f016ffe0_20230126.pth

2. I ran object_detection.py to perform object detection with the ONNX model converted in step 1.
   object_detection.py: https://github.com/open-mmlab/mmdeploy/blob/main/demo/python/object_detection.py
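The error in the log points at an input dtype mismatch: assuming the fp16 ONNX export declares float16 (HALF) inputs, the demo's preprocessing feeds float32 (FLOAT) tensors. A minimal numpy sketch of that mismatch (`prepare_input` is a hypothetical helper for illustration, not part of mmdeploy):

```python
import numpy as np

def prepare_input(img, fp16):
    """Cast a preprocessed image tensor to the dtype the exported model expects.

    Assumption: the fp16 ONNX export declares float16 (HALF) inputs, so
    feeding the pipeline's default float32 (FLOAT) tensor triggers
    "mismatched data type FLOAT vs HALF" in the SDK.
    """
    return img.astype(np.float16 if fp16 else np.float32)

# The demo pipeline produces float32 tensors by default:
x = np.random.rand(1, 3, 640, 640).astype(np.float32)
print(prepare_input(x, fp16=True).dtype)   # float16 -- matches a HALF input
print(prepare_input(x, fp16=False).dtype)  # float32 -- matches a FLOAT input
```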

Environment

C:\Users\PC\Desktop\thq\2\MMDeploy>python tools/check_env.py
08/08 17:19:40 - mmengine - INFO -

08/08 17:19:40 - mmengine - INFO - **********Environmental information**********
08/08 17:19:42 - mmengine - INFO - sys.platform: win32
08/08 17:19:42 - mmengine - INFO - Python: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
08/08 17:19:42 - mmengine - INFO - CUDA available: True
08/08 17:19:42 - mmengine - INFO - numpy_random_seed: 2147483648
08/08 17:19:42 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 2060
08/08 17:19:42 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7
08/08 17:19:42 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.99
08/08 17:19:42 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30151 for x64
08/08 17:19:42 - mmengine - INFO - GCC: n/a
08/08 17:19:42 - mmengine - INFO - PyTorch: 2.0.1+cu117
08/08 17:19:42 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 193431937
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.5
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj /FS -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=OFF, TORCH_VERSION=2.0.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

08/08 17:19:42 - mmengine - INFO - TorchVision: 0.15.2+cu117
08/08 17:19:42 - mmengine - INFO - OpenCV: 4.5.5
08/08 17:19:42 - mmengine - INFO - MMEngine: 0.8.4
08/08 17:19:42 - mmengine - INFO - MMCV: 2.0.1
08/08 17:19:42 - mmengine - INFO - MMCV Compiler: MSVC 192930148
08/08 17:19:42 - mmengine - INFO - MMCV CUDA Compiler: 11.7
08/08 17:19:42 - mmengine - INFO - MMDeploy: 1.2.0+8a59bae
08/08 17:19:42 - mmengine - INFO -

08/08 17:19:42 - mmengine - INFO - **********Backend information**********
08/08 17:19:43 - mmengine - INFO - tensorrt:    8.4.2.4
08/08 17:19:43 - mmengine - INFO - tensorrt custom ops: NotAvailable
08/08 17:19:43 - mmengine - INFO - ONNXRuntime: None
08/08 17:19:43 - mmengine - INFO - ONNXRuntime-gpu:     1.15.1
08/08 17:19:43 - mmengine - INFO - ONNXRuntime custom ops:      Available
08/08 17:19:43 - mmengine - INFO - pplnn:       None
08/08 17:19:43 - mmengine - INFO - ncnn:        None
08/08 17:19:43 - mmengine - INFO - snpe:        None
08/08 17:19:43 - mmengine - INFO - openvino:    None
08/08 17:19:43 - mmengine - INFO - torchscript: 2.0.1+cu117
08/08 17:19:43 - mmengine - INFO - torchscript custom ops:      NotAvailable
08/08 17:19:43 - mmengine - INFO - rknn-toolkit:        None
08/08 17:19:43 - mmengine - INFO - rknn-toolkit2:       None
08/08 17:19:43 - mmengine - INFO - ascend:      None
08/08 17:19:43 - mmengine - INFO - coreml:      None
08/08 17:19:43 - mmengine - INFO - tvm: None
08/08 17:19:43 - mmengine - INFO - vacc:        None
08/08 17:19:43 - mmengine - INFO -

08/08 17:19:43 - mmengine - INFO - **********Codebase information**********
08/08 17:19:43 - mmengine - INFO - mmdet:       3.1.0
08/08 17:19:43 - mmengine - INFO - mmseg:       None
08/08 17:19:43 - mmengine - INFO - mmpretrain:  None
08/08 17:19:43 - mmengine - INFO - mmocr:       None
08/08 17:19:43 - mmengine - INFO - mmagic:      None
08/08 17:19:43 - mmengine - INFO - mmdet3d:     None
08/08 17:19:43 - mmengine - INFO - mmpose:      None
08/08 17:19:43 - mmengine - INFO - mmrotate:    None
08/08 17:19:43 - mmengine - INFO - mmaction:    None
08/08 17:19:43 - mmengine - INFO - mmrazor:     None
08/08 17:19:43 - mmengine - INFO - mmyolo:      None

Error traceback

No response

RunningLeon commented 1 year ago

@irexyc hi, does the SDK support ORT fp16 inference, which requires the input type to be fp16?

irexyc commented 1 year ago

Currently, the SDK does not support ONNX fp16.

Are you doing the inference with the CPU backend? With the CPU backend, onnxruntime fp16 probably gives no acceleration over fp32; you can verify that with the onnxruntime Python API: https://onnxruntime.ai/docs/api/python/api_summary.html
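To check whether fp16 actually speeds up inference, a plain-stdlib timing helper can be paired with the onnxruntime Python API. A minimal sketch: only the helper is guaranteed runnable here, and the onnxruntime calls are shown as comments because the model path, input shape, and execution provider depend on your setup (the file name is a placeholder):

```python
import time

def time_runs(run, n_warmup=3, n_iter=20):
    """Average wall-clock time of a zero-argument callable, after warm-up runs."""
    for _ in range(n_warmup):
        run()
    start = time.perf_counter()
    for _ in range(n_iter):
        run()
    return (time.perf_counter() - start) / n_iter

# With onnxruntime-gpu installed, an fp32 vs fp16 comparison would look like
# (repeat with the fp32 model and float32 input to compare averages):
#
#   import numpy as np
#   import onnxruntime as ort
#   sess = ort.InferenceSession("end2end_fp16.onnx",   # placeholder path
#                               providers=["CUDAExecutionProvider"])
#   name = sess.get_inputs()[0].name
#   inp = {name: np.random.rand(1, 3, 640, 640).astype(np.float16)}
#   print(time_runs(lambda: sess.run(None, inp)))
```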

mmvxc commented 1 year ago

Thanks, I'm using the GPU.

irexyc commented 1 year ago

> Thanks, I'm using the GPU.

Currently, you can use TensorRT for GPU devices.