open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Can not deploy yolo_x_s mmdetection model #1883

Open aixiaodewugege opened 1 year ago

aixiaodewugege commented 1 year ago

Checklist

Describe the bug

I tried to convert yolox_s to ONNX format, but it failed.

Reproduction

python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py mmdetection/configs/yolox/yolox_s_8x8_300e_coco.py ~/.cache/torch/hub/checkpoints/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth mmdetection/demo/demo.jpg --test-img mmdetection/demo/demo.jpg --work-dir ./result --device cpu --dump-info

Environment

2023-03-15 19:42:58,561 - mmdeploy - INFO - **********Environmental information**********
2023-03-15 19:42:58,671 - mmdeploy - INFO - sys.platform: linux
2023-03-15 19:42:58,672 - mmdeploy - INFO - Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
2023-03-15 19:42:58,672 - mmdeploy - INFO - CUDA available: True
2023-03-15 19:42:58,672 - mmdeploy - INFO - GPU 0: NVIDIA GeForce RTX 3090
2023-03-15 19:42:58,672 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-03-15 19:42:58,672 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.7, V11.7.64
2023-03-15 19:42:58,672 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
2023-03-15 19:42:58,672 - mmdeploy - INFO - PyTorch: 1.13.1+cu117
2023-03-15 19:42:58,672 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.7
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.5
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

2023-03-15 19:42:58,672 - mmdeploy - INFO - TorchVision: 0.14.1
2023-03-15 19:42:58,672 - mmdeploy - INFO - OpenCV: 4.7.0
2023-03-15 19:42:58,672 - mmdeploy - INFO - MMCV: 1.7.1
2023-03-15 19:42:58,672 - mmdeploy - INFO - MMCV Compiler: GCC 9.3
2023-03-15 19:42:58,672 - mmdeploy - INFO - MMCV CUDA Compiler: 11.7
2023-03-15 19:42:58,672 - mmdeploy - INFO - MMDeploy: 0.13.0+34c6866
2023-03-15 19:42:58,672 - mmdeploy - INFO - 

2023-03-15 19:42:58,672 - mmdeploy - INFO - **********Backend information**********
2023-03-15 19:42:58,692 - mmdeploy - INFO - tensorrt:   8.5.3.1
2023-03-15 19:42:58,692 - mmdeploy - INFO - tensorrt custom ops:        NotAvailable
2023-03-15 19:42:58,708 - mmdeploy - INFO - ONNXRuntime:        1.8.1
2023-03-15 19:42:58,708 - mmdeploy - INFO - ONNXRuntime-gpu:    None
2023-03-15 19:42:58,708 - mmdeploy - INFO - ONNXRuntime custom ops:     Available
2023-03-15 19:42:58,708 - mmdeploy - INFO - pplnn:      None
2023-03-15 19:42:58,709 - mmdeploy - INFO - ncnn:       None
2023-03-15 19:42:58,709 - mmdeploy - INFO - snpe:       None
2023-03-15 19:42:58,710 - mmdeploy - INFO - openvino:   2022.3.0
2023-03-15 19:42:58,710 - mmdeploy - INFO - torchscript:        1.13.1
2023-03-15 19:42:58,710 - mmdeploy - INFO - torchscript custom ops:     NotAvailable
2023-03-15 19:42:58,729 - mmdeploy - INFO - rknn-toolkit:       None
2023-03-15 19:42:58,729 - mmdeploy - INFO - rknn2-toolkit:      None
2023-03-15 19:42:58,730 - mmdeploy - INFO - ascend:     None
2023-03-15 19:42:58,730 - mmdeploy - INFO - coreml:     None
2023-03-15 19:42:58,730 - mmdeploy - INFO - tvm:        None
2023-03-15 19:42:58,730 - mmdeploy - INFO - 

2023-03-15 19:42:58,730 - mmdeploy - INFO - **********Codebase information**********
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmdet:      2.28.1
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmseg:      None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmcls:      None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmocr:      None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmedit:     None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmdet3d:    None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmpose:     0.29.0
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmrotate:   None
2023-03-15 19:42:58,731 - mmdeploy - INFO - mmaction:   0.24.1

Error traceback

/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
2023-03-15 19:28:00,133 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
load checkpoint from local path: /home/wushuchen/.cache/torch/hub/checkpoints/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth
2023-03-15 19:28:00,836 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
2023-03-15 19:28:00,837 - mmdeploy - INFO - Export PyTorch model to ONNX: ./result/end2end.onnx.
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:97: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  iou_threshold = torch.tensor([iou_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:98: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  score_threshold = torch.tensor([score_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:28: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  k = torch.tensor(k, device=input.device, dtype=torch.long)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:34: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return ctx.origin_func(input, k, dim=dim, largest=largest, sorted=sorted)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:38: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  score_threshold = float(score_threshold)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:39: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  iou_threshold = float(iou_threshold)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:171: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(1) == 4
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:172: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(0) == scores.size(0)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:5409: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  "Exporting aten::index operator of advanced indexing in opset "
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:84: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, dtype=torch.long))
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:96: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, iou_threshold, score_threshold)
2023-03-15 19:28:02,121 - mmdeploy - INFO - Execute onnx optimize passes.
2023-03-15 19:28:02,307 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
2023-03-15 19:28:02,532 - mmdeploy - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
2023-03-15 19:28:02,538 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
2023-03-15 19:28:02,538 - mmdeploy - INFO - visualize onnxruntime model start.
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
2023-03-15 19:28:03,975 - mmdeploy - INFO - Successfully loaded onnxruntime custom ops from /home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so
2023-03-15 19:28:04,189 - mmdeploy - ERROR - visualize onnxruntime model failed.
(open-mmlab) wushuchen@wushuchen:~/projects/open-mmlab$ python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py mmdetection/configs/yolox/yolox_s_8x8_300e_coco.py ~/.cache/torch/hub/checkpoints/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth mmdetection/demo/demo.jpg --work-dir ./result  --device cpu
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
2023-03-15 19:33:19,328 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
load checkpoint from local path: /home/wushuchen/.cache/torch/hub/checkpoints/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth
2023-03-15 19:33:20,002 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
2023-03-15 19:33:20,002 - mmdeploy - INFO - Export PyTorch model to ONNX: ./result/end2end.onnx.
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:97: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  iou_threshold = torch.tensor([iou_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:98: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  score_threshold = torch.tensor([score_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:28: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  k = torch.tensor(k, device=input.device, dtype=torch.long)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:34: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return ctx.origin_func(input, k, dim=dim, largest=largest, sorted=sorted)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:38: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  score_threshold = float(score_threshold)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:39: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  iou_threshold = float(iou_threshold)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:171: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(1) == 4
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:172: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(0) == scores.size(0)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:5409: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  "Exporting aten::index operator of advanced indexing in opset "
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:84: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, dtype=torch.long))
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:96: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, iou_threshold, score_threshold)
2023-03-15 19:33:21,274 - mmdeploy - INFO - Execute onnx optimize passes.
2023-03-15 19:33:21,452 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
2023-03-15 19:33:21,667 - mmdeploy - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
2023-03-15 19:33:21,673 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
2023-03-15 19:33:21,673 - mmdeploy - INFO - visualize onnxruntime model start.
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
2023-03-15 19:33:23,094 - mmdeploy - INFO - Successfully loaded onnxruntime custom ops from /home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so
2023-03-15 19:33:23,299 - mmdeploy - ERROR - visualize onnxruntime model failed.
hanrui1sensetime commented 1 year ago

Could you please provide the ONNX file generated by deploy.py? I need to compare it with a known-good model. Alternatively, you can try converting this model to ONNX on the Deployee Platform with the mmdeploy-1.x branch.

aixiaodewugege commented 1 year ago

Could you please provide the ONNX file generated by deploy.py? I need to compare it with a known-good model. Alternatively, you can try converting this model to ONNX on the Deployee Platform with the mmdeploy-1.x branch.

Thanks for your reply. I have uploaded my ONNX file here: https://pan.baidu.com/s/1fmE3t7CWzsfY2IGDtsilAg (extraction code: C74k). Could you please take a look? I think the problem is in the inference part, since the error is "visualize onnxruntime model failed".

hanrui1sensetime commented 1 year ago

Could you please provide the ONNX file generated by deploy.py? I need to compare it with a known-good model. Alternatively, you can try converting this model to ONNX on the Deployee Platform with the mmdeploy-1.x branch.

Thanks for your reply. I have uploaded my ONNX file here: https://pan.baidu.com/s/1fmE3t7CWzsfY2IGDtsilAg (extraction code: C74k). Could you please take a look? I think the problem is in the inference part, since the error is "visualize onnxruntime model failed".

Could you please replace this statement with

visualize_model(model_cfg_path, deploy_cfg_path, backend_files, args.test_img, args.device, **kwargs)

and post the failing log?
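
For reference, here is a rough standalone sketch of that visualization step (paths are copied from your earlier command, and the exact keyword arguments may differ slightly between mmdeploy versions):

from mmdeploy.apis import visualize_model

# paths taken from the deploy command above; adjust if yours differ
deploy_cfg = 'mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py'
model_cfg = 'mmdetection/configs/yolox/yolox_s_8x8_300e_coco.py'
backend_files = ['./result/end2end.onnx']
img = 'mmdetection/demo/demo.jpg'

# runs the ONNX model with onnxruntime and draws the detections
visualize_model(model_cfg, deploy_cfg, backend_files, img, 'cpu',
                output_file='./result/output_onnxruntime.jpg')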

aixiaodewugege commented 1 year ago

(open-mmlab) wushuchen@wushuchen:~/projects/open-mmlab$ python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py mmdetection/configs/yolox/yolox_tiny_8x8_300e_coco.py mmdetection/checkpoint/yolox_tiny_8x8_300e_coco_20211124_171234-b4047906.pth mmdeploy/demo/resources/det.jpg --work-dir mmdeploy_models/mmdetection/yolox --device cpu --dump-info
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
2023-03-20 20:10:53,050 - mmdeploy - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
load checkpoint from local path: mmdetection/checkpoint/yolox_tiny_8x8_300e_coco_20211124_171234-b4047906.pth
2023-03-20 20:10:53,722 - mmdeploy - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
2023-03-20 20:10:53,723 - mmdeploy - INFO - Export PyTorch model to ONNX: mmdeploy_models/mmdetection/yolox/end2end.onnx.
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/core/optimizers/function_marker.py:158: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3190.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:97: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  iou_threshold = torch.tensor([iou_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/codebase/mmdet/core/post_processing/bbox_nms.py:98: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  score_threshold = torch.tensor([score_threshold], dtype=torch.float32)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:28: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  k = torch.tensor(k, device=input.device, dtype=torch.long)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/pytorch/functions/topk.py:34: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  return ctx.origin_func(input, k, dim=dim, largest=largest, sorted=sorted)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:38: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  score_threshold = float(score_threshold)
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:39: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  iou_threshold = float(iou_threshold)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:171: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(1) == 4
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/ops/nms.py:172: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert boxes.size(0) == scores.size(0)
/home/wushuchen/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:5409: UserWarning: Exporting aten::index operator of advanced indexing in opset 11 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  "Exporting aten::index operator of advanced indexing in opset "
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:84: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, dtype=torch.long))
/home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/mmcv/ops/nms.py:96: FutureWarning: 'torch.onnx._patch_torch._graph_op' is deprecated in version 1.13 and will be removed in version 1.14. Please note 'g.op()' is to be removed from torch.Graph. Please open a GitHub issue if you need this functionality..
  max_output_boxes_per_class, iou_threshold, score_threshold)
2023-03-20 20:10:55,037 - mmdeploy - INFO - Execute onnx optimize passes.
2023-03-20 20:10:55,187 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
2023-03-20 20:10:55,475 - mmdeploy - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
2023-03-20 20:10:55,483 - mmdeploy - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
2023-03-20 20:10:55,504 - mmdeploy - INFO - Successfully loaded onnxruntime custom ops from /home/wushuchen/projects/open-mmlab/mmdeploy/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so
free(): invalid pointer
Aborted (core dumped)

hanrui1sensetime commented 1 year ago

free(): invalid pointer

Maybe you can try another detection model such as RetinaNet to determine whether the bug comes from the model structure or from onnxruntime. You can also post the log in this issue.
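
For example, something along these lines (the RetinaNet checkpoint path below is a placeholder; use whichever checkpoint you downloaded):

python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py mmdetection/configs/retinanet/retinanet_r50_fpn_1x_coco.py <path/to/retinanet_r50_fpn_1x_coco.pth> mmdetection/demo/demo.jpg --work-dir ./result_retinanet --device cpu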

aixiaodewugege commented 1 year ago

free(): invalid pointer

Maybe you can try another detection model such as RetinaNet to determine whether the bug comes from the model structure or from onnxruntime. You can also post the log in this issue.

I also tried an RTMDet model on the 1.x branch; it still cannot run inference.

03/21 13:12:47 - mmengine - INFO - Execute onnx optimize passes.
03/21 13:12:48 - mmengine - INFO - Finish pipeline mmdeploy.apis.pytorch2onnx.torch2onnx
03/21 13:12:48 - mmengine - INFO - Start pipeline mmdeploy.apis.utils.utils.to_backend in main process
03/21 13:12:48 - mmengine - INFO - Finish pipeline mmdeploy.apis.utils.utils.to_backend
03/21 13:12:48 - mmengine - INFO - visualize onnxruntime model start.
03/21 13:12:49 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
03/21 13:12:49 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "mmdet_tasks" registry tree. As a workaround, the current "mmdet_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
03/21 13:12:49 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
03/21 13:12:49 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "backend_detectors" registry tree. As a workaround, the current "backend_detectors" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
03/21 13:12:49 - mmengine - INFO - Successfully loaded onnxruntime custom ops from /home/wushuchen/projects/mmyolo/mmdeploy/mmdeploy/lib/libmmdeploy_onnxruntime_ops.so
/opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_vector.h:932: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = int; _Alloc = std::allocator<int>; std::vector<_Tp, _Alloc>::reference = int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
03/21 13:12:49 - mmengine - ERROR - mmdeploy/tools/deploy.py - create_process - 82 - visualize onnxruntime model failed.

hanrui1sensetime commented 1 year ago

I also tried an RTMDet model on the 1.x branch; it still cannot run inference.

Try onnxruntime-gpu, and run with CUDA enabled.
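
For example (pinning to the ONNXRuntime 1.8.1 already in your environment; adjust the version if needed):

pip uninstall onnxruntime
pip install onnxruntime-gpu==1.8.1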

aixiaodewugege commented 1 year ago

Thanks for your reply. After setting the device to cuda with the following command, I can convert the model:

python mmdeploy/tools/deploy.py mmdeploy/configs/mmdet/detection/detection_onnxruntime_dynamic.py mmdetection/configs/yolox/yolox_s_8xb8-300e_coco.py ~/.cache/torch/hub/checkpoints/yolox_s_8x8_300e_coco_20211121_095711-4592a793.pth mmdeploy/demo/resources/det.jpg --work-dir mmdeploy_models/mmdetection/yolox --device cuda --dump-info

However, I still cannot run inference with the Python SDK:

import cv2
from mmdeploy_python import Detector

img = cv2.imread('mmdetection/demo/demo.jpg')
onnx_path = 'mmdeploy_models/mmdetection/yolox'  # SDK model directory produced by deploy.py with --dump-info
detector = Detector(onnx_path, device_name='cuda', device_id=0)
print(detector)
result = detector(img)

The errors are as follows:

(open-mmyolo) wushuchen@wushuchen:~/projects/mmyolo$ python main.py 
[2023-03-22 11:17:47.870] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "mmdeploy_models/mmdetection/yolox"
[2023-03-22 11:17:47.870] [mmdeploy] [error] [common.cpp:67] Device "cuda" not found
Traceback (most recent call last):
  File "main.py", line 53, in <module>
    detector = Detector(onnx_path,device_name='cuda',device_id=0)
RuntimeError: failed to create detector

Besides, I want to run the model on a CPU platform; why can't I convert it for CPU?

hanrui1sensetime commented 1 year ago

[2023-03-22 11:17:47.870] [mmdeploy] [error] [common.cpp:67] Device "cuda" not found

export PYTHONPATH=${MMDEPLOY_DIR}/build/lib:$PYTHONPATH
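
After exporting, you can quickly check that the SDK Python bindings are picked up from the build directory (just a sanity check, not part of deploy.py):

python -c "from mmdeploy_python import Detector; print('mmdeploy_python loaded')"
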
aixiaodewugege commented 1 year ago

Same error after adding it.

(open-mmyolo) wushuchen@wushuchen: ~/projects/mmyolo$ export PYTHONPATH=~/projects/mmyolo/mmdeploy/build/lib:$PYTHONPATH

(open-mmyolo) wushuchen@wushuchen:~/projects/mmyolo$ python main.py
[2023-03-22 11:27:54.537] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "mmdeploy_models/mmdetection/yolox"
[2023-03-22 11:27:54.537] [mmdeploy] [error] [common.cpp:67] Device "cuda" not found
Traceback (most recent call last):
  File "main.py", line 53, in <module>
    detector = Detector(onnx_path,device_name='cuda',device_id=0)
RuntimeError: failed to create detector

And I still can't run inference on CPU. Could you please have a quick look?

(open-mmyolo) wushuchen@wushuchen:~/projects/mmyolo$ python main.py
[2023-03-22 11:31:24.766] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "mmdeploy_models/mmdetection/yolox_cpu"
<mmdeploy_python.Detector object at 0x7f6aaa116cb0>
/opt/rh/devtoolset-9/root/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = int; _Alloc = std::allocator<int>; std::vector<_Tp, _Alloc>::reference = int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.
Aborted (core dumped)

hanrui1sensetime commented 1 year ago

[2023-03-22 11:27:54.537] [mmdeploy] [error] [common.cpp:67] Device "cuda" not found

Add -DMMDEPLOY_TARGET_DEVICES="cpu;cuda" when compiling; you can refer to this tutorial.
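
A rough example of the SDK build configuration (only a sketch for the onnxruntime backend; the full set of variables, e.g. ONNXRUNTIME_DIR and the CUDA-related ones, is described in the build tutorial):

cd ${MMDEPLOY_DIR}/build
cmake .. \
    -DMMDEPLOY_BUILD_SDK=ON \
    -DMMDEPLOY_BUILD_SDK_PYTHON_API=ON \
    -DMMDEPLOY_TARGET_BACKENDS="ort" \
    -DMMDEPLOY_TARGET_DEVICES="cpu;cuda" \
    -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR}
make -j$(nproc) && make install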

aixiaodewugege commented 1 year ago

But I actually want to run the model on CPU. Could you please show me how to fix the CPU inference?

hanrui1sensetime commented 1 year ago

Installing onnxruntime-gpu instead of onnxruntime can avoid this bug, even if you only want to run the model on CPU.

oym050922021 commented 1 year ago

But I actually want to run the model on CPU. Could you please show me how to fix the CPU inference?

@aixiaodewugege Hello, may I ask how the problem was solved? I encountered the same problem: the model cannot be converted on CPU. Thank you!

aixiaodewugege commented 1 year ago

It is solved. It was a problem in the deploy config file; for me, it was the input image size. Check yours again~~
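
For anyone hitting the same thing: the field that usually controls this is input_shape in the onnx_config of the deploy config. A minimal sketch (the file name below is made up, 640x640 is assumed to match YOLOX's test pipeline, and the base config names can differ between mmdeploy versions):

# e.g. a new file detection_onnxruntime_static_640.py (hypothetical name)
_base_ = ['./detection_onnxruntime_static.py']
# export with a fixed 640x640 input instead of fully dynamic shapes
onnx_config = dict(input_shape=(640, 640))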

oym050922021 commented 1 year ago

@aixiaodewugege Hi, thank you for your reply. The problem has been fixed.

jamestkpoon commented 1 year ago

@oym050922021 Can you please share your deploy config file? I also seem to be getting:

/opt/rh/devtoolset-9/root/usr/include/c++/9/bits/stl_vector.h:1042: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = int; _Alloc = std::allocator<int>; std::vector<_Tp, _Alloc>::reference = int&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__builtin_expect(__n < this->size(), true)' failed.