open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] failed to load library libmmdeploy_ort_net.so #2214

Closed spa-yhson closed 1 year ago

spa-yhson commented 1 year ago

Describe the bug

When I run /demo/python/image_segmentation.py, it fails with the error "failed to load library libmmdeploy_ort_net.so".

I created the environment using the provided Docker image.

Reproduction

  1. python ./demo/python/image_segmentation.py cuda /mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/end2end.engine cityscapes_demo.png

  2. I did not make any modifications to the code or config.

Environment

06/26 12:33:31 - mmengine - INFO - **********Environmental information**********
06/26 12:33:33 - mmengine - INFO - sys.platform: linux
06/26 12:33:33 - mmengine - INFO - Python: 3.8.16 (default, Mar  2 2023, 03:21:46) [GCC 11.2.0]
06/26 12:33:33 - mmengine - INFO - CUDA available: True
06/26 12:33:33 - mmengine - INFO - numpy_random_seed: 2147483648
06/26 12:33:33 - mmengine - INFO - GPU 0,1,2,3: NVIDIA GeForce RTX 3090
06/26 12:33:33 - mmengine - INFO - CUDA_HOME: /usr/local/cuda
06/26 12:33:33 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.124
06/26 12:33:33 - mmengine - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
06/26 12:33:33 - mmengine - INFO - PyTorch: 1.10.0
06/26 12:33:33 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

06/26 12:33:33 - mmengine - INFO - TorchVision: 0.11.0
06/26 12:33:33 - mmengine - INFO - OpenCV: 4.7.0
06/26 12:33:33 - mmengine - INFO - MMEngine: 0.7.3
06/26 12:33:33 - mmengine - INFO - MMCV: 2.0.0
06/26 12:33:33 - mmengine - INFO - MMCV Compiler: GCC 9.3
06/26 12:33:33 - mmengine - INFO - MMCV CUDA Compiler: 11.3
06/26 12:33:33 - mmengine - INFO - MMDeploy: 1.1.0+faf05fe
06/26 12:33:33 - mmengine - INFO - 

06/26 12:33:33 - mmengine - INFO - **********Backend information**********
06/26 12:33:33 - mmengine - INFO - tensorrt:    8.2.4.2
06/26 12:33:33 - mmengine - INFO - tensorrt custom ops: Available
06/26 12:33:33 - mmengine - INFO - ONNXRuntime: None
06/26 12:33:33 - mmengine - INFO - ONNXRuntime-gpu: 1.8.1
06/26 12:33:33 - mmengine - INFO - ONNXRuntime custom ops:  Available
06/26 12:33:33 - mmengine - INFO - pplnn:   None
06/26 12:33:33 - mmengine - INFO - ncnn:    None
06/26 12:33:33 - mmengine - INFO - snpe:    None
06/26 12:33:33 - mmengine - INFO - openvino:    None
06/26 12:33:33 - mmengine - INFO - torchscript: 1.10.0
06/26 12:33:33 - mmengine - INFO - torchscript custom ops:  NotAvailable
06/26 12:33:33 - mmengine - INFO - rknn-toolkit:    None
06/26 12:33:33 - mmengine - INFO - rknn-toolkit2:   None
06/26 12:33:33 - mmengine - INFO - ascend:  None
06/26 12:33:33 - mmengine - INFO - coreml:  None
06/26 12:33:33 - mmengine - INFO - tvm: None
06/26 12:33:33 - mmengine - INFO - vacc:    None
06/26 12:33:33 - mmengine - INFO - 

06/26 12:33:33 - mmengine - INFO - **********Codebase information**********
06/26 12:33:33 - mmengine - INFO - mmdet:   None
06/26 12:33:33 - mmengine - INFO - mmseg:   1.0.0
06/26 12:33:33 - mmengine - INFO - mmpretrain:  None
06/26 12:33:33 - mmengine - INFO - mmocr:   None
06/26 12:33:33 - mmengine - INFO - mmagic:  None
06/26 12:33:33 - mmengine - INFO - mmdet3d: None
06/26 12:33:33 - mmengine - INFO - mmpose:  None
06/26 12:33:33 - mmengine - INFO - mmrotate:    None
06/26 12:33:33 - mmengine - INFO - mmaction:    None
06/26 12:33:33 - mmengine - INFO - mmrazor: None

Error traceback

python ./demo/python/image_segmentation.py cuda /mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/end2end.engine cityscapes_demo.png
loading libmmdeploy_trt_net.so ...
loading libmmdeploy_ort_net.so ...
failed to load library libmmdeploy_ort_net.so
[ WARN:0@4.479] global loadsave.cpp:244 findDecoder imread_('cityscapes_demo.png'): can't open/read file: check file path/integrity
[2023-06-26 12:30:51.244] [mmdeploy] [error] [model.cpp:40] Failed to load model: "/mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/end2end.engine", implementations tried: [("DirectoryModel", 0)]
[2023-06-26 12:30:51.244] [mmdeploy] [error] [model.cpp:16] Failed to load model "/mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/end2end.engine"
[2023-06-26 12:30:51.244] [mmdeploy] [error] [model.cpp:21] failed to create model: not supported (2) @ /__w/mmdeploy/mmdeploy/csrc/mmdeploy/core/model.cpp:41
Traceback (most recent call last):
  File "./demo/python/image_segmentation.py", line 54, in <module>
    main()
  File "./demo/python/image_segmentation.py", line 35, in main
    segmentor = Segmentor(
RuntimeError: failed to create segmentor

irexyc commented 1 year ago

It seems you are using the TensorRT backend (libmmdeploy_trt_net.so was loaded). Therefore, you can ignore the message "failed to load library libmmdeploy_ort_net.so".

The log shows two errors.

[ WARN:0@4.479] global loadsave.cpp:244 findDecoder imread_('cityscapes_demo.png'): can't open/read file: check file path/integrity

Please check the path of cityscapes_demo.png. This warning indicates that the image cannot be read.

[2023-06-26 12:30:51.244] [mmdeploy] [error] [model.cpp:40] Failed to load model: "/mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/end2end.engine", implementations tried: [("DirectoryModel", 0)]

You should pass the path of the model folder instead of the path of the engine file. Try /mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/ (don't forget to pass --dump-info when you convert the model).

spa-yhson commented 1 year ago

I confirmed that the cityscapes_demo.png file is located at the specified path. Also, changing the path to the model folder yields the same error.

irexyc commented 1 year ago

@spa-yhson

image_segmentation.py accepts three positional arguments: device_name, model_path and image_path.

error 1:

[ WARN:0@4.479] global loadsave.cpp:244 findDecoder imread_('cityscapes_demo.png'): can't open/read file: check file path/integrity

This message is printed by cv2. You can check whether img is an ndarray or None at this line: https://github.com/open-mmlab/mmdeploy/blob/main/demo/python/image_segmentation.py#L33

If you are sure the image path is right, then you need to reinstall OpenCV to make sure it can read images.
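
Before reinstalling OpenCV, it may help to rule out a path problem. The following is a minimal sketch using only the Python standard library; the helper name `diagnose_image_path` is hypothetical and is not part of mmdeploy or OpenCV. It relies on the fact that a relative path such as `cityscapes_demo.png` is resolved against the current working directory, not against the location of the script.

```python
import os


def diagnose_image_path(image_path):
    """Describe whether an image path points at a readable, non-empty file.

    Hypothetical helper for illustration; not part of mmdeploy or OpenCV.
    """
    # Relative paths are resolved against the current working directory.
    resolved = os.path.abspath(image_path)
    if not os.path.exists(resolved):
        return f"missing: {resolved}"
    if not os.path.isfile(resolved):
        return f"not a regular file: {resolved}"
    if not os.access(resolved, os.R_OK):
        return f"not readable: {resolved}"
    if os.path.getsize(resolved) == 0:
        return f"empty file: {resolved}"
    return f"ok: {resolved}"


if __name__ == "__main__":
    # Run this from the same directory in which the demo is launched.
    print(diagnose_image_path("cityscapes_demo.png"))
```

If this reports `ok:` and the image still cannot be read, the file contents or the OpenCV installation are the more likely cause.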

error 2: You should pass the model folder instead of the engine file. According to your error log, you should pass /mnt/mmsegmentation/work_dirs/pidnet-l_2xb6-120k_1024x1024-cityscapes/

The SDK needs not only the backend file (.engine for TensorRT, etc.) but also some JSON files that describe the model and pipeline. The model directory structure should look like this:

.
├── deploy.json
├── detail.json
├── end2end.engine
├── end2end.onnx
└── pipeline.json

If your model directory doesn't contain these JSON files, you should reconvert the model and pass --dump-info to deploy.py.
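
A quick way to check a model folder before running the demo could look like the sketch below. It only uses the standard library, and the function names `resolve_model_dir` and `missing_sdk_files` are hypothetical helpers, not mmdeploy APIs. The list of required file names is taken from the directory listing above.

```python
import os

# JSON file names taken from the directory listing in this thread.
REQUIRED_JSON = ("deploy.json", "detail.json", "pipeline.json")


def resolve_model_dir(model_path):
    """Return the folder to pass as model_path (hypothetical helper).

    If model_path points at a backend file such as end2end.engine,
    its parent folder is returned; a folder is returned unchanged.
    """
    resolved = os.path.abspath(model_path)
    if os.path.isfile(resolved):
        return os.path.dirname(resolved)
    return resolved


def missing_sdk_files(model_path):
    """List the required JSON files that are absent from the model folder."""
    model_dir = resolve_model_dir(model_path)
    return [
        name
        for name in REQUIRED_JSON
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


if __name__ == "__main__":
    import sys

    missing = missing_sdk_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all SDK JSON files found")
```

An empty result means the three JSON files are present; any names it returns suggest the model was converted without --dump-info.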

spa-yhson commented 1 year ago

Thank you so much for your help. The issue is resolved.