open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] c++ SDK error on inference text recognition: invalid argument (1) #1750

Closed ThinkWD closed 1 year ago

ThinkWD commented 1 year ago

Describe the bug

The C++ SDK demo ocr.cpp, built on version 0.11.0, fails when execution reaches mmdeploy_text_recognizer_apply_bbox, with this error:

terminate called after throwing an instance of 'system_error2::status_error<mmdeploy::StatusDomain>'
  what():  invalid argument (1) @ /root/workspace/mmdeploy/csrc/mmdeploy/device/cuda/cuda_device.cpp:171
Aborted (core dumped)

Reproduction

The Dockerfile used to build the image: Dockerfile.zip

I commented out the SDK build section of the Dockerfile and built the SDK manually with the following command:

cmake .. \
    -D CMAKE_CXX_COMPILER=g++ \
    -D MMDEPLOY_SHARED_LIBS=ON \
    -D MMDEPLOY_BUILD_SDK=ON \
    -D MMDEPLOY_BUILD_SDK_CXX_API=ON \
    -D MMDEPLOY_BUILD_EXAMPLES=ON \
    -D MMDEPLOY_TARGET_DEVICES="cuda;cpu" \
    -D MMDEPLOY_TARGET_BACKENDS="ort;trt" \
    -D MMDEPLOY_CODEBASES=all \
    -D TENSORRT_DIR=${TENSORRT_DIR} \
    -D ONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} \
    -D pplcv_DIR=/root/workspace/mmdeploy/pplcv/lib/cmake/ppl

After converting the models, I ran the demo with the following commands:

cd /root/workspace/mmdeploy/build/install/bin
./ocr cuda /root/workspace/mmdeploy/checkpoints/dbnet/TensorRT /root/workspace/mmdeploy/checkpoints/crnn/TensorRT /root/workspace/mmdeploy/checkpoints/dbnet/dbnet.jpg

A Docker image that reproduces the problem and contains the model files I used. Link: https://pan.baidu.com/s/1IhGF5JOT8p_NS1lsOBgYaA?pwd=8u59 Extraction code: 8u59

Environment

2023-02-13 05:51:59,048 - mmdeploy - INFO - **********Environmental information**********
2023-02-13 05:51:59,219 - mmdeploy - INFO - sys.platform: linux
2023-02-13 05:51:59,219 - mmdeploy - INFO - Python: 3.8.16 (default, Jan 17 2023, 23:13:24) [GCC 11.2.0]
2023-02-13 05:51:59,219 - mmdeploy - INFO - CUDA available: True
2023-02-13 05:51:59,219 - mmdeploy - INFO - GPU 0: NVIDIA GeForce RTX 3080
2023-02-13 05:51:59,219 - mmdeploy - INFO - CUDA_HOME: /usr/local/cuda
2023-02-13 05:51:59,219 - mmdeploy - INFO - NVCC: Cuda compilation tools, release 11.6, V11.6.124
2023-02-13 05:51:59,219 - mmdeploy - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
2023-02-13 05:51:59,219 - mmdeploy - INFO - PyTorch: 1.10.0
2023-02-13 05:51:59,219 - mmdeploy - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

2023-02-13 05:51:59,219 - mmdeploy - INFO - TorchVision: 0.11.0
2023-02-13 05:51:59,219 - mmdeploy - INFO - OpenCV: 4.7.0
2023-02-13 05:51:59,219 - mmdeploy - INFO - MMCV: 1.5.3
2023-02-13 05:51:59,219 - mmdeploy - INFO - MMCV Compiler: GCC 7.3
2023-02-13 05:51:59,219 - mmdeploy - INFO - MMCV CUDA Compiler: 11.3
2023-02-13 05:51:59,219 - mmdeploy - INFO - MMDeploy: 0.11.0+2a1fed9
2023-02-13 05:51:59,219 - mmdeploy - INFO - 

2023-02-13 05:51:59,219 - mmdeploy - INFO - **********Backend information**********
2023-02-13 05:51:59,642 - mmdeploy - INFO - onnxruntime: 1.8.1  ops_is_avaliable : True
2023-02-13 05:51:59,665 - mmdeploy - INFO - tensorrt: 8.2.4.2   ops_is_avaliable : True
2023-02-13 05:51:59,676 - mmdeploy - INFO - ncnn: None  ops_is_avaliable : False
2023-02-13 05:51:59,677 - mmdeploy - INFO - pplnn_is_avaliable: False
2023-02-13 05:51:59,678 - mmdeploy - INFO - openvino_is_avaliable: False
2023-02-13 05:51:59,689 - mmdeploy - INFO - snpe_is_available: False
2023-02-13 05:51:59,690 - mmdeploy - INFO - ascend_is_available: False
2023-02-13 05:51:59,691 - mmdeploy - INFO - coreml_is_available: False
2023-02-13 05:51:59,691 - mmdeploy - INFO - 

2023-02-13 05:51:59,691 - mmdeploy - INFO - **********Codebase information**********
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmdet:      None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmseg:      None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmcls:      None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmocr:      None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmedit:     None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmdet3d:    None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmpose:     None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmrotate:   None
2023-02-13 05:51:59,691 - mmdeploy - INFO - mmaction:   None

Error traceback

[2023-02-13 05:54:27.133] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "/root/workspace/mmdeploy/checkpoints/dbnet/TensorRT"
[2023-02-13 05:54:27.184] [mmdeploy] [info] [inference.cpp:44] {
  "context": {
    "device": "<any>",
    "model": "<any>",
    "stream": "<any>"
  },
  "pipeline": {
    "input": [
      "img"
    ],
    "output": [
      "post_output"
    ],
    "tasks": [
      {
        "fuse_transform": false,
        "input": [
          "img"
        ],
        "module": "Transform",
        "name": "Preprocess",
        "output": [
          "prep_output"
        ],
        "sha256": "83ba9eb66901a32e1fe5ebcff0a6375706597472e185e1d94aee2043a7399d3b",
        "transforms": [
          {
            "color_type": "color_ignore_orientation",
            "type": "LoadImageFromFile"
          },
          {
            "keep_ratio": true,
            "size": [
              736,
              1333
            ],
            "type": "Resize"
          },
          {
            "mean": [
              123.675,
              116.28,
              103.53
            ],
            "std": [
              58.395,
              57.12,
              57.375
            ],
            "to_rgb": true,
            "type": "Normalize"
          },
          {
            "size_divisor": 32,
            "type": "Pad"
          },
          {
            "type": "DefaultFormatBundle"
          },
          {
            "keys": [
              "img"
            ],
            "meta_keys": [
              "valid_ratio",
              "ori_shape",
              "scale_factor",
              "filename",
              "img_norm_cfg",
              "pad_shape",
              "ori_filename",
              "img_shape",
              "flip_direction",
              "flip"
            ],
            "type": "Collect"
          }
        ],
        "type": "Task"
      },
      {
        "input": [
          "prep_output"
        ],
        "input_map": {
          "img": "input"
        },
        "is_batched": true,
        "module": "Net",
        "name": "dbnet",
        "output": [
          "infer_output"
        ],
        "output_map": {},
        "type": "Task"
      },
      {
        "component": "DBHead",
        "input": [
          "prep_output",
          "infer_output"
        ],
        "module": "mmocr",
        "name": "postprocess",
        "output": [
          "post_output"
        ],
        "params": {
          "in_channels": 256,
          "loss": {
            "alpha": 5.0,
            "bbce_loss": true,
            "beta": 10.0,
            "type": "DBLoss"
          },
          "postprocessor": {
            "text_repr_type": "poly",
            "type": "DBPostprocessor"
          }
        },
        "type": "Task"
      }
    ]
  }
}
[2023-02-13 05:54:28.051] [mmdeploy] [info] [inference.cpp:56] ["img"] <- ["img"]
[2023-02-13 05:54:28.051] [mmdeploy] [info] [inference.cpp:67] ["post_output"] -> ["dets"]
[2023-02-13 05:54:28.051] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "/root/workspace/mmdeploy/checkpoints/crnn/TensorRT"
[2023-02-13 05:54:28.052] [mmdeploy] [info] [inference.cpp:44] {
  "context": {
    "device": "<any>",
    "model": "<any>",
    "stream": "<any>"
  },
  "pipeline": {
    "input": [
      "img"
    ],
    "output": [
      "post_output"
    ],
    "tasks": [
      {
        "input": [
          "img"
        ],
        "module": "Transform",
        "name": "Preprocess",
        "output": [
          "prep_output"
        ],
        "transforms": [
          {
            "color_type": "grayscale",
            "type": "LoadImageFromFile"
          },
          {
            "height": 32,
            "keep_aspect_ratio": true,
            "max_width": null,
            "min_width": 32,
            "type": "ResizeOCR"
          },
          {
            "mean": [
              127
            ],
            "std": [
              127
            ],
            "type": "Normalize"
          },
          {
            "type": "DefaultFormatBundle"
          },
          {
            "keys": [
              "img"
            ],
            "meta_keys": [
              "ori_filename",
              "scale_factor",
              "img_norm_cfg",
              "valid_ratio",
              "flip",
              "flip_direction",
              "ori_shape",
              "img_shape",
              "resize_shape",
              "filename",
              "pad_shape"
            ],
            "type": "Collect"
          }
        ],
        "type": "Task"
      },
      {
        "input": [
          "prep_output"
        ],
        "input_map": {
          "img": "input"
        },
        "is_batched": true,
        "module": "Net",
        "name": "crnnnet",
        "output": [
          "infer_output"
        ],
        "output_map": {},
        "type": "Task"
      },
      {
        "component": "CTCConvertor",
        "input": [
          "prep_output",
          "infer_output"
        ],
        "module": "mmocr",
        "name": "postprocess",
        "output": [
          "post_output"
        ],
        "params": {
          "dict_list": [
            "0",
            "1",
            "2",
            "3",
            "4",
            "5",
            "6",
            "7",
            "8",
            "9",
            ".",
            "-",
            "A",
            "C"
          ],
          "lower": false,
          "with_unknown": true
        },
        "type": "Task"
      }
    ]
  }
}
[2023-02-13 05:54:28.171] [mmdeploy] [info] [inference.cpp:56] ["img"] <- ["patches"]
[2023-02-13 05:54:28.171] [mmdeploy] [info] [inference.cpp:67] ["post_output"] -> ["texts"]
bbox_count=10
terminate called after throwing an instance of 'system_error2::status_error<mmdeploy::StatusDomain>'
  what():  invalid argument (1) @ /root/workspace/mmdeploy/csrc/mmdeploy/device/cuda/cuda_device.cpp:171
Aborted (core dumped)

irexyc commented 1 year ago

The container file is too large. Can you share only the two model files and dbnet.jpg?

ThinkWD commented 1 year ago

Link: https://pan.baidu.com/s/1x4m6SsRhs1YV0IFuvF_IDw?pwd=ya3v Extraction code: ya3v

And the conversion commands:

python3 tools/deploy.py \
configs/mmocr/text-detection/text-detection_tensorrt_dynamic-320x320-2240x2240.py \
checkpoints/dbnet/dbnet.py \
checkpoints/dbnet/dbnet.pth \
checkpoints/dbnet/dbnet.jpg \
--work-dir checkpoints/dbnet/TensorRT \
--device cuda:0 \
--dump-info
python3 tools/deploy.py \
configs/mmocr/text-recognition/text-recognition_tensorrt_dynamic-1x32x32-1x32x640.py \
checkpoints/crnn/crnn.py \
checkpoints/crnn/crnn.pth \
checkpoints/crnn/crnn.jpg \
--work-dir checkpoints/crnn/TensorRT \
--device cuda:0 \
--dump-info
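
The config file names above appear to encode the dynamic shape range the TensorRT engine is built for. A minimal sketch follows for reading that range out of a name and checking a candidate input shape against it. The helper names parse_dynamic_range and shape_in_range are hypothetical and not part of mmdeploy, and reading the two groups as minimum and maximum shapes is an assumption based only on the naming pattern:

```python
import re


def parse_dynamic_range(config_name):
    """Return (min_shape, max_shape) parsed from a config file name.

    Assumption: a name such as '..._dynamic-1x32x32-1x32x640.py' lists the
    minimum shape first and the maximum shape second.
    """
    match = re.search(r"dynamic-((?:\d+x)*\d+)-((?:\d+x)*\d+)", config_name)
    if match is None:
        raise ValueError(f"no dynamic shape range found in {config_name!r}")
    lo, hi = (tuple(int(d) for d in group.split("x")) for group in match.groups())
    return lo, hi


def shape_in_range(shape, lo, hi):
    """True if every dimension of shape lies within [lo, hi]."""
    if not (len(shape) == len(lo) == len(hi)):
        return False
    return all(a <= s <= b for s, a, b in zip(shape, lo, hi))


lo, hi = parse_dynamic_range(
    "text-recognition_tensorrt_dynamic-1x32x32-1x32x640.py"
)
print(lo, hi)
print(shape_in_range((1, 32, 100), lo, hi))
print(shape_in_range((1, 32, 700), lo, hi))
```

A check like this only says whether a single input fits the range in the name. It does not reproduce how the SDK batches inputs.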

irexyc commented 1 year ago

The problem is due to the different tensor shapes received by crnn. https://github.com/open-mmlab/mmdeploy/pull/1668 fixed this problem; you can try the latest code.
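
To illustrate how cropped patches can end up with different tensor shapes, here is a rough sketch. It is not the mmdeploy implementation and not the change made in the pull request. It assumes an aspect-ratio-preserving resize to height 32 with a minimum width of 32, loosely modeled on the ResizeOCR settings in the pipeline log, and the patch sizes are made up:

```python
import math


def resized_width(src_h, src_w, dst_h=32, min_width=32):
    """Width after scaling a patch to dst_h while keeping its aspect ratio.

    Assumption: the width is rounded up and clamped to at least min_width.
    The real transform may round or pad differently.
    """
    return max(min_width, math.ceil(src_w * dst_h / src_h))


def can_stack(shapes):
    """Tensors can be stacked into one batch only if all shapes are equal."""
    return len(set(shapes)) <= 1


# Hypothetical (height, width) sizes of text patches cropped from one image.
patch_sizes = [(40, 120), (40, 300), (25, 60)]

# One grayscale channel, fixed height, width depending on the patch.
shapes = [(1, 32, resized_width(h, w)) for h, w in patch_sizes]
print(shapes)
print(can_stack(shapes))
```

Under these assumptions each patch produces a different width, so the patches cannot be passed to the recognizer as a single uniformly shaped batch.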

ThinkWD commented 1 year ago

Thank you for your time! I will try it.