open-mmlab / mmcv

OpenMMLab Computer Vision Foundation
https://mmcv.readthedocs.io/en/latest/
Apache License 2.0

[Bug] MMOCRInferencer(det='dbnetpp', rec='svtr-small', kie='SDMGR'). Roi_align_forward_impl support for gfx1030. #2787

Open SakhnevichKirill opened 1 year ago

SakhnevichKirill commented 1 year ago

Prerequisite

Environment

/bin/sh: 1: /opt/rocm-5.4.3/bin/nvcc: not found
/bin/sh: 1: /opt/rocm-5.4.3/bin/nvcc: not found
OrderedDict([('sys.platform', 'linux'), ('Python', '3.8.16 (default, Mar  2 2023, 03:21:46) [GCC 11.2.0]'), ('CUDA available', True), ('numpy_random_seed', 2147483648), ('GPU 0', 'AMD Radeon RX 6900 XT'), ('CUDA_HOME', '/opt/rocm-5.4.3'), ('NVCC', 'Not Available'), ('GCC', 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0'), ('PyTorch', '2.0.0+rocm5.4.2'), ('PyTorch compiling details', 'PyTorch built with:\n  - GCC 9.3\n  - C++ Version: 201703\n  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications\n  - Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)\n  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n  - LAPACK is enabled (usually provided by MKL)\n  - NNPACK is enabled\n  - CPU capability usage: AVX2\n  - HIP Runtime 5.4.22803\n  - MIOpen 2.19.0\n  - Magma 2.6.1\n  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.0, 
USE_CUDA=OFF, USE_CUDNN=OFF, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, \n'), ('TorchVision', '0.15.1+rocm5.4.2'), ('OpenCV', '4.7.0'), ('MMEngine', '0.7.3'), ('MMCV', '2.0.0rc4'), ('MMCV Compiler', 'GCC 9.4'), ('MMCV CUDA Compiler', '50422803')])

Reproduces the problem - code sample

import os
os.chdir('mmocr/')

from mmocr.apis import MMOCRInferencer
infer = MMOCRInferencer(det='dbnetpp', rec='svtr-small', kie='SDMGR')
result = infer('demo/demo_kie.jpeg', save_vis=True)

Reproduces the problem - command or script

I'm trying to run the KIE model from the tutorial on my local machine.

Reproduces the problem - error message

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[15], line 3
      1 from mmocr.apis import MMOCRInferencer
      2 infer = MMOCRInferencer(det='dbnetpp', rec='svtr-small', kie='SDMGR')
----> 3 result = infer('demo/demo_kie.jpeg', save_vis=True)

File ~/workspace/python/ocr/mmocr/mmocr/apis/inferencers/mmocr_inferencer.py:315, in MMOCRInferencer.__call__(self, inputs, batch_size, det_batch_size, rec_batch_size, kie_batch_size, out_dir, return_vis, save_vis, save_pred, **kwargs)
    313 results = {'predictions': [], 'visualization': []}
    314 for ori_input in track(chunked_inputs, description='Inference'):
--> 315     preds = self.forward(
    316         ori_input,
    317         det_batch_size=det_batch_size,
    318         rec_batch_size=rec_batch_size,
    319         kie_batch_size=kie_batch_size,
    320         **forward_kwargs)
    321     visualization = self.visualize(
    322         ori_input, preds, img_out_dir=img_out_dir, **visualize_kwargs)
    323     batch_res = self.postprocess(
    324         preds,
    325         visualization,
    326         pred_out_dir=pred_out_dir,
    327         **postprocess_kwargs)

File ~/workspace/python/ocr/mmocr/mmocr/apis/inferencers/mmocr_inferencer.py:192, in MMOCRInferencer.forward(self, inputs, batch_size, det_batch_size, rec_batch_size, kie_batch_size, **forward_kwargs)
...
    103 ctx.save_for_backward(rois, argmax_y, argmax_x)
    104 return output

RuntimeError: roi_align_forward_impl: implementation for device cuda:0 not found.
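
For readers unfamiliar with this kind of error, here is a toy sketch of per-device operator dispatch. It is not MMCV's code: the `DeviceRegistry` class and its methods are hypothetical names invented for illustration. It only shows the general idea that an operator looks up a kernel by device type and fails when none was registered for that type. Note that ROCm builds of PyTorch still report the GPU as `cuda`, which is why the message above mentions `cuda:0` on an AMD card.

```python
from typing import Callable, Dict


class DeviceRegistry:
    """Toy per-device dispatch table for one operator (hypothetical, not MMCV code)."""

    def __init__(self, op_name: str) -> None:
        self.op_name = op_name
        self._impls: Dict[str, Callable[..., object]] = {}

    def register(self, device_type: str, fn: Callable[..., object]) -> None:
        # Store one implementation per device type, e.g. "cpu" or "cuda".
        self._impls[device_type] = fn

    def dispatch(self, device: str, *args: object) -> object:
        # "cuda:0" -> "cuda"; the index does not matter for the lookup.
        device_type = device.split(":")[0]
        impl = self._impls.get(device_type)
        if impl is None:
            raise RuntimeError(
                f"{self.op_name}: implementation for device {device} not found."
            )
        return impl(*args)


# Only a CPU kernel is registered, mimicking a build without GPU kernels.
roi_align = DeviceRegistry("roi_align_forward_impl")
roi_align.register("cpu", lambda rois: f"cpu kernel ran on {rois}")

print(roi_align.dispatch("cpu", "rois"))
try:
    roi_align.dispatch("cuda:0", "rois")
except RuntimeError as err:
    print(err)
```

In this toy model, the fix is to register a kernel for the missing device type, which for a compiled library means building the extension with GPU support enabled.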

Additional information

Requesting roi_align_forward_impl support for gfx1030 (the AMD Radeon RX 6900 XT listed in the environment above).

zhouzaida commented 1 year ago

Hi @SakhnevichKirill, how did you install mmcv? MMCV does not provide pre-built packages for ROCm, so you need to build it from source.
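
Before rebuilding, it can help to confirm what is actually installed. The following is a minimal sketch, not an official diagnostic: the helper names `rocm_version_from_tag` and `mmcv_ops_available` are made up for this example. It reads the ROCm version out of a PyTorch-style version tag such as the `2.0.0+rocm5.4.2` reported above, and checks whether MMCV's compiled extension module `mmcv._ext` can be imported at all. A successful import does not prove that GPU kernels were compiled for a particular architecture.

```python
import importlib
import re
from typing import Optional


def rocm_version_from_tag(version: str) -> Optional[str]:
    """Return the ROCm version embedded in a PyTorch-style version tag, or None."""
    match = re.search(r"\+rocm(\d+(?:\.\d+)*)", version)
    return match.group(1) if match else None


def mmcv_ops_available() -> bool:
    """Return True if MMCV's compiled extension module can be imported."""
    try:
        importlib.import_module("mmcv._ext")
    except Exception:
        # Covers a missing mmcv, a build without compiled ops, and ABI mismatches.
        return False
    return True


if __name__ == "__main__":
    # Version string copied from the environment dump in this issue.
    print("ROCm version in tag:", rocm_version_from_tag("2.0.0+rocm5.4.2"))
    print("mmcv._ext importable:", mmcv_ops_available())
```

If `mmcv._ext` is not importable, the installed package has no compiled ops, which is consistent with needing a source build.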