open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] RuntimeError: NYI: Named tensors are not supported with the tracer #1727

Open caj-github opened 1 year ago

caj-github commented 1 year ago

Describe the bug

When I run tools/deploy.py to deploy the MMOCR ABINet model, it fails with the error RuntimeError: NYI: Named tensors are not supported with the tracer

Reproduction

python tools/deploy.py configs/mmocr/text-recognition/text-recognition_tensorrt_static-32x128.py E:/code/mmocr-1.0.0rc5/configs/textrecog/abinet/abinet_20e_st-an_mj.py E:\code\test\abinet_20e_st-an_mj_20221005_012617-ead8c139.pth E:\code\test\crop_0.jpg

Environment

02/08 10:09:21 - mmengine - INFO - **********Environmental information**********
02/08 10:09:25 - mmengine - INFO - sys.platform: win32
02/08 10:09:25 - mmengine - INFO - Python: 3.8.16 (default, Jan 17 2023, 22:25:28) [MSC v.1916 64 bit (AMD64)]
02/08 10:09:25 - mmengine - INFO - CUDA available: True
02/08 10:09:25 - mmengine - INFO - numpy_random_seed: 2147483648
02/08 10:09:25 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3090
02/08 10:09:25 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
02/08 10:09:25 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.109
02/08 10:09:25 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30147 for x64
02/08 10:09:25 - mmengine - INFO - GCC: n/a
02/08 10:09:25 - mmengine - INFO - PyTorch: 1.12.1+cu113
02/08 10:09:25 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - C++ Version: 199711
  - MSVC 192829337
  - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 2019
  - LAPACK is enabled (usually provided by MKL)
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.3.2  (built against CUDA 11.5)
  - Magma 2.5.4
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

02/08 10:09:25 - mmengine - INFO - TorchVision: 0.13.1+cu113
02/08 10:09:25 - mmengine - INFO - OpenCV: 4.7.0
02/08 10:09:25 - mmengine - INFO - MMEngine: 0.5.0
02/08 10:09:25 - mmengine - INFO - MMCV: 2.0.0rc3
02/08 10:09:25 - mmengine - INFO - MMCV Compiler: MSVC 192829924
02/08 10:09:25 - mmengine - INFO - MMCV CUDA Compiler: 11.3
02/08 10:09:25 - mmengine - INFO - MMDeploy: 1.0.0rc1+unknown
02/08 10:09:25 - mmengine - INFO -

02/08 10:09:25 - mmengine - INFO - **********Backend information**********
02/08 10:09:25 - mmengine - INFO - tensorrt:    8.4.0.6
02/08 10:09:25 - mmengine - INFO - tensorrt custom ops: Available
02/08 10:09:25 - mmengine - INFO - ONNXRuntime: None
02/08 10:09:25 - mmengine - INFO - ONNXRuntime-gpu:     1.12.1
02/08 10:09:25 - mmengine - INFO - ONNXRuntime custom ops:      Available
02/08 10:09:25 - mmengine - INFO - pplnn:       None
02/08 10:09:25 - mmengine - INFO - ncnn:        None
02/08 10:09:25 - mmengine - INFO - snpe:        None
02/08 10:09:25 - mmengine - INFO - openvino:    None
02/08 10:09:25 - mmengine - INFO - torchscript: 1.12.1+cu113
02/08 10:09:25 - mmengine - INFO - torchscript custom ops:      NotAvailable
02/08 10:09:25 - mmengine - INFO - rknn-toolkit:        None
02/08 10:09:25 - mmengine - INFO - rknn-toolkit2:       None
02/08 10:09:25 - mmengine - INFO - ascend:      None
02/08 10:09:25 - mmengine - INFO - coreml:      None
02/08 10:09:25 - mmengine - INFO - tvm: None
02/08 10:09:25 - mmengine - INFO -

02/08 10:09:25 - mmengine - INFO - **********Codebase information**********
02/08 10:09:25 - mmengine - INFO - mmdet:       3.0.0rc5
02/08 10:09:25 - mmengine - INFO - mmseg:       1.0.0rc3
02/08 10:09:25 - mmengine - INFO - mmcls:       1.0.0rc5
02/08 10:09:25 - mmengine - INFO - mmocr:       1.0.0rc5
02/08 10:09:25 - mmengine - INFO - mmedit:      None
02/08 10:09:25 - mmengine - INFO - mmdet3d:     None
02/08 10:09:25 - mmengine - INFO - mmpose:      None
02/08 10:09:25 - mmengine - INFO - mmrotate:    None
02/08 10:09:25 - mmengine - INFO - mmaction:    None

Error traceback

Traceback (most recent call last):
  File "D:\anaconda3\envs\mmcv\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "D:\anaconda3\envs\mmcv\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\onnx\export.py", line 131, in export
    torch.onnx.export(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\__init__.py", line 350, in export
    return utils.export(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\utils.py", line 163, in export
    _export(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\utils.py", line 1074, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\onnx\optimizer.py", line 11, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\utils.py", line 727, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\utils.py", line 602, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\onnx\utils.py", line 517, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\jit\_trace.py", line 1175, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\jit\_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\jit\_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\onnx\export.py", line 123, in wrapper
    return forward(*arg, **kwargs)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\codebase\mmocr\models\text_recognition\encoder_decoder_recognizer.py", line 36, in encoder_decoder_recognizer__forward 
    return self.decoder.predict(feat, out_enc, data_samples)
  File "e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\codebase\mmocr\models\text_recognition\base_decoder.py", line 29, in base_decoder__forward
    out_dec = self(feat, out_enc, data_samples)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\base.py", line 166, in forward
    return self.forward_test(feat, out_enc, data_samples)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\abi_fuser.py", line 145, in forward_test
    raw_result = self.forward_train(feat, logits, data_samples)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\abi_fuser.py", line 114, in forward_train
    out_dec = self.language_decoder(feat, text_logits,
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\base.py", line 166, in forward
    return self.forward_test(feat, out_enc, data_samples)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\abi_language_decoder.py", line 180, in forward_test
    return self.forward_train(feat, logits, data_samples)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmocr\models\textrecog\decoders\abi_language_decoder.py", line 146, in forward_train
    output = m(
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\nn\modules\module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmcv\cnn\bricks\transformer.py", line 809, in forward
    attn_masks = [
  File "D:\anaconda3\envs\mmcv\lib\site-packages\mmcv\cnn\bricks\transformer.py", line 810, in <listcomp>
    copy.deepcopy(attn_masks) for _ in range(self.num_attn)
  File "D:\anaconda3\envs\mmcv\lib\copy.py", line 153, in deepcopy
    y = copier(memo)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\_tensor.py", line 110, in __deepcopy__
    new_storage = self.storage().__deepcopy__(memo)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\storage.py", line 569, in __deepcopy__
    return self._new_wrapped_storage(copy.deepcopy(self._storage, memo))
  File "D:\anaconda3\envs\mmcv\lib\copy.py", line 153, in deepcopy
    y = copier(memo)
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\storage.py", line 89, in __deepcopy__
    new_storage = self.clone()
  File "D:\anaconda3\envs\mmcv\lib\site-packages\torch\storage.py", line 103, in clone
    return type(self)(self.nbytes(), device=self.device).copy_(self)
RuntimeError: NYI: Named tensors are not supported with the tracer
02/08 10:15:58 - mmengine - ERROR - e:\dnntrain\environment\mmdeploy-dev-1.x\mmdeploy\apis\core\pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

AllentDan commented 1 year ago

I could not reproduce the error with MMCV 2.0.0rc3, MMDet v1.0.0rc5 and MMOCR 1.0.0rc3. Could it be a platform problem? I used Ubuntu 18.04.
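
When a bug reproduces on one machine but not another, comparing the installed package versions side by side is a quick first check. The snippet below is a minimal, stdlib-only sketch for that. The distribution names in `PACKAGES` and the helper name `installed_versions` are assumptions for illustration and may need adjusting to match how the packages were installed.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to match your install.
PACKAGES = ["torch", "mmengine", "mmcv", "mmdet", "mmocr", "mmdeploy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both environments and diffing the output shows which versions differ between the failing and the working setup.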

smrlehdgus commented 1 year ago

Same error on Ubuntu 20.04.

AllentDan commented 1 year ago

Fixed in #2319
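
For readers stuck on a version without that fix: the traceback above ends in `copy.deepcopy(attn_masks)` inside mmcv's `transformer.py`, where the tensor's storage is copied while the JIT tracer is active. One possible workaround, sketched below and not confirmed to be what #2319 does, is to duplicate the mask with `clone()`, which `torch.Tensor` provides and the tracer can record. The helper name `replicate_mask` is hypothetical. It is written with duck typing so the sketch runs without PyTorch installed.

```python
import copy


def replicate_mask(attn_mask, num_attn):
    """Return num_attn independent copies of attn_mask.

    Hypothetical helper: prefers .clone() (available on torch.Tensor)
    and falls back to copy.deepcopy for objects without it.
    """
    if attn_mask is None:
        return [None] * num_attn
    clone = getattr(attn_mask, "clone", None)
    if callable(clone):
        # Each call produces a separate copy of the mask.
        return [clone() for _ in range(num_attn)]
    return [copy.deepcopy(attn_mask) for _ in range(num_attn)]


if __name__ == "__main__":
    # Plain lists have no .clone(), so this goes through the deepcopy path.
    masks = replicate_mask([[0, 1], [1, 0]], num_attn=2)
    print(len(masks), masks[0] is masks[1])
```

In practice this would mean patching or overriding the list comprehension shown at `transformer.py` line 810 in the traceback so that it calls a helper like this. Whether that is acceptable depends on the local setup, and upgrading to a release that includes the fix is the safer route.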