open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0
2.61k stars 601 forks

[Bug] NYI: Named tensors are not supported with the tracer #2769

Open d710055071 opened 1 month ago

d710055071 commented 1 month ago

Checklist

Describe the bug

NYI: Named tensors are not supported with the tracer

Reproduction

Environment

05/20 15:17:28 - mmengine - INFO - 

05/20 15:17:28 - mmengine - INFO - **********Environmental information**********
05/20 15:17:29 - mmengine - INFO - sys.platform: linux
05/20 15:17:29 - mmengine - INFO - Python: 3.8.19 (default, Mar 20 2024, 19:58:24) [GCC 11.2.0]
05/20 15:17:29 - mmengine - INFO - CUDA available: True
05/20 15:17:29 - mmengine - INFO - MUSA available: False
05/20 15:17:29 - mmengine - INFO - numpy_random_seed: 2147483648
05/20 15:17:29 - mmengine - INFO - GPU 0: NVIDIA GeForce RTX 3060
05/20 15:17:29 - mmengine - INFO - CUDA_HOME: /usr/local/cuda
05/20 15:17:29 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.1, V11.1.105
05/20 15:17:29 - mmengine - INFO - GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
05/20 15:17:29 - mmengine - INFO - PyTorch: 1.12.1+cu113
05/20 15:17:29 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.4  (built against CUDA 11.6)
    - Built with CuDNN 8.3.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 

05/20 15:17:29 - mmengine - INFO - TorchVision: 0.13.1+cu113
05/20 15:17:29 - mmengine - INFO - OpenCV: 4.9.0
05/20 15:17:29 - mmengine - INFO - MMEngine: 0.10.4
05/20 15:17:29 - mmengine - INFO - MMCV: 2.1.0
05/20 15:17:29 - mmengine - INFO - MMCV Compiler: GCC 9.3
05/20 15:17:29 - mmengine - INFO - MMCV CUDA Compiler: 11.3
05/20 15:17:29 - mmengine - INFO - MMDeploy: 1.3.1+
05/20 15:17:29 - mmengine - INFO - 

05/20 15:17:29 - mmengine - INFO - **********Backend information**********
05/20 15:17:29 - mmengine - INFO - tensorrt:    None
05/20 15:17:29 - mmengine - INFO - ONNXRuntime: 1.17.3
05/20 15:17:29 - mmengine - INFO - ONNXRuntime-gpu:     None
05/20 15:17:29 - mmengine - INFO - ONNXRuntime custom ops:      Available
05/20 15:17:29 - mmengine - INFO - pplnn:       None
05/20 15:17:29 - mmengine - INFO - ncnn:        None
05/20 15:17:29 - mmengine - INFO - snpe:        None
05/20 15:17:29 - mmengine - INFO - openvino:    None
05/20 15:17:29 - mmengine - INFO - torchscript: 1.12.1+cu113
05/20 15:17:29 - mmengine - INFO - torchscript custom ops:      NotAvailable
05/20 15:17:29 - mmengine - INFO - rknn-toolkit:        None
05/20 15:17:29 - mmengine - INFO - rknn-toolkit2:       None
05/20 15:17:29 - mmengine - INFO - ascend:      None
05/20 15:17:29 - mmengine - INFO - coreml:      None
05/20 15:17:29 - mmengine - INFO - tvm: None
05/20 15:17:29 - mmengine - INFO - vacc:        None
05/20 15:17:29 - mmengine - INFO - 

05/20 15:17:29 - mmengine - INFO - **********Codebase information**********
05/20 15:17:29 - mmengine - INFO - mmdet:       3.3.0
05/20 15:17:29 - mmengine - INFO - mmseg:       1.2.2
05/20 15:17:29 - mmengine - INFO - mmpretrain:  None
05/20 15:17:29 - mmengine - INFO - mmocr:       None
05/20 15:17:29 - mmengine - INFO - mmagic:      None
05/20 15:17:29 - mmengine - INFO - mmdet3d:     None
05/20 15:17:29 - mmengine - INFO - mmpose:      None
05/20 15:17:29 - mmengine - INFO - mmrotate:    None
05/20 15:17:29 - mmengine - INFO - mmaction:    None
05/20 15:17:29 - mmengine - INFO - mmrazor:     None
05/20 15:17:29 - mmengine - INFO - mmyolo:      None

Error traceback

05/20 15:13:15 - mmengine - INFO - Start pipeline mmdeploy.apis.pytorch2onnx.torch2onnx in subprocess
05/20 15:13:16 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "Codebases" registry tree. As a workaround, the current "Codebases" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
05/20 15:13:16 - mmengine - WARNING - Failed to search registry with scope "mmseg" in the "mmseg_tasks" registry tree. As a workaround, the current "mmseg_tasks" registry in "mmdeploy" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmseg" is a correct scope, or whether the registry is initialized.
Loads checkpoint by local backend from path: /home/dongzf/workspace/commit_code/PerceptionInference_20240429/PerceptionInference/resource/areial/seg_area.pth
05/20 15:13:17 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future. 
05/20 15:13:17 - mmengine - INFO - Export PyTorch model to ONNX: ./work/end2end.onnx.
05/20 15:13:17 - mmengine - WARNING - Can not find torch.nn.functional.scaled_dot_product_attention, function rewrite will not be applied
05/20 15:13:17 - mmengine - WARNING - Can not find torch._C._jit_pass_onnx_autograd_function_process, function rewrite will not be applied
/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/core/optimizers/function_marker.py:160: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  ys_shape = tuple(int(s) for s in ys.shape)
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:62: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  output_h = math.ceil(input_h / stride_h)
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:63: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  output_w = math.ceil(input_w / stride_w)
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:64: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  pad_h = max((output_h - 1) * stride_h +
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:66: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  pad_w = max((output_w - 1) * stride_w +
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:72: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_h > 0 or pad_w > 0:
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:183: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert L == H * W, 'input feature has wrong size'
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:281: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  x = x.view(B, H // window_size, window_size, W // window_size,
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:91: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  C // self.num_heads).permute(2, 0, 3, 1, 4)
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:266: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  B = int(windows.shape[0] / (H * W / window_size / window_size))
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:267: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  x = windows.view(B, H // window_size, W // window_size, window_size,
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:248: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_r > 0 or pad_b:
/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/pytorch/functions/tensor_setitem.py:38: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  stop = stop if stop >= 0 else self_shape[i] + stop
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/backbones/swin.py:109: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  attn = attn.view(B // nW, nW, self.num_heads, N,
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:306: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert L == H * W, 'input feature has wrong size'
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:319: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  out_h = (H + 2 * self.sampler.padding[0] - self.sampler.dilation[0] *
/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/utils/embed.py:322: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  out_w = (W + 2 * self.sampler.padding[1] - self.sampler.dilation[1] *
Backend TkAgg is interactive backend. Turning interactive mode on.
Process Process-2:
Traceback (most recent call last):
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/onnx/export.py", line 138, in export
    torch.onnx.export(
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/__init__.py", line 350, in export
    return utils.export(
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/utils.py", line 163, in export
    _export(
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/utils.py", line 1074, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/onnx/optimizer.py", line 27, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/utils.py", line 727, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/utils.py", line 602, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/onnx/utils.py", line 517, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/jit/_trace.py", line 1175, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/jit/_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/jit/_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1118, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/onnx/export.py", line 123, in wrapper
    return forward(*arg, **kwargs)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/codebase/mmseg/models/segmentors/base.py", line 51, in base_segmentor__forward
    seg_logit = self.predict(inputs, data_samples)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/codebase/mmseg/models/segmentors/encoder_decoder.py", line 26, in encoder_decoder__predict
    seg_logit = self.decode_head.predict(x, batch_img_metas, self.test_cfg)
  File "/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/decode_heads/mask2former_head.py", line 146, in predict
    batch_data_samples = [
  File "/mnt/sda/code/mmdeploy-1.3.1/onnxruntime/mmsegmentation-1.2.2/mmseg/models/decode_heads/mask2former_head.py", line 147, in <listcomp>
    SegDataSample(metainfo=metainfo) for metainfo in batch_img_metas
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmengine/structures/base_data_element.py", line 216, in __init__
    self.set_metainfo(metainfo=metainfo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/mmengine/structures/base_data_element.py", line 231, in set_metainfo
    meta = copy.deepcopy(metainfo)
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/pytorch/functions/copy.py", line 17, in copy__default
    return ctx.origin_func(tensor, *args, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 172, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 264, in _reconstruct
    y = func(*args)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 263, in <genexpr>
    args = (deepcopy(arg, memo) for arg in args)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 210, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 210, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 153, in deepcopy
    y = copier(memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/_tensor.py", line 110, in __deepcopy__
    new_storage = self.storage().__deepcopy__(memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/storage.py", line 569, in __deepcopy__
    return self._new_wrapped_storage(copy.deepcopy(self._storage, memo))
  File "/mnt/sda/code/mmdeploy-1.3.1/mmdeploy/pytorch/functions/copy.py", line 17, in copy__default
    return ctx.origin_func(tensor, *args, **kwargs)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/copy.py", line 153, in deepcopy
    y = copier(memo)
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/storage.py", line 89, in __deepcopy__
    new_storage = self.clone()
  File "/home/dongzf/miniconda3/envs/mmdeploy/lib/python3.8/site-packages/torch/storage.py", line 103, in clone
    return type(self)(self.nbytes(), device=self.device).copy_(self)
RuntimeError: NYI: Named tensors are not supported with the tracer
05/20 15:14:20 - mmengine - ERROR - /mnt/sda/code/mmdeploy-1.3.1/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
d710055071 commented 1 month ago

from torch import Tensor

from mmdeploy.core import FUNCTION_REWRITER


@FUNCTION_REWRITER.register_rewriter(func_name='copy.deepcopy')
def copy__default(tensor: Tensor, *args, **kwargs) -> Tensor:
    """Rewrite `copy.deepcopy` for default backend.

    Replace it with tensor.clone(), or it may raise `NYI: Named tensors are
    not supported with the tracer`.
    """
    ctx = FUNCTION_REWRITER.get_context()
    # Original condition:
    # if isinstance(tensor, Tensor) and args == () and kwargs == {}:
    if isinstance(tensor, Tensor):
        return tensor.clone()
    if isinstance(tensor, dict):

        def deepcopy_dict(obj, memo=None):
            # Create a fresh memo per top-level call; a mutable default
            # argument would be shared across calls.
            if memo is None:
                memo = {}
            if id(obj) in memo:
                # obj has already been copied, so return that copy.
                return memo[id(obj)]
            if isinstance(obj, dict):
                # For a dict, create a new dict and recursively copy its
                # keys and values.
                copied_obj = {}
                memo[id(obj)] = copied_obj  # record the copy of this dict
                for key, value in obj.items():
                    copied_obj[deepcopy_dict(key, memo)] = deepcopy_dict(
                        value, memo)
                return copied_obj
            elif isinstance(obj, list):
                # For a list, create a new list and recursively copy its
                # elements.
                copied_obj = []
                memo[id(obj)] = copied_obj  # record the copy of this list
                for item in obj:
                    copied_obj.append(deepcopy_dict(item, memo))
                return copied_obj
            elif isinstance(obj, set):
                # For a set, create a new set and recursively copy its
                # elements.
                copied_obj = set()
                memo[id(obj)] = copied_obj
                for item in obj:
                    copied_obj.add(deepcopy_dict(item, memo))
                return copied_obj
            elif isinstance(obj, (int, float, complex, str, bytes, tuple,
                                  frozenset, type(None))):
                # Treated as immutable and returned as-is. Note that the
                # contents of a tuple are therefore shared, not copied.
                return obj
            else:
                # For any other type, fall back to copy__default, which
                # clones tensors and otherwise calls the original deepcopy.
                try:
                    return copy__default(obj, memo)
                except Exception as e:
                    raise TypeError(
                        f'Unsupported type {type(obj)} in deepcopy') from e

        return deepcopy_dict(tensor, *args, **kwargs)
    return ctx.origin_func(tensor, *args, **kwargs)

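
The workaround above returns tuples unchanged, so tensor-like values nested inside a tuple stay shared with the original. The following is a minimal, torch-free sketch of a variant that also rebuilds tuples. `Leaf` and `clone_nested` are hypothetical names used only for illustration and are not part of mmdeploy; the sketch assumes that any object exposing a callable `clone()` should be copied by calling it, and it has not been tested against the tracer.

```python
def clone_nested(obj, memo=None):
    """Recursively copy containers, calling clone() on tensor-like leaves."""
    if memo is None:
        memo = {}
    key = id(obj)
    if key in memo:
        # Already copied: reuse the copy so shared references stay shared.
        return memo[key]
    if callable(getattr(obj, 'clone', None)):
        # Assumption: anything with a clone() method is tensor-like.
        result = obj.clone()
    elif isinstance(obj, dict):
        result = {}
        memo[key] = result  # register before recursing to handle cycles
        for k, v in obj.items():
            result[clone_nested(k, memo)] = clone_nested(v, memo)
        return result
    elif isinstance(obj, list):
        result = []
        memo[key] = result
        result.extend(clone_nested(v, memo) for v in obj)
        return result
    elif isinstance(obj, tuple):
        # Rebuilt as a plain tuple; namedtuple types are not preserved.
        result = tuple(clone_nested(v, memo) for v in obj)
    elif isinstance(obj, (set, frozenset)):
        result = type(obj)(clone_nested(v, memo) for v in obj)
    else:
        # Everything else is treated as immutable and shared.
        result = obj
    memo[key] = result
    return result


class Leaf:
    """Hypothetical stand-in for a tensor; only provides clone()."""

    def __init__(self, value):
        self.value = value

    def clone(self):
        return Leaf(self.value)


shared = Leaf(7)
meta = {
    'pad_shape': (Leaf(512), Leaf(512)),
    'a': shared,
    'b': shared,
    'name': 'img',
}
copied = clone_nested(meta)
```

Here `copied['pad_shape'][0]` is a new `Leaf` rather than the original object, and `copied['a']` and `copied['b']` refer to the same clone because the memo preserves shared references.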
RunningLeon commented 1 month ago

@RunningLeon

Hi, sorry for the issue. This project is not actively maintained. You are welcome to open a PR to fix any bugs. Thanks for your understanding.

shiomi326 commented 1 week ago

@RunningLeon Thank you, that worked. Could you open a PR?

d710055071 commented 1 week ago

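@RunningLeon Thank you, that worked. Could you open a PR?

This code has not been rigorously tested. I found it online, so it can only serve as a temporary workaround. The cause of the problem is that when the argument is a dict and is not handled specially, the objects' own overridden deep-copy functions still get called, which triggers the error.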
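
The cause described here can be illustrated without PyTorch. The sketch below uses a hypothetical `FakeTensor` class and a hypothetical `top_level_only_deepcopy` wrapper (neither exists in mmdeploy or torch) to mimic a rewrite that only intercepts a bare tensor argument. When the tensor sits inside a dict, the standard library recursion still reaches the object's own `__deepcopy__`, which in the real traceback is the step that fails under the tracer.

```python
import copy


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor; counts __deepcopy__ calls."""

    deepcopy_calls = 0

    def __init__(self, value):
        self.value = value

    def clone(self):
        return FakeTensor(self.value)

    def __deepcopy__(self, memo):
        # Stands in for the code path that raises during tracing.
        FakeTensor.deepcopy_calls += 1
        return FakeTensor(self.value)


def top_level_only_deepcopy(obj):
    """Mimics a rewrite that special-cases only a bare tensor argument."""
    if isinstance(obj, FakeTensor):
        return obj.clone()
    return copy.deepcopy(obj)


# A bare tensor is intercepted, so __deepcopy__ is never reached.
top_level_only_deepcopy(FakeTensor(1))
calls_after_bare = FakeTensor.deepcopy_calls

# Tensors nested in a dict are copied by the standard library recursion,
# which calls __deepcopy__ on each of them.
top_level_only_deepcopy({'pad_shape': (FakeTensor(2), FakeTensor(3))})
calls_after_dict = FakeTensor.deepcopy_calls

print(calls_after_bare, calls_after_dict)
```

This is why handling the dict case explicitly, as the workaround does, avoids the failing path, while a check on the top-level argument alone does not.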

shiomi326 commented 1 week ago

@RunningLeon I see. That's true.