open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

Converting UniFormerV2 model to TensorRT fails #2024

Closed xenocloud7732 closed 1 year ago

xenocloud7732 commented 1 year ago

Describe the bug

I am trying to use mmdeploy to convert the UniFormerV2 model to TensorRT format. The deploy config I am using is configs/mmaction/video-recognition/video-recognition_3d_tensorrt_static-224x224.py from the mmdeploy project; a screenshot of its content was attached below.

[screenshot of video-recognition_3d_tensorrt_static-224x224.py: image not preserved]
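Since the screenshot did not survive archiving, here is a sketch of what a static 3D video-recognition TensorRT deploy config in mmdeploy 1.x typically contains. The clip length (8 frames for the u8 variant) and the input shapes below are assumptions, not the verbatim file:

    # Sketch of a static 224x224 TensorRT deploy config for video recognition.
    # Shapes and clip length are assumptions, not the lost screenshot's content.
    codebase_config = dict(type='mmaction', task='VideoRecognition')

    onnx_config = dict(
        type='onnx',
        export_params=True,
        opset_version=11,
        save_file='end2end.onnx',
        input_names=['input'],
        output_names=['output'],
        input_shape=[224, 224])

    backend_config = dict(
        type='tensorrt',
        common_config=dict(fp16_mode=False, max_workspace_size=1 << 30),
        model_inputs=[
            dict(
                input_shapes=dict(
                    input=dict(
                        min_shape=[1, 1, 3, 8, 224, 224],
                        opt_shape=[1, 1, 3, 8, 224, 224],
                        max_shape=[1, 1, 3, 8, 224, 224])))
        ])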

The model config is uniformerv2-base-p16-res224_clip_u8_kinetics400-rgb.py, included in the mmaction2 project at configs/recognition/uniformerv2/uniformerv2-base-p16-res224_clip_u8_kinetics400-rgb.py.

The checkpoint file is uniformerv2-base-p16-res224_clip_8xb32-u8_kinetics400-rgb_20230313-e29fc968.pth, downloaded from OpenMMLab.

I don't know whether UniFormerV2 currently supports deployment to TensorRT in mmdeploy, so I tried the conversion anyway, but it failed with the error "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!".
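For context, this class of error usually means some tensor in the forward pass is created without an explicit device, so it lands on CPU while the rest of the model lives on cuda:0. A minimal hypothetical pattern (illustrative only, not the actual uniformerv2.py code):

    import torch

    class Bad(torch.nn.Module):
        def forward(self, x):
            # torch.zeros defaults to CPU; if x is on cuda:0, the add mixes devices
            pos = torch.zeros(x.shape[-1])
            return x + pos

    class Good(torch.nn.Module):
        def forward(self, x):
            # create new tensors on the input's device (and dtype) instead
            pos = torch.zeros(x.shape[-1], device=x.device, dtype=x.dtype)
            return x + pos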

Reproduction

python tools/deploy.py \
    configs/mmaction/video-recognition/video-recognition_3d_tensorrt_static-224x224.py \
    F:/open-mmlab/mmaction2/configs/recognition/uniformerv2/uniformerv2-base-p16-res224_clip_u8_kinetics400-rgb.py \
    uniformerv2-base-p16-res224_clip_8xb32-u8_kinetics400-rgb_20230313-e29fc968.pth \
    tests/data/arm_wrestling.mp4 \
    --work-dir mmdeploy_models/mmaction/uniformerv2 \
    --device cuda \
    --show \
    --dump-info

Environment

mmaction2  1.0.0      https://github.com/open-mmlab/mmaction2
mmcv       2.0.0rc4   https://github.com/open-mmlab/mmcv
mmdet      3.0.0rc6   https://github.com/open-mmlab/mmdetection
mmengine   0.7.0      https://github.com/open-mmlab/mmengine
mmpose     1.0.0rc1   f:\open-mmlab\mmpose
mmtrack    0.14.0     https://github.com/open-mmlab/mmtracking
mmyolo     0.5.0      https://github.com/open-mmlab/mmyolo
mmdeploy   1.0.0rc3   f:\open-mmlab\mmdeploy

Error traceback


Loads checkpoint by local backend from path: uniformerv2-base-p16-res224_clip_8xb32-u8_kinetics400-rgb_20230313-e29fc968.pth
04/23 16:27:47 - mmengine - WARNING - DeprecationWarning: get_onnx_config will be deprecated in the future.
04/23 16:27:47 - mmengine - INFO - Export PyTorch model to ONNX: mmdeploy_models/mmaction/uniformerv2\end2end.onnx.
C:\Users\Administrator\.conda\envs\RTMPose\lib\site-packages\mmaction\models\backbones\uniformerv2.py:366: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  H = W = int((L - 1)**0.5)
f:\open-mmlab\mmdeploy\mmdeploy\pytorch\functions\tensor_setitem.py:38: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  stop = stop if stop >= 0 else self_shape[i] + stop
f:\open-mmlab\mmdeploy\mmdeploy\pytorch\functions\tensor_setitem.py:43: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  elif out.numel() == 1:
C:\Users\Administrator\.conda\envs\RTMPose\lib\site-packages\torch\onnx\utils.py:687: UserWarning: Constant folding in symbolic shape inference fails: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\passes\onnx\shape_type_inference.cpp:413.)
  _C._jit_pass_onnx_graph_shape_type_inference(
Process Process-2:
Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\RTMPose\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Users\Administrator\.conda\envs\RTMPose\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\onnx\export.py", line 131, in export
    torch.onnx.export(
  File "C:\Users\Administrator\.conda\envs\RTMPose\lib\site-packages\torch\onnx\utils.py", line 504, in export
    _export(
  File "C:\Users\Administrator\.conda\envs\RTMPose\lib\site-packages\torch\onnx\utils.py", line 1529, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "f:\open-mmlab\mmdeploy\mmdeploy\apis\onnx\optimizer.py", line 11, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "C:\Users\Administrator\.conda\envs\RTMPose\lib\site-packages\torch\onnx\utils.py", line 1172, in _model_to_graph    params_dict = _C._jit_pass_onnx_constant_fold(
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
04/23 16:27:53 - mmengine - ERROR - f:\open-mmlab\mmdeploy\mmdeploy\apis\core\pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.
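The traceback shows the failure happens inside ONNX constant folding (_jit_pass_onnx_constant_fold), which evaluates foldable subgraphs at export time and fails when some constants sit on CPU while the parameters sit on cuda:0. As a rough sketch outside mmdeploy's pipeline (with a hypothetical stand-in model and input), disabling constant folding in a plain torch.onnx.export skips that pass, at the cost of a less-optimized graph:

    import torch

    # Hypothetical stand-ins for the recognizer and a sample clip.
    model = torch.nn.Sequential(
        torch.nn.Conv3d(3, 8, kernel_size=3),
        torch.nn.AdaptiveAvgPool3d(1))
    dummy_input = torch.randn(1, 3, 8, 224, 224)

    # do_constant_folding=False skips _jit_pass_onnx_constant_fold, the pass
    # that raised the cuda:0/cpu mismatch in the traceback above.
    torch.onnx.export(
        model,
        dummy_input,
        'end2end.onnx',
        opset_version=11,
        input_names=['input'],
        output_names=['output'],
        do_constant_folding=False)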
RunningLeon commented 1 year ago

@xenocloud7732 Hi, UniFormerV2 is not supported yet according to this model list.

xenocloud7732 commented 1 year ago

@RunningLeon Thank you for your answer. I did read the document before, and this model is not on the support list. The recognition rate of SlowFast is currently somewhat poor.

RunningLeon commented 1 year ago

@xenocloud7732 Thanks for your feedback. We may include this in future work.

github-actions[bot] commented 1 year ago

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.