open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0
2.74k stars, 629 forks

[Bug] convert mask2former pytorch model to tensorrt model failed #1728

Closed Fafa-DL closed 1 year ago

Fafa-DL commented 1 year ago

Describe the bug

Hi, I want to convert the weights of mask2former Swin-B (in22k) or mask2former R-50-D32 for deployment, and I get the error below. It is worth mentioning that converting other models, such as OCRNet, works without problems.

Process Process-2:
Traceback (most recent call last):
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/pytorch2onnx.py", line 98, in torch2onnx
    export(
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 356, in _wrap
    return self.call_function(func_name_, *args, **kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 326, in call_function
    return self.call_function_local(func_name, *args, **kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 275, in call_function_local
    return pipe_caller(*args, **kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py", line 107, in __call__
    ret = func(*args, **kwargs)
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/onnx/export.py", line 131, in export
    torch.onnx.export(
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/__init__.py", line 350, in export
    return utils.export(
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/utils.py", line 163, in export
    _export(
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/utils.py", line 1074, in _export
    graph, params_dict, torch_out = _model_to_graph(
  File "/home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/onnx/optimizer.py", line 11, in model_to_graph__custom_optimizer
    graph, params_dict, torch_out = ctx.origin_func(*args, **kwargs)
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/utils.py", line 731, in _model_to_graph
    graph = _optimize_graph(
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/utils.py", line 308, in _optimize_graph
    graph = _C._jit_pass_onnx(graph, operator_export_type)
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/__init__.py", line 416, in _run_symbolic_function
    return utils._run_symbolic_function(*args, **kwargs)
  File "/home/zz/anaconda3/envs/mmdeploy1/lib/python3.8/site-packages/torch/onnx/utils.py", line 1421, in _run_symbolic_function
    raise symbolic_registry.UnsupportedOperatorError(
torch.onnx.symbolic_registry.UnsupportedOperatorError: Exporting the operator ::einsum to ONNX opset version 11 is not supported. Support for this operator was added in version 12, try exporting with this version.
02/08 11:33:29 - mmengine - ERROR - /home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy/mmdeploy/apis/core/pipeline_manager.py - pop_mp_output - 80 - `mmdeploy.apis.pytorch2onnx.torch2onnx` with Call id: 0 failed. exit.

Reproduction

The command I ran is as follows:

python mmdeploy/tools/deploy.py \
    mmdeploy/configs/mmseg/segmentation_tensorrt_static-512x1024.py \
    mmsegmentation/configs/mask2former/mask2former_r50_8xb2-90k_cityscapes-512x1024.py  \
    mmsegmentation/mask2former_r50_8xb2-90k_cityscapes-512x1024_20221202_140802-ffd9d750.pth \
    mmsegmentation/demo/demo.png \
    --work-dir ../mmdeploy_model/mask2former \
    --device cuda \
    --dump-info

Running the following command in mmsegmentation/ produces the correct result:

python demo/image_demo.py demo/demo.png configs/mask2former/mask2former_r50_8xb2-90k_cityscapes-512x1024.py mask2former_r50_8xb2-90k_cityscapes-512x1024_20221202_140802-ffd9d750.pth --device cuda:0 --out-file result.jpg

Environment

Ubuntu 20.04, Python 3.8, TensorRT 8.2.3.0, CUDA 11.4, cuDNN 8.4.1. Here is my pip list:

Package                       Version      Editable project location
----------------------------- ------------ -----------------------------------------------------------------
actionlib                     1.13.2
addict                        2.4.0
aenum                         3.1.11
angles                        1.9.13
bondpy                        1.8.6
camera-calibration            1.16.0
camera-calibration-parsers    1.12.0
catkin                        0.8.10
certifi                       2022.12.7
charset-normalizer            3.0.1
click                         8.1.3
colorama                      0.4.6
contourpy                     1.0.7
controller-manager            0.19.5
controller-manager-msgs       0.19.5
cv-bridge                     1.16.0
cycler                        0.11.0
diagnostic-analysis           1.11.0
diagnostic-common-diagnostics 1.11.0
diagnostic-updater            1.11.0
dill                          0.3.6
dynamic-reconfigure           1.7.3
fonttools                     4.38.0
gazebo_plugins                2.9.2
gazebo_ros                    2.9.2
gencpp                        0.6.5
geneus                        3.0.0
genlisp                       0.4.18
genmsg                        0.5.16
gennodejs                     2.0.2
genpy                         0.6.15
grpcio                        1.51.1
idna                          3.4
image-geometry                1.16.0
importlib-metadata            6.0.0
interactive-markers           1.12.0
joint-state-publisher         1.15.1
joint-state-publisher-gui     1.15.1
kiwisolver                    1.4.4
laser_geometry                1.6.7
Markdown                      3.4.1
markdown-it-py                2.1.0
matplotlib                    3.6.3
mdurl                         0.1.2
message-filters               1.15.14
mmcv                          2.0.0rc4
mmdeploy                      1.0.0rc1     /home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdeploy
mmdet                         3.0.0rc5     /home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmdetection
mmengine                      0.5.0
mmsegmentation                1.0.0rc5     /home/zz/2T/18.04/RSrepo/DeepLearning/deploy1.x/mmsegmentation
model-index                   0.1.11
multiprocess                  0.70.14
numpy                         1.24.2
onnx                          1.12.0
opencv-python                 4.7.0.68
openmim                       0.3.6
ordered-set                   4.1.0
packaging                     23.0
pandas                        1.5.3
Pillow                        9.4.0
pip                           22.3.1
prettytable                   3.6.0
protobuf                      3.20.1
pycocotools                   2.0.6
Pygments                      2.14.0
pyparsing                     3.0.9
python-dateutil               2.8.2
python-qt-binding             0.4.4
pytz                          2022.7.1
PyYAML                        6.0
qt-dotgraph                   0.4.2
qt-gui                        0.4.2
qt-gui-cpp                    0.4.2
qt-gui-py-common              0.4.2
requests                      2.28.2
resource_retriever            1.12.7
rich                          13.3.1
rosbag                        1.15.14
rosboost-cfg                  1.15.8
rosclean                      1.15.8
roscreate                     1.15.8
rosgraph                      1.15.14
roslaunch                     1.15.14
roslib                        1.15.8
roslint                       0.12.0
roslz4                        1.15.14
rosmake                       1.15.8
rosmaster                     1.15.14
rosmsg                        1.15.14
rosnode                       1.15.14
rosparam                      1.15.14
rospy                         1.15.14
rosservice                    1.15.14
rostest                       1.15.14
rostopic                      1.15.14
rosunit                       1.15.8
roswtf                        1.15.14
rqt_action                    0.4.9
rqt_bag                       0.5.1
rqt_bag_plugins               0.5.1
rqt_console                   0.4.11
rqt_dep                       0.4.12
rqt_graph                     0.4.14
rqt_gui                       0.5.3
rqt_gui_py                    0.5.3
rqt_image_view                0.4.16
rqt_launch                    0.4.9
rqt_logger_level              0.4.11
rqt-moveit                    0.5.10
rqt_msg                       0.4.10
rqt_nav_view                  0.5.7
rqt_plot                      0.4.13
rqt_pose_view                 0.5.11
rqt_publisher                 0.4.10
rqt_py_common                 0.5.3
rqt_py_console                0.4.10
rqt-reconfigure               0.5.5
rqt-robot-dashboard           0.5.8
rqt-robot-monitor             0.5.14
rqt_robot_steering            0.5.12
rqt_runtime_monitor           0.5.9
rqt-rviz                      0.7.0
rqt_service_caller            0.4.10
rqt_shell                     0.4.11
rqt_srv                       0.4.9
rqt_tf_tree                   0.6.3
rqt_top                       0.4.10
rqt_topic                     0.4.13
rqt_web                       0.4.10
rviz                          1.14.14
scipy                         1.10.0
sensor-msgs                   1.13.1
setuptools                    65.6.3
six                           1.16.0
smach                         2.5.0
smach-ros                     2.5.0
smclib                        1.8.6
tabulate                      0.9.0
termcolor                     2.2.0
terminaltables                3.1.10
tf                            1.13.2
tf-conversions                1.13.2
tf2-geometry-msgs             0.7.5
tf2-kdl                       0.7.5
tf2-py                        0.7.5
tf2-ros                       0.7.5
topic-tools                   1.15.14
torch                         1.12.1+cu113
torchaudio                    0.12.1+cu113
torchvision                   0.13.1+cu113
typing_extensions             4.4.0
urllib3                       1.26.14
wcwidth                       0.2.6
wheel                         0.37.1
xacro                         1.14.13
yapf                          0.32.0
zipp                          3.12.1


RunningLeon commented 1 year ago

@Fafa-DL Hi, according to the docs, mask2former is not supported in mmdeploy. The error indicates that your model's forward pass uses torch.einsum, which cannot be exported with ONNX opset 11. You could replace those calls with other operations and try again. Here is one example of such a rewrite: https://github.com/open-mmlab/mmdeploy/blob/bc1b6440cd4e9eb85d02a706bbe263aeec4e66f9/mmdeploy/codebase/mmseg/models/decode_heads/ema_head.py#L29-L45

Fafa-DL commented 1 year ago

@RunningLeon Hi, thanks. torch.einsum is used in mask2former_head.py. Should it be modified in the same way as the example you provided? https://github.com/open-mmlab/mmsegmentation/blob/916ed2b2e208bf88dfd5180c79baa9883abb8f00/mmseg/models/decode_heads/mask2former_head.py#L131-L163

Fafa-DL commented 1 year ago

@RunningLeon I made the following modification

cls_score = F.softmax(mask_cls_results, dim=-1)[..., :-1]
mask_pred = mask_pred_results.sigmoid()
# seg_logits = torch.einsum('bqc, bqhw->bchw', cls_score, mask_pred)

cls_score = cls_score.transpose(1, 2)
cls_score = cls_score.unsqueeze(3)
cls_score = cls_score.unsqueeze(4)
mask_pred = mask_pred.unsqueeze(1)
seg_logits = torch.matmul(cls_score, mask_pred)

but I got RuntimeError: CUDA out of memory. while executing seg_logits = torch.matmul(cls_score, mask_pred)
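
For reference, the contraction in the commented-out einsum ('bqc, bqhw->bchw') sums over the query axis q, so it can be written as one batched matrix product over flattened spatial dimensions. The sketch below is only an illustration under stated assumptions: cls_score has shape (b, q, c) and mask_pred has shape (b, q, h, w), as the einsum subscripts imply, and the function name seg_logits_without_einsum is hypothetical, not part of mmsegmentation or mmdeploy. It avoids the 5-D broadcast used in the modification above, but it has not been checked here against the rest of the mask2former export.

```python
import torch


def seg_logits_without_einsum(cls_score: torch.Tensor,
                              mask_pred: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: same result as
    torch.einsum('bqc, bqhw->bchw', cls_score, mask_pred).

    Assumes cls_score is (b, q, c) and mask_pred is (b, q, h, w).
    """
    b, q, h, w = mask_pred.shape
    # (b, q, c) -> (b, c, q)
    cls_t = cls_score.transpose(1, 2)
    # (b, q, h, w) -> (b, q, h*w)
    mask_flat = mask_pred.flatten(2)
    # Batched matmul sums over q: (b, c, q) @ (b, q, h*w) -> (b, c, h*w)
    seg_flat = torch.bmm(cls_t, mask_flat)
    # Restore the spatial dimensions: (b, c, h, w)
    return seg_flat.reshape(b, -1, h, w)


if __name__ == "__main__":
    # Small random tensors, just to compare against einsum.
    cls_score = torch.rand(2, 5, 3)
    mask_pred = torch.rand(2, 5, 4, 6)
    expected = torch.einsum('bqc, bqhw->bchw', cls_score, mask_pred)
    result = seg_logits_without_einsum(cls_score, mask_pred)
    print(torch.allclose(result, expected, atol=1e-6))
```

The largest tensor this version creates is the (b, c, h, w) output itself, whereas broadcasting (b, c, q, 1, 1) against (b, 1, q, h, w) implies an operand with a (b, c, q) batch shape and never reduces over q.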

RunningLeon commented 1 year ago

Maybe you could try a 512x512 input or device=cpu.

Fafa-DL commented 1 year ago

In other words, does using this method increase memory usage and make the calculation slower?
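
A rough size estimate helps with the memory question. The unsqueeze-and-matmul variant broadcasts (b, c, q, 1, 1) against (b, 1, q, h, w), which implies a (b, c, q, h, w)-sized tensor, while the einsum result is only (b, c, h, w). The numbers below are assumptions for illustration, not values reported in this thread: batch size 1, 19 classes (Cityscapes), 100 queries, float32, and masks already resized to the 512x1024 input. The helper tensor_bytes is hypothetical.

```python
import math


def tensor_bytes(shape, bytes_per_element=4):
    """Hypothetical helper: bytes needed for a dense tensor of this shape
    (4 bytes per element assumes float32)."""
    return math.prod(shape) * bytes_per_element


# Assumed sizes: batch, classes, queries, height, width.
b, c, q, h, w = 1, 19, 100, 512, 1024

broadcast_bytes = tensor_bytes((b, c, q, h, w))  # 5-D broadcast shape
output_bytes = tensor_bytes((b, c, h, w))        # einsum output shape

print(f"broadcast (b, c, q, h, w): {broadcast_bytes / 2**30:.2f} GiB")
print(f"output    (b, c, h, w):    {output_bytes / 2**20:.0f} MiB")
print(f"ratio: {broadcast_bytes // output_bytes}x")

# The same estimate at 512x512 halves the spatial size.
print(f"broadcast at 512x512: {tensor_bytes((b, c, q, 512, 512)) / 2**30:.2f} GiB")
```

Under these assumptions the broadcast shape is q times larger than the output (about 3.7 GiB versus 38 MiB), so the out-of-memory error is plausible on a GPU that is already holding the model. Reducing the input to 512x512 halves the estimate but does not remove the factor of q.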

github-actions[bot] commented 1 year ago

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

github-actions[bot] commented 1 year ago

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.