schyun9212 / maskrcnn-benchmark

Converting maskrcnn-benchmark model to TorchScript or ONNX

failed to load exported roi module #9

Open schyun9212 opened 4 years ago

schyun9212 commented 4 years ago

🐛 Bug

The roi_heads module exports to ONNX successfully, but onnxruntime fails to load the exported model.

To Reproduce

import torch
import io
import unittest

from maskrcnn_benchmark.structures.image_list import ImageList

from demo.unittest.onnx.export import ONNXExportTester, ONNX_OPSET_VERSION, VALIDATION_TYPE, cfg, coco_demo, sample_features, sample_proposals, t_width, t_height

class ROITester(ONNXExportTester):
    def test_roi(self):
        from maskrcnn_benchmark.structures.bounding_box import BoxList

        class ROI(torch.nn.Module):
            def __init__(self):
                super(ROI, self).__init__()

            def forward(self, features, proposals):
                bbox, objectness = proposals

                proposals = BoxList(bbox, (t_width, t_height), mode="xyxy")
                proposals.add_field("objectness", objectness)

                _, result, _ = coco_demo.model.roi_heads(features, [proposals])

                result = (result[0].bbox,
                        result[0].get_field("labels"),
                        result[0].get_field("mask"),
                        result[0].get_field("scores"))

                return result

        roi = ROI()
        roi.eval()

        inputs, outputs = self.run_model(roi, (sample_features, sample_proposals))

        if VALIDATION_TYPE == "IO":
            onnx_io = io.BytesIO()
        else:
            onnx_io = "./demo/onnx_test_models/roi.onnx"

        torch.onnx.export(roi, inputs, onnx_io,
                            verbose=False,
                            do_constant_folding=False,
                            input_names=["feature_0", "feature_1", "feature_2", "feature_3", "feature_4", "bbox", "objectness"],
                            opset_version=ONNX_OPSET_VERSION)

        self.ort_validate(onnx_io, inputs, outputs)

if __name__ == '__main__':
    unittest.main()

Actual behavior

/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/structures/bounding_box.py:27: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if bbox.size(-1) != 4:
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/poolers.py:84: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for i, b in enumerate(boxes)
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/poolers.py:106: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  num_rois = len(rois)
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py:62: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  boxes_per_image = [len(box) for box in boxes]
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py:122: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  for j in range(1, num_classes):
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py:131: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  num_labels = len(boxlist_for_class)
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/roi_heads/box_head/inference.py:138: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  number_of_detections = len(result)
/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/maskrcnn_benchmark/modeling/roi_heads/mask_head/inference.py:47: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  boxes_per_image = [len(box) for box in boxes]
/home/jade/.pyenv/versions/maskrcnn-benchmark-1-3-1/lib/python3.7/site-packages/torch/onnx/symbolic_opset9.py:1881: UserWarning: Exporting aten::index operator of advanced indexing in opset 10 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.
  "If indices include negative values, the exported graph will produce incorrect results.")
E
======================================================================
ERROR: test_roi (__main__.ROITester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/demo/unittest/onnx/export/roi.py", line 48, in test_roi
    self.ort_validate(onnx_io, inputs, outputs)
  File "/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/demo/unittest/onnx/export/__init__.py", line 73, in ort_validate
    ort_session = onnxruntime.InferenceSession(onnx_io)
  File "/home/jade/.pyenv/versions/maskrcnn-benchmark-1-3-1/lib/python3.7/site-packages/onnxruntime/capi/session.py", line 25, in __init__
    self._load_model(providers)
  File "/home/jade/.pyenv/versions/maskrcnn-benchmark-1-3-1/lib/python3.7/site-packages/onnxruntime/capi/session.py", line 43, in _load_model
    self._sess.load_model(providers)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node () Op (ConstantOfShape) [ShapeInferenceError] Invalid shape value: 0

----------------------------------------------------------------------
Ran 1 test in 20.761s

FAILED (errors=1)

Environment

PyTorch version: 1.3.1
Is debug build: No
CUDA used to build PyTorch: 10.1.243

OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: GeForce RTX 2080 Ti
Nvidia driver version: 440.48.02
cuDNN version: Probably one of the following:
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries:
[pip3] numpy==1.18.1
[pip3] onnx==1.6.0
[pip3] onnxruntime==1.1.0
[pip3] Pillow==6.2.2
[pip3] torch==1.3.1
[pip3] torchvision==0.4.2
[conda] Could not collect

schyun9212 commented 4 years ago

feature_extractor contains some equality comparisons between float and integer values. They cause the following error when the exported model is loaded:

Traceback (most recent call last):
  File "/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/demo/unittest/onnx/export/feature_extractor.py", line 45, in test_feature_extractor
    self.ort_validate(onnx_io, inputs, outputs)
  File "/home/jade/Workspace/maskrcnn/maskrcnn-benchmark-1-3-1/demo/unittest/onnx/export/__init__.py", line 73, in ort_validate
    ort_session = onnxruntime.InferenceSession(onnx_io)
  File "/home/jade/.pyenv/versions/maskrcnn-benchmark-1-3-1/lib/python3.7/site-packages/onnxruntime/capi/session.py", line 25, in __init__
    self._load_model(providers)
  File "/home/jade/.pyenv/versions/maskrcnn-benchmark-1-3-1/lib/python3.7/site-packages/onnxruntime/capi/session.py", line 43, in _load_model
    self._sess.load_model(providers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (44) of operator (Equal) in node () is invalid.
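
One possible workaround, sketched under assumptions: as far as I know, the ONNX Equal operator before opset 11 only accepts bool, int32 and int64 inputs, so casting the float operand to int64 before the comparison should keep a float tensor out of the exported Equal node. The helper name `select_level` and the sample tensor below are hypothetical and only illustrate the cast; they are not the actual code in feature_extractor.

```python
import torch


def select_level(levels: torch.Tensor, level: int) -> torch.Tensor:
    """Return the indices of the entries in `levels` that equal `level`.

    Hypothetical helper for illustration, not code from maskrcnn-benchmark.
    """
    # Cast to int64 first so the `==` below compares integers with integers
    # instead of a float tensor with an integer.
    levels_int = levels.to(torch.int64)
    return torch.nonzero(levels_int == level).squeeze(1)


# Float tensor holding whole numbers, e.g. the result of a floor() call.
levels = torch.tensor([0.0, 1.0, 2.0, 1.0])
idx = select_level(levels, 1)
print(idx)
```

The cast truncates toward zero, so it is only safe when the float values are already whole numbers. I have not verified this against the exported feature_extractor graph.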
