openvinotoolkit / mmdetection

OpenVINO Training Extensions Object Detection
https://github.com/opencv/openvino_training_extensions

Fix ONNX -> IR data type conversion issue #345

Closed eugene123tw closed 2 years ago

eugene123tw commented 2 years ago

Issue

[ ERROR ]  -------------------------------------------------
[ ERROR ]  ----------------- INTERNAL ERROR ----------------
[ ERROR ]  Unexpected exception happened.
[ ERROR ]  Please contact Model Optimizer developers and forward the following information:
[ ERROR ]  While validating ONNX node '<Node(BatchNormalization): BatchNormalization_66>':
Check 'element::Type::merge(et_result, et_result, inp.m_element_type)' failed at core/src/validation_util.cpp:545:
While validating node 'v5::BatchNormInference BatchNormInference_1708 (612[0]:f16{1,32,512,512}, backbone.features.init_block.conv.bn.weight[0]:f32{32}, backbone.features.init_block.conv.bn.bias[0]:f32{32}, backbone.features.init_block.conv.bn.running_mean[0]:f32{32}, backbone.features.init_block.conv.bn.running_var[0]:f32{32}) -> (dynamic...)' with friendly_name 'BatchNormInference_1708':
Input element types do not match.

[ ERROR ]  Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/openvino/tools/mo/main.py", line 529, in main
    ret_code = driver(argv)
  File "/usr/local/lib/python3.8/site-packages/openvino/tools/mo/main.py", line 485, in driver
    graph, ngraph_function = prepare_ir(argv)
  File "/usr/local/lib/python3.8/site-packages/openvino/tools/mo/main.py", line 390, in prepare_ir
    ngraph_function = moc_pipeline(argv, moc_front_end)
  File "/usr/local/lib/python3.8/site-packages/openvino/tools/mo/moc_frontend/pipeline.py", line 147, in moc_pipeline
    ngraph_function = moc_front_end.convert(input_model)
RuntimeError: While validating ONNX node '<Node(BatchNormalization): BatchNormalization_66>':
Check 'element::Type::merge(et_result, et_result, inp.m_element_type)' failed at core/src/validation_util.cpp:545:
While validating node 'v5::BatchNormInference BatchNormInference_1708 (612[0]:f16{1,32,512,512}, backbone.features.init_block.conv.bn.weight[0]:f32{32}, backbone.features.init_block.conv.bn.bias[0]:f32{32}, backbone.features.init_block.conv.bn.running_mean[0]:f32{32}, backbone.features.init_block.conv.bn.running_var[0]:f32{32}) -> (dynamic...)' with friendly_name 'BatchNormInference_1708':
Input element types do not match.

[ ERROR ]  ---------------- END OF BUG REPORT --------------
[ ERROR ]  -------------------------------------------------

Fix

When the PyTorch model is exported, at some point data with a different data type gets written into the attributes of the ONNX graph.

For some nodes (e.g. BatchNormalization) there is a mismatch between the data type of the input tensor and that of the stored parameters. There are problems with other ops in the graph as well: for example, a Cast op is supposed to convert the data to float16, but at runtime the type of base_anchors was float32 and that type was written into the graph.
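
For reference, a minimal diagnostic sketch (not part of the fix) that lists initializers whose element type does not match the graph input type, which is how the BatchNormalization mismatch above shows up. The file name `model.onnx` is a placeholder:

```python
import onnx
from onnx import TensorProto

model = onnx.load("model.onnx")
graph = model.graph

# Element type of the first graph input (e.g. FLOAT16 for an FP16 export).
input_type = graph.input[0].type.tensor_type.elem_type

for init in graph.initializer:
    if init.data_type != input_type:
        print(
            f"{init.name}: initializer is "
            f"{TensorProto.DataType.Name(init.data_type)}, "
            f"graph input is {TensorProto.DataType.Name(input_type)}"
        )
```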

Semen has implemented a fix that patches the ONNX graph so the data types can be correctly processed by MO.

That said, this fix is a temporary workaround; the root cause should be properly fixed on the mmdetection side.
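
To illustrate the idea (this is a hedged sketch, not the actual patch from this PR): re-pack FP32 initializers as FP16 so that node input types match and MO can process the graph. Constant nodes whose value attribute was recorded as float32 (the base_anchors case mentioned above) would need similar handling of their attributes:

```python
import numpy as np
import onnx
from onnx import TensorProto, numpy_helper

model = onnx.load("model.onnx")

# Convert every FP32 initializer (BN weights/biases/running stats, baked-in
# constants) to FP16 so it matches the FP16 activations.
for i, init in enumerate(model.graph.initializer):
    if init.data_type == TensorProto.FLOAT:
        fp16_init = numpy_helper.from_array(
            numpy_helper.to_array(init).astype(np.float16), init.name
        )
        model.graph.initializer[i].CopyFrom(fp16_init)

onnx.save(model, "model_patched.onnx")
```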

eugene123tw commented 2 years ago

run ote_sdk tests

eugene123tw commented 2 years ago

run ote-test