microsoft / onnxconverter-common

Common utilities for ONNX converters
MIT License

convert_float_to_float16() produces a model that causes ValidationError with onnx.checker.check_model() #256

Open SergeySandler opened 1 year ago

SergeySandler commented 1 year ago

With ONNX 1.13.1, an fp32 model passes onnx.checker.check_model() without warnings or errors:

```python
import onnx

onnx_model = onnx.load("/models/ResNet50.onnx")
onnx.checker.check_model(onnx_model)
```

but after the same model is converted to fp16,

```python
from onnxconverter_common import float16

onnx_model_fp16 = float16.convert_float_to_float16(onnx_model, keep_io_types=True)

import warnings
warnings.filterwarnings("ignore")
onnx.checker.check_model(onnx_model_fp16)
```

onnx.checker.check_model() raises a ValidationError:

```
ValidationError: Nodes in a graph must be topologically sorted, however input 'graph_input_cast_0' of node: name: StatefulPartitionedCall/resnet50/conv1_conv/Conv2D__6 OpType: Transpose is not output of any previous nodes.
```

The ResNet50.onnx model is attached (as a multi-part archive due to the maximum attachment size limit; rename ResNet50.z01.zip to ResNet50.z01, ResNet50.z02.zip to ResNet50.z02, and ResNet50.z03.zip to ResNet50.z03).

There is a separate issue, https://github.com/microsoft/onnxruntime/issues/15494, about onnxruntime failing catastrophically when attempting to load the fp16 model that does not pass validation.

ResNet50.zip ResNet50.z01.zip ResNet50.z02.zip ResNet50.z03.zip
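The ValidationError above says that the Cast node producing graph_input_cast_0 ends up after the Transpose node that consumes it. One possible workaround, sketched below and not an official fix, is to topologically re-sort graph.node after conversion so the checker accepts the ordering. It uses only the onnx protobuf API and assumes onnx_model_fp16 is the converted model from the snippet in the report; subgraphs are ignored.

```python
import onnx

def topologically_sort(graph):
    """Reorder graph.node so every node appears after the producers of its
    inputs (simple Kahn-style sort). Sketch only; ignores subgraphs."""
    # Names that are available before any node runs.
    available = {i.name for i in graph.input}
    available.update(init.name for init in graph.initializer)
    available.add("")  # optional (empty) inputs

    remaining = list(graph.node)
    ordered = []
    while remaining:
        progressed = False
        for node in list(remaining):
            if all(name in available for name in node.input):
                ordered.append(node)
                available.update(node.output)
                remaining.remove(node)
                progressed = True
        if not progressed:
            raise RuntimeError("cycle or dangling input in graph")

    del graph.node[:]
    graph.node.extend(ordered)

# onnx_model_fp16 is the converted model from the snippet above.
topologically_sort(onnx_model_fp16.graph)
onnx.checker.check_model(onnx_model_fp16)
```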

bilalsoomro commented 1 year ago

I'm facing the same issue. Are there any updates on this?

Here's a Colab notebook that reproduces the issue.

xiaowuhu commented 1 year ago

I suggest skipping check_model and trying inference anyway. Sometimes check_model does not pass even though inference works fine. The check_model() function belongs to the onnx/onnx codebase.
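For reference, a minimal sketch of what this suggestion could look like: skip onnx.checker.check_model() and just load and run the converted model with onnxruntime. The file paths, input shape, and execution provider here are assumptions; as the follow-up below shows, loading can still fail for this particular model.

```python
import numpy as np
import onnx
import onnxruntime as ort
from onnxconverter_common import float16

# Convert and save without calling onnx.checker.check_model().
onnx_model = onnx.load("/models/ResNet50.onnx")
onnx_model_fp16 = float16.convert_float_to_float16(onnx_model, keep_io_types=True)
onnx.save(onnx_model_fp16, "/models/ResNet50_fp16.onnx")

# Try inference directly; with keep_io_types=True the graph inputs stay float32.
sess = ort.InferenceSession("/models/ResNet50_fp16.onnx",
                            providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
dummy = np.random.rand(1, 224, 224, 3).astype(np.float32)  # input shape is an assumption
print(sess.run(None, {inp.name: dummy})[0].shape)
```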

MaanavD commented 1 year ago

@bilalsoomro @SergeySandler Hope that helps - Xiaowu wrote the relevant packages you are importing :)

nitinimage commented 1 year ago

I'm also facing the same issue while using convert_float_to_float16(onnx_model, keep_io_types = True)

bilalsoomro commented 1 year ago

> I suggest skipping check_model and trying inference anyway. Sometimes check_model does not pass even though inference works fine. The check_model() function belongs to the onnx/onnx codebase.

Hi @xiaowuhu, I tried to run inference; however, I get the following error.

```
Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from /content/resnet50_fp16.onnx failed:/onnxruntime_src/onnxruntime/core/graph/graph.cc:1274 onnxruntime::Graph::Graph(const onnxruntime::Model&, onnx::GraphProto, const std::unordered_map<std::basic_string, int>&, onnxruntime::Version, onnxruntime::IOnnxRuntimeOpSchemaCollectionPtr, onnxruntime::Graph, const onnxruntime::Node*, const onnxruntime::logging::Logger&, bool) [ONNXRuntimeError] : 1 : FAIL : Tensor element type mismatch. 10 != 1
```

Did you get a chance to try this reproducible example? Colab link
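As a side note on the error above: in onnx.TensorProto.DataType, 1 is FLOAT and 10 is FLOAT16, so the runtime found a tensor declared with one element type where the other was expected. A small sketch for inspecting the declared element types of the converted model's graph inputs, outputs, and initializers (the file path is taken from the error message and is otherwise an assumption):

```python
import onnx

model = onnx.load("/content/resnet50_fp16.onnx")

def dtype_name(elem_type):
    # Maps the numeric elem_type (e.g. 1 or 10) back to FLOAT / FLOAT16.
    return onnx.TensorProto.DataType.Name(elem_type)

for vi in list(model.graph.input) + list(model.graph.output):
    print("io  ", vi.name, dtype_name(vi.type.tensor_type.elem_type))

for init in model.graph.initializer:
    print("init", init.name, dtype_name(init.data_type))
```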

Hurray0 commented 10 months ago

> I'm also facing the same issue while using convert_float_to_float16(onnx_model, keep_io_types = True)

So do I. Removing keep_io_types=True fixes it for me.
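For completeness, a sketch of that workaround (the model path is an assumption). Without keep_io_types, the graph inputs and outputs are converted to float16 as well, so no extra Cast nodes are inserted at the graph boundary, which appears to avoid the ordering problem; callers must then feed float16 data.

```python
import numpy as np
import onnx
from onnxconverter_common import float16

onnx_model = onnx.load("/models/ResNet50.onnx")
# No keep_io_types: graph inputs/outputs become float16 too, so the converter
# does not insert boundary Cast nodes such as graph_input_cast_0.
onnx_model_fp16 = float16.convert_float_to_float16(onnx_model)
onnx.checker.check_model(onnx_model_fp16)

# Inputs must now be float16, e.g. x.astype(np.float16), before inference.
```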