microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

.NET onnxruntime Protobuf parsing failed #20133

Open rghavimi opened 7 months ago

rghavimi commented 7 months ago

Describe the issue

Coming across a similar error when trying to use an ONNX model generated by optimum:

optimum-cli export onnx --model avsolatorio/GIST-small-Embedding-v0 --optimize O3 --device cpu --library-name transformers --trust-remote-code gist_quantized_fp16_O3_vcpu_tranformers/

Transformers version: 4.39.1
Platform: Linux-5.15.0-1055-aws-x86_64-with-glibc2.35
Python version: 3.10.12
Huggingface_hub version: 0.22.1
Safetensors version: 0.4.2
Accelerate version: 0.23.0
Accelerate config: not found
PyTorch version (GPU?): 2.0.1+cu118 (True)
Tensorflow version (GPU?): 2.14.0 (True)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
onnx==1.15.0
onnxruntime-gpu==1.17.1
optimum==1.18.0

Getting the following error when opening an InferenceSession using the model:

[ErrorCode:InvalidProtobuf] Load model from ... failed:Protobuf parsing failed.

To reproduce

Try opening an InferenceSession using the model (too large to attach here, unfortunately); the exception is thrown. Netron opens the same file without issue.
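
The repro goes through the C# API; as an illustration only, here is a minimal Python sketch of the equivalent failing call (assumes the exported model is saved locally as model.onnx):

import onnxruntime

# Sketch: constructing a session from the exported model. On this model
# the session constructor fails at load time with a Protobuf-parsing error.
sess = onnxruntime.InferenceSession(
    "model.onnx",
    providers=["CPUExecutionProvider"],
)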

Expected behavior

The file should open, since Netron loads it correctly and detects the "ONNX v4" format; failing that, ONNX Runtime should at least produce a more descriptive error.

[screenshot: Netron properties pane showing the model detected as ONNX v4 format]

Urgency

Have a project deadline that depends on this.

Platform

Windows

OS Version

Windows 10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.1

ONNX Runtime API

C#

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

yuslepukhin commented 7 months ago

How large is your model? Can you share it via a file sharing service?

GeorgeS2019 commented 7 months ago

label this issue csharp?

rghavimi commented 7 months ago

Here is the model: https://www.transfernow.net/dl/20240329yKmybLR9

yuslepukhin commented 7 months ago

The C# code does not parse the protobuf; that is done by the Google protobuf library. I will investigate whether we can produce a better error message. That said, here is some Python code that also runs the onnx library checker and points to a problem with the model even before InferenceSession is invoked.

import onnx
import onnxruntime

model_path = 'model.onnx'

# The onnx checker validates the model before ONNX Runtime ever sees it,
# so it surfaces problems that would otherwise show up as a parse error.
model = onnx.load(model_path)
onnx.checker.check_model(model)

session_options = onnxruntime.SessionOptions()
session_options.log_severity_level = 4
session_options.enable_mem_reuse = False
# InferenceSession expects a file path or serialized bytes, not a ModelProto.
sess = onnxruntime.InferenceSession(
    model_path,
    sess_options=session_options,
    providers=["CPUExecutionProvider"],
)

The LayerNormalization op appeared in ONNX starting with opset 17:

https://github.com/onnx/onnx/blob/main/docs/Operators.md#LayerNormalization

Traceback (most recent call last):
  File "d:\dev\data\BadProtobuf\ort_optimize.py", line 8, in <module>
    onnx.checker.check_model(model)
  File "c:\Python\lib\site-packages\onnx\checker.py", line 179, in check_model
    C.check_model(
onnx.onnx_cpp2py_export.checker.ValidationError: No Op registered for LayerNormalization with domain_version of 11

==> Context: Bad node spec for node. Name: LayerNormalization OpType: LayerNormalization
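
The opset requirement can be confirmed from the onnx package itself; a small sketch (get_schema returns the latest registered schema, and since_version reports the opset in which it was introduced):

import onnx.defs

# Latest registered schema for LayerNormalization in the default ONNX domain.
schema = onnx.defs.get_schema("LayerNormalization")
print(schema.since_version)  # expected to print 17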

yuslepukhin commented 7 months ago

Your model imports ONNX domain opset 11:

[screenshot: Netron view of the model's imports, showing ONNX domain opset 11]
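
The same information is visible without Netron by reading the opset imports off the ModelProto; a quick sketch with the onnx package:

import onnx

model = onnx.load("model.onnx")
for opset in model.opset_import:
    # An empty domain string denotes the default ai.onnx domain.
    print(opset.domain or "ai.onnx", opset.version)
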
yuslepukhin commented 7 months ago

I will check with the converter folks to see what may have gone wrong.
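
Until that is resolved, one possible workaround (untested here; assumes optimum-cli's --opset flag, and the output directory name is illustrative) is to re-export with an opset that actually contains LayerNormalization:

optimum-cli export onnx --model avsolatorio/GIST-small-Embedding-v0 --opset 17 --optimize O3 --device cpu --library-name transformers --trust-remote-code gist_opset17_O3/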