microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Model saved with offline basic optimizations will not load - ShapeInferenceError #21325

Open ivberg opened 3 months ago

ivberg commented 3 months ago

Describe the issue

We were attempting to use offline optimization to optimize a model, save it, and then load it later with optimizations disabled. https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#onlineoffline-mode

However, the model saved with basic optimizations fails to load, producing a ShapeInferenceError.

To reproduce

  1. We used the Phi Silica SLM model
  2. Save the model offline with basic optimizations - https://onnxruntime.ai/docs/performance/model-optimizations/graph-optimizations.html#basic-graph-optimizations
  3. In this case we also wanted the external initializer data saved offline, so these were the APIs used:
    session_options.SetGraphOptimizationLevel(ORT_ENABLE_BASIC); 
    const ORTCHAR_T* optimized_model_path = ORT_TSTR("model.extdata.basic_opt.onnx");
    session_options.SetOptimizedModelFilePath(optimized_model_path);
    session_options.AddConfigEntry(kOrtSessionOptionsOptimizedModelExternalInitializersFileName, "model.extdata.basic_opt.onnx.data");
    session_options.AddConfigEntry(kOrtSessionOptionsOptimizedModelExternalInitializersMinSizeInBytes, "10");
  4. In separate code, load the saved offline model with optimizations disabled:
    session_options.SetGraphOptimizationLevel(ORT_DISABLE_ALL);
    session = Ort::Session(env, filemodelpath, session_options);
  5. The program will exit without surfacing an error (or at least I didn't see one)
  6. If you enable logging you will see errors like this "Node (/model/attn_mask_reformat/input_ids_subgraph/Reshape) Op (Reshape) [ShapeInferenceError] Cannot parse data from external tensors. Please load external data into raw data for tensor: /model/attn_mask_reformat/input_ids_subgraph/Concat_3/output_0"
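For context, the steps above can be sketched as one program. This is an illustrative reconstruction, not the reporter's exact code: the input path `model.onnx`, the `main` scaffolding, and the `Ort::Env` setup are assumptions; the session option calls and config keys are taken from the report and the ONNX Runtime 1.18.1 C++ API.

```cpp
// Sketch of the repro, assuming ONNX Runtime 1.18.1.
// File paths are illustrative placeholders.
#include <onnxruntime_cxx_api.h>
#include "onnxruntime_session_options_config_keys.h"

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "repro");

  // Phase 1: save the model offline with basic optimizations,
  // writing large initializers to an external data file.
  {
    Ort::SessionOptions so;
    so.SetGraphOptimizationLevel(ORT_ENABLE_BASIC);
    so.SetOptimizedModelFilePath(ORT_TSTR("model.extdata.basic_opt.onnx"));
    so.AddConfigEntry(kOrtSessionOptionsOptimizedModelExternalInitializersFileName,
                      "model.extdata.basic_opt.onnx.data");
    so.AddConfigEntry(kOrtSessionOptionsOptimizedModelExternalInitializersMinSizeInBytes,
                      "10");
    // Creating the session runs the optimizer and writes both files to disk.
    Ort::Session save_session(env, ORT_TSTR("model.onnx"), so);
  }

  // Phase 2: later, load the pre-optimized model with optimizations disabled.
  // Per the report, this load fails with a ShapeInferenceError.
  {
    Ort::SessionOptions so;
    so.SetGraphOptimizationLevel(ORT_DISABLE_ALL);
    Ort::Session load_session(env, ORT_TSTR("model.extdata.basic_opt.onnx"), so);
  }
  return 0;
}
```

Phase 1 and phase 2 would normally run in separate processes, as in the report; they are combined here only to keep the sketch self-contained.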

Urgency

This blocks performance investigations, experiments, and attempts to work around other ONNX Runtime bugs.

Platform

Windows

OS Version

11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.18.1

ONNX Runtime API

C++

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

yufenglee commented 3 months ago

Looks like a model serialization issue with external data file. @pranavsharma, could you please help take a look?