microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Get wrong constant folding result when calling InferenceSession #8422

Open yjydfnhc opened 3 years ago

yjydfnhc commented 3 years ago

Describe the bug
I created an ONNX model and saved it with save_as_external_data=True, then loaded it back into memory with load_external_data=False and passed model.SerializeToString() to onnxruntime.InferenceSession() to do constant folding. The generated ONNX file of the folded model has wrong values for the folded part.

For example, in the model below, a Constant node produces X, node_cf computes Z1 = Gemm(w, X), and node_sum computes Z = Sum(Z1, V). Since w and X are both 3x3 all-ones matrices, folding node_cf should produce a tensor Z1 filled with 3.0. Instead, the folded model contains Z1 filled with 1.0, or with similar wrong values.
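For reference, the expected folded value can be checked directly with NumPy: a two-input Gemm with default attributes computes Z1 = w @ X, and both operands are 3x3 all-ones matrices, so every element of Z1 is 3.0.

import numpy as np

# Expected folded value of Z1: Gemm(w, X) with w = X = ones((3, 3))
expected_z1 = np.ones((3, 3), np.float32) @ np.ones((3, 3), np.float32)
print(expected_z1)  # every element is 3.0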

To Reproduce
Example code:

import numpy as np
import onnx
import onnxruntime as rt
from onnx import helper
from onnx import TensorProto

# ======== 1. Create an ONNX model

dtype = np.float32
x = np.ones((3, 3), dtype)

data_w = np.ones((3, 3), dtype)
w = helper.make_tensor(
    name='w',
    data_type=TensorProto.FLOAT,
    dims=data_w.shape,
    vals=data_w.flatten().astype(dtype).tobytes(),
    raw=True,
)

V = helper.make_tensor_value_info('V', TensorProto.FLOAT, [3, 3])
Z = helper.make_tensor_value_info('Z', TensorProto.FLOAT, [3, 3])

# Create node (NodeProto)
X = helper.make_node(
    'Constant',
    inputs=[],
    outputs=['X'],
    value=helper.make_tensor(
        name='const_tensor_x',
        data_type=TensorProto.FLOAT,
        dims=x.shape,
        vals=x.flatten().astype(np.float32),
    ),
)
# Gemm node to be constant-folded: both of its inputs are constants
node_cf = helper.make_node(
    'Gemm',
    ['w', 'X'],   # inputs
    ['Z1'],       # outputs
    name='tocf',
)
node_sum = helper.make_node(
    'Sum',
    ['Z1', 'V'],  # inputs
    ['Z'],        # outputs
)
# Create the graph (GraphProto)
graph_def = helper.make_graph(
    [X, node_cf, node_sum],  # nodes
    'test-model',            # name
    [V],                     # inputs
    [Z],                     # outputs
    initializer=[w],
)
model_def = helper.make_model(graph_def, producer_name='onnx-example')

# ======== 2. Save the ONNX model
# (0) Save the model with embedded data (gives the correct result).
path_full = 'model_full.onnx'
onnx.save_model(model_def, path_full, save_as_external_data=False)
# (1) Save the model with separate external data files (gives the wrong result).
path_sep = 'model_sep.onnx'
onnx.save_model(model_def, path_sep,
                save_as_external_data=True,
                all_tensors_to_one_file=False,
                size_threshold=0)

# ======== 3. Load and do constant folding
# (1) Wrong result: the generated file "model_sep_cf_no_data.onnx" has tensor Z1 with wrong values.
model = onnx.load(path_sep, load_external_data=False)  # without external data
sess_options = rt.SessionOptions()
sess_options.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_BASIC
sess_options.optimized_model_filepath = path_sep.replace(".onnx", "_cf_no_data.onnx")
session = rt.InferenceSession(model.SerializeToString(), sess_options)  # wrong: Z1 filled with 1.0

# (2) Expected result: the generated file "model_full_cf.onnx" has tensor Z1 with correct values.
model2 = onnx.load(path_full)
sess_options2 = rt.SessionOptions()
sess_options2.graph_optimization_level = rt.GraphOptimizationLevel.ORT_ENABLE_BASIC
sess_options2.optimized_model_filepath = path_full.replace(".onnx", "_cf.onnx")
session2 = rt.InferenceSession(model2.SerializeToString(), sess_options2)  # as expected: Z1 filled with 3.0
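To make the difference concrete, a small sketch (not part of the original report) can read the folded Z1 tensor back out of the two optimized files; it assumes the folded constant appears as a graph initializer named 'Z1' in the saved model:

import onnx
from onnx import numpy_helper

# Hedged check: print the folded Z1 tensor from each optimized file.
# load_external_data=False avoids touching any leftover external-data references.
for path in ['model_full_cf.onnx', 'model_sep_cf_no_data.onnx']:
    m = onnx.load(path, load_external_data=False)
    z1 = next((t for t in m.graph.initializer if t.name == 'Z1'), None)
    if z1 is not None:
        print(path, numpy_helper.to_array(z1))
    else:
        print(path, 'no initializer named Z1 found')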

System information

Expected behavior

  1. Get the correct constant folding result.
  2. For a model with large tensors, load tensor data on demand, i.e. only load a tensor when it is actually required for processing.
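Until this is fixed, one possible workaround (assuming it is acceptable to load the external data up front, which of course defeats the on-demand loading requested in item 2) is to reattach the tensor data with onnx's load_external_data_for_model helper before serializing the model for the session; base_dir must point at the directory holding the external data files:

import onnx
import onnxruntime as rt
from onnx.external_data_helper import load_external_data_for_model

# Workaround sketch: load the graph without data, then pull in the external
# tensors explicitly so onnxruntime folds with the real values.
model = onnx.load('model_sep.onnx', load_external_data=False)
load_external_data_for_model(model, base_dir='.')  # directory containing the data files
session = rt.InferenceSession(model.SerializeToString())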

Thank you

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.