microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License
14.24k stars 2.87k forks source link

Segmentation Fault when some of node outputs is empty #18601

Open Chizkiyahu opened 10 months ago

Chizkiyahu commented 10 months ago

Describe the issue

when onnx node have multiple output and some of them are empty names session.run raise Segmentation Fault

To reproduce

import onnx
import onnxruntime as ort
import numpy as np

def run(outputs):
    x = onnx.helper.make_tensor_value_info("x", onnx.TensorProto.FLOAT, (1, 20))
    y = onnx.helper.make_tensor_value_info("y", onnx.TensorProto.FLOAT, (1, 5))
    node = onnx.helper.make_node('Split', inputs=['x'], outputs=outputs, axis=-1, name='split', num_outputs=4)
    graph_def = onnx.helper.make_graph(nodes=[node], name='graph', inputs=[x], outputs=[y])
    model_def = onnx.helper.make_model(graph_def)
    onnx.checker.check_model(model_def, full_check=True)
    ort_session = ort.InferenceSession(model_def.SerializeToString())
    ort_session.run(None, {'x': np.random.randn(1, 20).astype(np.float32)})

o = ['1', 'y', '3', '4']
run(o)  # this works
print(f"node output: {o} pass")

o = ['', 'y', '', '']
print(f"node output: {o} fail")
run(o)  # this fails

Urgency

No response

Platform

Mac

OS Version

14.1.1

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

snnn commented 10 months ago

@edgchen1 , could you please help take a look? Probably the problem has been fixed in the main branch.

Chizkiyahu commented 10 months ago

@edgchen1 , could you please help take a look? Probably the problem has been fixed in the main branch.

I tested on 1.16.3 and is the same problem Did onnxruntime have nightly version?

snnn commented 10 months ago

Yes.

python -m pip install coloredlogs flatbuffers numpy packaging protobuf sympy
python -m pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ ort_nightly

image

snnn commented 10 months ago

I tried it. The problem still exists.

edgchen1 commented 10 months ago

I can repro in main. Looking into it. Thanks for the repro script.

edgchen1 commented 10 months ago

Here's the nullptr dereference (of output) when the output is not provided: https://github.com/microsoft/onnxruntime/blob/14a343441dcd530bec24e18e34c3c068993eb06c/onnxruntime/core/providers/cpu/tensor/split.cc#L161-L162

Note that the Split op is specified to have required outputs. https://github.com/onnx/onnx/blob/5be7f3164ba0b2c323813264ceb0ae7e929d2350/docs/Operators.md#split

An empty string as the output name would mean that the output is an unspecified optional output. https://github.com/onnx/onnx/blob/5be7f3164ba0b2c323813264ceb0ae7e929d2350/docs/IR.md?plain=1#L355

Given the spec, I think we can improve the way it fails but it would still be an error.

snnn commented 10 months ago

But we run model checker before that. Why model checker didn't catch the issue?

snnn commented 10 months ago

I agree with you that ONNX Runtime library should report an error instead of crash when it encountered a bad model.

edgchen1 commented 10 months ago

But we run model checker before that. Why model checker didn't catch the issue?

Good question.

@gramalingam, here we have variadic output (of Split) that has some missing (as if they were unspecified optional outputs) individual outputs. Should the model checker catch this?