Segmentation Fault when some of node outputs is empty

Chizkiyahu commented 11 months ago

Describe the issue

when onnx node have multiple output and some of them are empty names session.run raise Segmentation Fault

To reproduce

import onnx
import onnxruntime as ort
import numpy as np

def run(outputs):
    x = onnx.helper.make_tensor_value_info("x", onnx.TensorProto.FLOAT, (1, 20))
    y = onnx.helper.make_tensor_value_info("y", onnx.TensorProto.FLOAT, (1, 5))
    node = onnx.helper.make_node('Split', inputs=['x'], outputs=outputs, axis=-1, name='split', num_outputs=4)
    graph_def = onnx.helper.make_graph(nodes=[node], name='graph', inputs=[x], outputs=[y])
    model_def = onnx.helper.make_model(graph_def)
    onnx.checker.check_model(model_def, full_check=True)
    ort_session = ort.InferenceSession(model_def.SerializeToString())
    ort_session.run(None, {'x': np.random.randn(1, 20).astype(np.float32)})

o = ['1', 'y', '3', '4']
run(o)  # this works
print(f"node output: {o} pass")

o = ['', 'y', '', '']
print(f"node output: {o} fail")
run(o)  # this fails

Urgency

No response

Platform

Mac

OS Version

14.1.1

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.15.1

ONNX Runtime API

Python

Architecture

ARM64

Execution Provider

Default CPU

Execution Provider Library Version

No response

snnn commented 11 months ago

@edgchen1 , could you please help take a look? Probably the problem has been fixed in the main branch.

Chizkiyahu commented 11 months ago

@edgchen1 , could you please help take a look? Probably the problem has been fixed in the main branch.

I tested on 1.16.3 and is the same problem Did onnxruntime have nightly version?

snnn commented 11 months ago

Yes.

python -m pip install coloredlogs flatbuffers numpy packaging protobuf sympy
python -m pip install -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/ ort_nightly

snnn commented 11 months ago

I tried it. The problem still exists.

edgchen1 commented 11 months ago

I can repro in main. Looking into it. Thanks for the repro script.

edgchen1 commented 11 months ago

Here's the nullptr dereference (of output) when the output is not provided: https://github.com/microsoft/onnxruntime/blob/14a343441dcd530bec24e18e34c3c068993eb06c/onnxruntime/core/providers/cpu/tensor/split.cc#L161-L162

Note that the Split op is specified to have required outputs. https://github.com/onnx/onnx/blob/5be7f3164ba0b2c323813264ceb0ae7e929d2350/docs/Operators.md#split

An empty string as the output name would mean that the output is an unspecified optional output. https://github.com/onnx/onnx/blob/5be7f3164ba0b2c323813264ceb0ae7e929d2350/docs/IR.md?plain=1#L355

Given the spec, I think we can improve the way it fails but it would still be an error.

snnn commented 11 months ago

But we run model checker before that. Why model checker didn't catch the issue?

snnn commented 11 months ago

I agree with you that ONNX Runtime library should report an error instead of crash when it encountered a bad model.

edgchen1 commented 11 months ago

But we run model checker before that. Why model checker didn't catch the issue?

Good question.

@gramalingam, here we have variadic output (of Split) that has some missing (as if they were unspecified optional outputs) individual outputs. Should the model checker catch this?

microsoft / onnxruntime