Undefined behaviour in OneHot operator

Describe the issue

The CPU EP implementation of the OneHot operator performs an integer division when calculating the output shape (see the LLDB trace below). When the indices input has a shape with a zero dimension, the divisor is zero, and integer division by zero is undefined behaviour in C++. On gcc/Linux I get a runtime floating point exception, whereas on macOS/clang 0 is propagated through.

The ONNX specification for OneHot states:

The rank of the output tensor will be one greater than the rank of the input tensor.

I believe that for input indices of shape (0,) and a depth of value k, the correct output is a tensor of shape (0, k).
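As a sanity check on that expected shape, here is a minimal NumPy sketch of OneHot semantics. The function `one_hot_reference` is a hypothetical helper written for this issue, not part of ONNX or onnxruntime, and it only models shapes and the on/off values.

```python
import numpy as np

def one_hot_reference(indices, depth, values=(0, 1), axis=-1):
    """Hypothetical NumPy model of ONNX OneHot, for checking output shapes."""
    indices = np.asarray(indices, dtype=np.int64)
    off_value, on_value = values
    # Negative indices wrap around by depth; out-of-range ones match nothing.
    wrapped = np.where(indices < 0, indices + depth, indices)
    # Comparing against arange(depth) appends a new axis of length depth.
    hits = wrapped[..., np.newaxis] == np.arange(depth, dtype=np.int64)
    out = np.where(hits, on_value, off_value).astype(np.int64)
    # Move the new axis to the requested position in the output.
    return np.moveaxis(out, -1, axis)

empty = one_hot_reference(np.array([], dtype=np.int64), depth=2)
print(empty.shape)
```

With empty indices no division is needed at all: the output rank is still one greater than the input rank, and the zero-sized dimension simply carries through.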

To reproduce

import spox.opset.ai.onnx.v17 as op
from spox import argument, build, Tensor 
import numpy as np
import onnxruntime as ort

if __name__ == "__main__":
    x = argument(Tensor(np.int64, ("N",)))
    cats = [1, 2]
    y = op.one_hot(x, op.const([len(cats)], dtype="int64"), op.const([0, 1], dtype="int64"))
    mp = build({"x": x}, {"y": y})

    s = ort.InferenceSession(mp.SerializeToString())
    out = s.run(None, {"x": np.array([], dtype="int64").reshape(0,)})
    print(out)  # On macOS I get [array([], shape=(0, 2), dtype=int64)], whereas on Linux I get "Floating point exception (core dumped)"

Urgency

No response

Platform

Linux

OS Version

4.18

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.17.3

ONNX Runtime API

Python

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

LLDB step through on debug build on MacOS:

(lldb) n
Process 22526 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x000000015558db5c onnxruntime_pybind11_state.so`onnxruntime::PrepareOutputShape(indices=0x0000600002990000, depth_val=2, axis=-1, prefix_dim_size=0x000000016bc43030, suffix_dim_size=0x000000016bc43028, output_shape=0x000000016bc430d0) at onehot.cc:108:21
   105    for (int64_t i = 0; i < true_axis; ++i) {
   106      prefix_dim_size *= indices_dims[onnxruntime::narrow<size_t>(i)];
   107    }
-> 108    suffix_dim_size = indices_shape.Size() / prefix_dim_size;
   109
   110    return Status::OK();
   111  }
(lldb) frame variable prefix_dim_size
(int64_t &) prefix_dim_size = 0x000000016bc43030 (&prefix_dim_size = 0)
(lldb) frame variable *prefix_dim_size
(int64_t) *prefix_dim_size = 0
(lldb) n
Process 22526 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x000000015558db7c onnxruntime_pybind11_state.so`onnxruntime::PrepareOutputShape(indices=0x0000600002990000, depth_val=2, axis=-1, prefix_dim_size=0x000000016bc43030, suffix_dim_size=0x000000016bc43028, output_shape=0x000000016bc430d0) at onehot.cc:110:10
   107    }
   108    suffix_dim_size = indices_shape.Size() / prefix_dim_size;
   109
-> 110    return Status::OK();
   111  }
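
The trace shows `prefix_dim_size` equal to 0 at the point where it is used as a divisor. One possible way to avoid the division is sketched below as a Python model. This is not the onnxruntime source or a proposed patch: the function `prepare_output_shape` and its axis handling are assumptions inferred from the snippet above. It computes the suffix as a product of the remaining dims rather than as total size divided by prefix.

```python
from math import prod

def prepare_output_shape(indices_dims, depth, axis=-1):
    """Hypothetical Python model of the prefix/suffix computation."""
    output_rank = len(indices_dims) + 1
    true_axis = axis + output_rank if axis < 0 else axis
    if not 0 <= true_axis < output_rank:
        raise ValueError(f"axis {axis} out of range for output rank {output_rank}")
    # Product of the dims before the inserted axis (1 if there are none).
    prefix_dim_size = prod(indices_dims[:true_axis])
    # Multiply the remaining dims directly instead of dividing the total size
    # by prefix_dim_size, so a zero-sized dim never becomes a divisor.
    suffix_dim_size = prod(indices_dims[true_axis:])
    output_shape = (
        list(indices_dims[:true_axis]) + [depth] + list(indices_dims[true_axis:])
    )
    return prefix_dim_size, suffix_dim_size, output_shape

print(prepare_output_shape((0,), depth=2))  # prints (0, 1, [0, 2])
```

For non-empty shapes the product of the trailing dims equals the total size divided by the prefix, so the result is unchanged there; only the zero-dimension case differs.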
