Closed kernhanda closed 2 years ago
Thanks for reporting. With such a through report, it was easy to spot this - this is a clear cut bug in onnx2c. The branch mentioned above fixes this. I want to see if I can make a single-node regression test for this before merging into master.
Slightly related: in order to be able to replicate this issue, I had to fix another bug in the Concat node (patch in the branch above) that caused onnx2c to error out before hitting this. Do you have local changes to onnx2c?
Wow, keen eye! I totally missed the obvious bug fix :)
Good point about the other changes in concat. Here they are
diff --git a/src/nodes/concat.h b/src/nodes/concat.h
index d158bf1..eea1ad6 100644
--- a/src/nodes/concat.h
+++ b/src/nodes/concat.h
@@ -85,7 +85,7 @@ namespace toC {
void resolveOutput(const std::vector<const Tensor *> &inputs, std::vector<Tensor *> &outputs) override {
node_inputs = inputs;
- if (inputs.size() < 2)
+ if (inputs.size() < 1)
ERROR("Wrong number of inputs to Concat");
if (axis < 0)
@@ -98,7 +98,7 @@ namespace toC {
size_t i, j;
std::vector<int> dims = inputs[0]->data_dim;
for (i = 0; i < input_count; i++) {
- for (j = 1; j < dims.size(); j++) {
+ for (j = 0; j < dims.size(); j++) {
if (dims[j] != inputs[i]->data_dim[j] && (int) j != axis)
ERROR("Concat's input tensors must have the same shape, except for the "
"dimension size of the axis to concatenate on.");
Rationate:
Thanks for the diff.
I didn't find an easy way to add a minimized test for this, so I merged this branch issue_15
to master without a test. Looking at the onnx2c test suite there already was a slice_INT64_MAX
test, but that was for ONNX opset 10 or later, where the start/end/axis had moved from attributes to input tensors, so this problem did not repeat... I could not find with quick searching an api call to set the ONNX opset version with the script using scailable onnx which I use to make the unit tests, so I let this one go without a regression test.
About the changes to concat:
I was trying to run onnx2c on a small model of mine (coco_3M.zip), but I ran into the following error:
The same model passes ONNX's model checker, renders fine in netron.app, and executes correctly through ONNXRuntime. I traced the differences to the Slice nodes in this model, which has an
ends
attribute of9223372036854775807
, aka2**63 - 1
, akastd::numeric_limits<int64_t>::max()
. Protobuf for some reason parses this number as-1
, even though the value is being stored in an int64_t. Not sure why that is, but I thought you could take a look.This issue can be seen starting at node
Slice_33
.onnx2c
calculates the output as1x16x14x16
because the ends is read as [-1] instead of [2**63 - 1].The following python script can be used to update models for a workaround: