Open garymm opened 2 years ago
1) Slice(x, starts = [-1], ends = [INT_MIN], steps = [-1]) == x
When you say == x
above in the expected behavior, I think you mean Reverse(x)
because essentially that is what that Slice operation is performing - reversing x.
2) When I did the Slice-10 implementation, the spec had this ambiguous comment here: "For slicing to the end of a dimension with unknown size, it is recommended to pass in INT_MAX". The (obvious but unstated) assumption is that this is for slicing forward and by extension, INT_MIN should be used while slicing in reverse. But since this wasn't explicitly stated in the spec, the implementation was written such that INT_MAX
was a "special" end value (like Numpy's None
) to indicate the "end" of a dimension (while slicing in any direction). I think I asked for clarification on this when the op was versioned for opset-11 (I will dig up the old ONNX PR) and that is why we see this line in the spec now- "For slicing to the end of a dimension with unknown size, it is recommended to pass in INT_MAX when slicing forward and 'INT_MIN' when slicing backward."
I guess we could clean up that special handling of INT_MAX
while slicing backwards and make life easier or we could make this ugly special casing limited to opset-10 since opset-11 and onwards no longer has this ambiguity
3) I haven't yet tested to see if the behavior is correct with INT_MIN, but that should be checked as well.
This should work just fine. It is the very next test in the list - https://github.com/microsoft/onnxruntime/blob/39d1b9e1c176c29d9d6814a8625e675ab39f73dc/onnxruntime/test/providers/cpu/tensor/slice_op.test.cc#L490
Special-casing with associated comment: https://github.com/microsoft/onnxruntime/blob/e23a22451880a1ba1f46b2af6a71bb4a1079108b/onnxruntime/core/providers/cpu/tensor/slice_helper.h#L139
When you say == x above in the expected behavior, I think you mean Reverse(x)
Yes! Updated the issue desc.
I agree with your interpretation of the Slice-10 spec. If we're agreed then I guess we should also:
When you say, "change the ONNX shape inference code to match", do you mean change the ONNX shape inference for opset-10 Slice alone to treat INT_MAX
as a special value (whatever be the direction of slicing) just like the runtime ?
Yes.
Sure. I guess that works.
@garymm : I will update Slice's behavior so that the "special-casing" of INT_MAX is only done for Slice-V10.
I read the latest Slice's spec and I have a couple of things to call out (Please correct me if I am wrong):
https://github.com/onnx/onnx/issues/4115 https://github.com/onnx/onnx/issues/4116
CC: @jcwchen
@hariharans29 discussed this with @gramalingam and we decided that the special casing in ORT is wrong even for Slice-10. The spec doesn't really say that INT_MAX should be treated as a special flag. The behavior going all the way to the end of an axis is meant to be a consequence of setting an index that's larger than all the valid indices.
Special handling of INT_MAX could be problematic if the end
is int32 and the axis has more than INT32_MAX elements.
So we think the behavior should be consistent for all op sets, and that the current behavior is wrong for negative step.
Can you please update the implementation in accordance with this interpretation?
Sure.
I don't feel very strongly about this either way. It was just one of those spot decisions taken 3 years ago while implementing an ambiguously worded spec. :) I will fix the special casing while reverse slicing.
The spec doesn't really say that INT_MAX should be treated as a special flag.
I think "_For slicing to the end of a dimension with unknown size, it is recommended to pass in INTMAX" in the opset 10 Slice spec is open to interpretation, as that sounds like a special flag to me given it doesn't qualify the statement with any mention of the direction. Opset 11 resolved that ambiguity.
Whilst making the behavior consistent for all op sets is desirable, doesn't that ignore potentially breaking an existing model that relies on the current opset 10 behavior? Products using ORT that have backwards compatibility guarantees may disagree with a change to that.
Both tensorflow-onnx and torch.onnx seem to have identical behavior for opset versions 10 and 11, so it seems pretty unlikely that anyone is relying on this difference to me. And even if someone is relying on it, IIUC they could also be relying on it for opset version 11 because the current ORT code doesn't differentiate. So if you want to maintain backwards compatibility, you're going to have to explicitly go against the ONNX spec.
@skottmckay "Products using ORT that have backwards compatibility guarantees may disagree with a change to that." - I believe you are referring to WinML ? @pranavsharma : Should we check with WinML to see if this change in behavior to Slice-10 (which was based on my interpretation of the ambiguous V10 spec of Slice) will break anyone ?
If it is agreed that this change shouldn't affect anyone, #11132 is ready for a review.
WinML is particularly sensitive to breaking changes esp. given that a version of ORT is in the OS now. I don't think anyone can authoritatively say if a customer will be impacted by this. I would err on the side of caution though and be conservative.
Fully maintaining backwards-compatibility means clearly diverging from the ONNX spec for opset version 11, which seems bad to me. And if you're not going to break BC for version 11, it seems better to me to just make it consistent. Just my two cents, you're the owners of this, so do as you think is best.
Describe the bug ORT's Slice tests assert that
Slice(x, starts = [-1], ends = [INT_MAX], steps = [-1]) == Reverse(x)
.ONNX spec says:
I haven't yet tested to see if the behavior is correct with
INT_MIN
, but that should be checked as well.Urgency None.
To Reproduce Build ORT and run
onnxruntime_test_all --gtest_filter='SliceTest.Slice1D_ReverseAllAxes_1'
. Currently the test triggers a shape inference failure which results in a warning being logged.Expected behavior
Slice(x, starts = [-1], ends = [INT_MAX], steps = [-1]) == <empty tensor>
Slice(x, starts = [-1], ends = [INT_MIN], steps = [-1]) == Reverse(x)
Both cases should be tested in ORT and not result in shape inference failures.
Additional context Add any other context about the problem here. If the issue is about a particular model, please share the model details as well to facilitate debugging.