Samsung / ONE

On-device Neural Engine

[compiler] Support the dynamic shape of StridedSlice #13932

Closed shs-park closed 1 month ago

shs-park commented 2 months ago

What

Let's try to support dynamic shape inference for the StridedSlice operation. The main goal of this work is to support inference of models with dynamic shapes, from #13697.

The image below shows part of a model we are looking into. The dimension marked with a red circle can be dynamic, e.g. <1x18x?x80>.

image

Conditions

The StridedSlice operation cuts a portion of a tensor along a specific axis, and since there are many cases, the implementation can be complicated.

In this issue, let's focus only on cases where the following conditions are satisfied.

(I removed most of the conditions below because of https://github.com/Samsung/ONE/issues/13932#issuecomment-2329169156)


(Added)

Regardless of whether the input is static or dynamic, if either begin or end is not constant, the output of StridedSlice should be converted to dynamic shape.

This means all the dimensions should be ?. i.e., if the input rank is 3, the shape of output will be <?x?x?>
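
The rule above can be sketched as follows. This is only a minimal Python illustration, not the ONE compiler's code: the names `infer_output_shape` and `static_infer`, and the convention of using `None` for `?`, are assumptions made for this example.

```python
from typing import Callable, List, Optional

# Hypothetical convention for this sketch: None stands for an unknown dimension ("?").
Shape = List[Optional[int]]


def infer_output_shape(
    input_rank: int,
    begin_is_const: bool,
    end_is_const: bool,
    static_infer: Callable[[], Shape],
) -> Shape:
    """Return an all-unknown shape unless both begin and end are constant."""
    if not (begin_is_const and end_is_const):
        # Non-constant begin or end: every output dimension becomes "?".
        return [None] * input_rank
    # Both constant: defer to whatever static inference already exists.
    return static_infer()


def format_shape(shape: Shape) -> str:
    """Render a shape in the <AxBxC> notation used in this thread."""
    return "<" + "x".join("?" if d is None else str(d) for d in shape) + ">"


# end is not constant, so the rank-3 output is fully dynamic.
print(format_shape(infer_output_shape(3, True, False, lambda: [1, 1, 1])))
```

Here the static inference is passed in as a callback only so that the sketch stays self-contained; it says nothing about how the real inference code is organized.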


Related from #13697

shs-park commented 2 months ago

Note

https://github.com/Samsung/ONE/pull/13914 was the first DRAFT to support this.

shs-park commented 2 months ago

I'm going to try to create a similar but simpler model to replicate this scenario. I'll share it with you when I'm done. 😅

qsunki commented 2 months ago

Does the "axis" in the sentence

the input dimension of axis is NOT dynamic

refer to the "axis" in the sentence below?

The StridedSlice operation cuts a portion of a tensor along a specific axis

shs-park commented 2 months ago

refer to the "axis" in the sentence below?

yes!

shs-park commented 2 months ago

I've modified this a little bit.

I found that if the dimension marked with the red circle is dynamic in https://github.com/Samsung/ONE/issues/13932#issue-2504473423, the end of StridedSlice cannot be a constant.

The end needs to be a node with a variable, in order to properly represent the unknown full range of that dimension.

For example:

In this case, the shape of the output should be <Nx3x2>.

However, since the end includes a variable N, it cannot be a constant. Therefore, the end should be a circle node (not const).


Note: the notation of <> means shape, [ ] means value

shs-park commented 2 months ago

Conclusion

Regardless of whether the input is static or dynamic, if either begin or end is not constant, the output of StridedSlice should be converted to dynamic shape.

This means all the dimensions should be ?. i.e., if the input rank is 3, <?x?x?>

shs-park commented 2 months ago

To-do

I made a test model for this. Please find the attached file below.

Test Model: strided_slice.zip

$ ./onecc -C strided_slice_model_dynamic.cfg

Unzip the file and run the above command; you will then get the .circle and .opt.circle files.

Just like the Pad operation in this issue, the output shape of the optimized circle model is wrong (<1x1x1>).

This should be <?x?x?>.

image

qsunki commented 2 months ago

When I tested this case with TensorFlow Lite,

Input shape: <Nx3x10>
begin: [0, 0, 0] <3>
end: [3, 3, 10] <3>
stride: [1, 1, 2] <3>

the results were as shown below.

Screenshot 2024-09-04 231520

Wouldn't it be okay to allow end to be a constant, with the appropriate inference logic?
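
One way to picture such inference logic for constant begin, end, and strides is the sketch below. It is a simplified illustration, not TensorFlow Lite's or ONE's implementation: it assumes positive strides, non-negative indices, and no masks, and the function name `sliced_dims` is made up for this example. It also does not claim to reproduce the result shown in the screenshot.

```python
from typing import List, Optional


def sliced_dims(
    input_shape: List[Optional[int]],
    begin: List[int],
    end: List[int],
    strides: List[int],
) -> List[Optional[int]]:
    """Per-dimension output sizes for constant begin/end/strides.

    Assumes positive strides, non-negative indices, and no masks.
    None stands for an unknown dimension ("?").
    """
    out: List[Optional[int]] = []
    for dim, b, e, s in zip(input_shape, begin, end, strides):
        if dim is None:
            # The input extent is unknown, so the clamped end is unknown too.
            out.append(None)
            continue
        b = min(b, dim)  # clamp to the input extent
        e = min(e, dim)
        out.append(max(0, -(-(e - b) // s)))  # ceil((e - b) / s)
    return out


# The case from this comment: input <Nx3x10>, stride 2 on the last axis.
print(sliced_dims([None, 3, 10], [0, 0, 0], [3, 3, 10], [1, 1, 2]))
```

In this sketch only the dimension whose input extent is unknown stays `?`; the two static dimensions are computed from the constants.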

shs-park commented 2 months ago

Wouldn't it be okay to allow end to be a constant, with the appropriate inference logic?

You are right. It seems that all the shape inference cases are already implemented for the case where begin and end are both constant. You don't need to modify it.

The only problem seems to be this issue - https://github.com/Samsung/ONE/issues/13932#issuecomment-2329169156.

shs-park commented 2 months ago

Wouldn't it be okay to allow end to be a constant, with the appropriate inference logic?

Oh, I misunderstood your question. Sorry 😭

Yes, that's right. But it's also correct that the dimension is ?. The actual shape is going to be determined at runtime anyway, so it'll work fine.

It's up to you to decide what to do, but we usually follow the policy of leaving it as is in this case. 😅

qsunki commented 2 months ago

Conclusion

Regardless of whether the input is static or dynamic, if either begin or end is not constant, the output of StridedSlice should be converted to dynamic shape.

This means all the dimensions should be ?. i.e., if the input rank is 3, <?x?x?>

IMHO, in some cases, part of the output's dimensions can be determined.

Example 1: When the dimension of begin, end, or strides is smaller than the input's rank:

Input shape: <N1x3x3>
begin: [0, 0] <2>
end: [N2, N3] <2>
strides: [1, 1] <2>
output: <?x?x3>

The third dimension can be determined.

Example 2: When a new axis is added:

new_axis_mask=1
Input shape: <N1x3x3>
begin: [0, 0, 0] <3>
end: [N2, N3, N4] <3>
strides: [1, 1, 1] <3>
output: <1x?x?x?>

The added dimension can be determined.
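
Example 1 could be sketched roughly as below. This covers only the case where begin, end, and strides are shorter than the input rank and no masks are set; the function name `partial_output_shape` and the use of `None` for `?` are assumptions for illustration, not existing ONE code.

```python
from typing import List, Optional


def partial_output_shape(
    input_shape: List[Optional[int]], num_slice_specs: int
) -> List[Optional[int]]:
    """Output shape when begin/end are non-constant and cover only leading axes.

    Axes covered by begin/end become unknown (None); the remaining trailing
    axes are not sliced, so they keep the input's dimensions. Masks such as
    new_axis_mask are not handled in this sketch.
    """
    sliced = [None] * num_slice_specs
    untouched = list(input_shape[num_slice_specs:])
    return sliced + untouched


# Example 1: input <N1x3x3>, begin/end/strides of length 2.
print(partial_output_shape([None, 3, 3], 2))
```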

shs-park commented 2 months ago

IMHO, in some cases, part of the output's dimensions can be determined.

@qsunki,

Yes, some dimensions can be calculated as you mentioned.

I think we should first implement the requirements in To-do and then think about this further.

FYI, in the optimization process, not only shape inference but also constant folding is performed, and in this process, the input and end could be determined as constants, so it is possible that the current inference code is already good enough.

For these detailed parts, it would be better to first proceed with the To-do requirement, then create a model corresponding to the actual test case you've mentioned, run it, and proceed further if there is a problem.

glistening commented 2 months ago

I made a test model for this. Please find the below attached file.

@Samsung/ssafy_2024 You can generate tflite using tflchef and existing recipes.

For StridedSlice,

  • Just modify a few bytes to -1.

icodo98 commented 2 months ago

I don't understand how to modify bytes in .recipe

I changed

operand {
  name: "ifm"
  type: FLOAT32
  shape { dim: 1 dim: 3 dim: 3 dim: 2 }
}

to

operand {
  name: "ifm"
  type: FLOAT32
  shape { dim: -1 dim: 3 dim: 3 dim: 2 }
}

then tflchef-file fails with Error parsing text-format tflchef.ModelRecipe: 4:16: Expected integer, got: - .

How can I make a recipe with dynamic shape?

qsunki commented 2 months ago

How can I make a recipe with dynamic shape?

Try this.

  shape {
    dim: 0
    dim: 8
    dim: 0
    dim: 64
  }
  shape_signature {
    dim: -1
    dim: 8
    dim: -1
    dim: 64
  }
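
As a rough illustration of how the two fields in this recipe fragment might be read together, the sketch below treats a -1 entry in shape_signature as an unknown dimension. The function `effective_shape` is hypothetical and is not part of tflchef or the ONE compiler.

```python
from typing import List, Optional


def effective_shape(
    shape: List[int], shape_signature: List[int]
) -> List[Optional[int]]:
    """Combine shape and shape_signature into one shape with unknowns.

    Assumption for this sketch: -1 in shape_signature marks a dynamic
    dimension (returned as None); an empty signature means fully static.
    """
    if not shape_signature:
        return list(shape)
    return [None if sig == -1 else dim for dim, sig in zip(shape, shape_signature)]


# The values from the recipe fragment above.
print(effective_shape([0, 8, 0, 64], [-1, 8, -1, 64]))
```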

glistening commented 2 months ago

Error parsing text-format tflchef.ModelRecipe: 4:16: Expected integer, got: - .

I don't know the details of tflchef. [^1]

[^1]: Basically I am a runtime guy, not frontend.

Out of curiosity, I searched and found tflchef defined the schema as:

https://github.com/Samsung/ONE/blob/72349cca95c720b09c69bc091b1e642cfb2a0b44/compiler/tflchef/proto/tflchef.proto#L41-L47

Please note that TensorShape uses uint32, while ShapeSignature is defined as int32.

I didn't know this. My guide was based on circle schema.

circle schema says:

https://github.com/Samsung/ONE/blob/72349cca95c720b09c69bc091b1e642cfb2a0b44/nnpackage/schema/circle_schema.fbs#L224-L228

It is the reason I said a few bytes.
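
To make the "few bytes" concrete, the sketch below uses Python's standard `struct` module to show what -1 looks like as a 32-bit signed little-endian integer, and why the same value cannot be encoded as an unsigned 32-bit integer. It assumes the dimension is stored as a 4-byte little-endian integer, which is how FlatBuffers lays out `int` scalars; the helper `encode` is made up for this example.

```python
import struct
from typing import Optional


def encode(fmt: str, value: int) -> Optional[bytes]:
    """Pack value with the given struct format, or return None if it does not fit."""
    try:
        return struct.pack(fmt, value)
    except struct.error:
        return None


# "<i" is a little-endian signed 32-bit integer: -1 is four 0xFF bytes.
print(encode("<i", -1))

# "<I" is a little-endian unsigned 32-bit integer: -1 is out of range.
print(encode("<I", -1))
```

The unsigned case is analogous to the tflchef parse error quoted above: a uint32 field has no way to represent -1.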

Please modify a few lines as @qsunki wrote.

(Or you may edit a few bytes by editing circle directly using hex editor.)

shs-park commented 1 month ago

Closing this issue as the related PRs have been merged.