Update parse_resize.hpp to handle dynamic shapes

CharlieL7 commented 1 year ago

I have found that yolov4 and retinanet both need to parse Resize or Upsample. We will need to update the ONNX parser to handle dynamic shapes as inputs to these operators.

CharlieL7 commented 1 year ago

We will need a new shape operator to handle the ONNX Shape operator with dynamic shapes:
- dimensions_of operator implemented
- A new compiler pass, or an addition to split_single_dyn_dim, is needed to convert unneeded shape operators to literals

parse_resize should be changed to do what is done in find_resize in the simplify_reshapes compiler pass for nearest neighbor interpolation. This method should work for dynamic batch (there may be other limitations that have not yet been considered):
- Broadcasting to new dimensions for upsampling
- Use steps for downsampling

Another possible way to change parse_resize would be to have it work differently for dynamic shapes:
- gather over batch/channel dimensions
- a new operator that gets simplified in split_single_dyn_dim
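As a point of reference for either approach, nearest neighbor resize along one axis reduces to an index computation followed by a gather. The sketch below is plain Python rather than MIGraphX code. The names `nearest_indices` and `resize_1d` are hypothetical, and it assumes the `asymmetric` coordinate transformation with `floor` rounding; other modes would change the index formula.

```python
import math

def nearest_indices(in_len, scale):
    """Input index to read for each output position along one axis.

    Assumption: x_in = floor(x_out / scale), clamped to the last valid index.
    """
    out_len = math.floor(in_len * scale)
    return [min(math.floor(i / scale), in_len - 1) for i in range(out_len)]

def resize_1d(values, scale):
    # A gather along the axis using the computed indices.
    return [values[i] for i in nearest_indices(len(values), scale)]

print(resize_1d([1, 2], 1.5))        # [1, 1, 2]
print(resize_1d([1, 2, 3, 4], 0.5))  # [1, 3]
```

The indices depend on the input length, so with a dynamic dimension they cannot be computed once at parse time for that axis.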

qianqing13579 commented 1 year ago

@CharlieL7, I have developed resize_dynamic. The parse_resize.hpp is here, and the GPU implementation is here.

CharlieL7 commented 1 year ago

@qianqing13579 I will look at it sometime this week or the next.

pfultz2 commented 1 year ago

So for nearest mode we can use reshape/multibroadcast/reshape for upsampling by integral values. Downsampling can be handled using the step operator.
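A minimal sketch of these two building blocks on a 1-D list, in plain Python rather than MIGraphX operators. The helper names `upsample_integral` and `downsample_step` are hypothetical, and the nested lists only imitate what reshape and multibroadcast would do to the shape.

```python
def upsample_integral(values, factor):
    # reshape: [n] -> [n, 1]
    column = [[v] for v in values]
    # multibroadcast: [n, 1] -> [n, factor]
    broadcast = [row * factor for row in column]
    # reshape: [n, factor] -> [n * factor]
    return [v for row in broadcast for v in row]

def downsample_step(values, step):
    # Keep every step-th element, starting at index 0.
    return values[::step]

print(upsample_integral([5, 7], 2))              # [5, 5, 7, 7]
print(downsample_step([0, 1, 2, 3, 4, 5], 3))    # [0, 3]
```

Neither helper needs to know the length in advance, which is the property that matters for a dynamic dimension.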

To handle non-integral resizes, we can rationalize the scale and express it as a ratio of an integral upsample to an integral downsample. So if we want to upsample by 1.5, we can upsample by 3 and downsample by 2.
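One way to rationalize a scale, sketched with Python's standard `fractions` module. The helper name `rationalize_scale` and the `max_denominator` bound of 64 are assumptions for illustration; nothing in this thread fixes how large the factors may get.

```python
from fractions import Fraction

def rationalize_scale(scale, max_denominator=64):
    """Return (upsample, downsample) integers approximating scale."""
    ratio = Fraction(scale).limit_denominator(max_denominator)
    return ratio.numerator, ratio.denominator

print(rationalize_scale(1.5))   # (3, 2)
print(rationalize_scale(0.75))  # (3, 4)
print(rationalize_scale(2.0))   # (2, 1)
```

A scale with no small rational form would give a large upsample factor, so some bound on the denominator is probably needed in practice.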

To handle the different rounding modes, we would use a different pointer offset (similar to slice). We can add these rounding modes to our step operator so that it automatically applies the appropriate pointer offset.

As an example, we can start with a simple tensor of 2 elements:

[1, 2]

So to upsample by 1.5x, we would first upsample it by 3x using reshape/multibroadcast/reshape, which would give us a tensor with these values:

[1, 1, 1, 2, 2, 2]

Then we apply a step of 2, which gives us:

[1, 1, 2]

This is the result using the floor rounding mode. If we want to use ceil rounding mode, then we would offset the pointer by 1 (i.e. number of steps - 1). It's like doing a slice[axes={0},starts={1},ends={5}], which will yield a tensor like this:

[1, 2, 2]

These are the results when using ceil rounding mode. The other rounding modes can be implemented by using different offsets as well (n/2 +/- 1). In this case they don't produce a different result.
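The worked example can be reproduced with a short plain-Python sketch. The helper `resize_nearest_1d` and its `offset` parameter are hypothetical stand-ins for the proposed step operator with a pointer offset. Which offset corresponds to which rounding mode is taken from the description above and would need checking against the ONNX definitions.

```python
def resize_nearest_1d(values, up, down, offset=0):
    # Integral upsample: repeat each element `up` times.
    upsampled = [v for v in values for _ in range(up)]
    # Step with a starting offset: read every `down`-th element from `offset`.
    return upsampled[offset::down]

x = [1, 2]
print(resize_nearest_1d(x, 3, 2, offset=0))  # [1, 1, 2]
print(resize_nearest_1d(x, 3, 2, offset=1))  # [1, 2, 2]
```

This sketch does not clamp at the end of the tensor, so for other lengths or offsets the output can come out one element short of the expected size.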

This should work for both static and dynamic shapes.

bpickrel commented 1 year ago

DynamicResize.docx

bpickrel commented 1 year ago

Draft roadmap attached. This is still incomplete, so please comment and add/change material.

bpickrel commented 1 year ago

I've gone over Paul's Nearest algorithm and it looks good. I think he's inverted the Ceiling and Floor options, but that's a detail that's easily checked out during coding.

The Step operation doesn't yet support dynamic inputs. The others do.

ROCm / AMDMIGraphX

Update parse_resize.hpp to handle dynamic shapes #1670