openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0
6.82k stars 2.17k forks source link

[Good First Issue]: Extend ONNX Frontend with Operator Col2Im-18 #20549

Open gkrivor opened 11 months ago

gkrivor commented 11 months ago

Context

Neural networks are graphs consisting of nodes called operators. Each operator corresponds to a mathematical function, usually described in framework's documentation or an AI standard, such as ONNX. OpenVINO ONNX Frontend is a component responsible for working with ONNX graphs and requires implementation of different ONNX operators in order to use ONNX models. This task requires extending OpenVINO ONNX Frontend with Operator Col2Im. Necessary help will be provided by ONNX Fronted team.

What needs to be done?

Operator details can be found in ONNX Operators More details can be found in ONNX Changelog

  1. Create .hpp and .cpp files for *Windows here
  2. Prepare an implementation of this operator in form of a function. Col2Im-18 should be placed in opset 1 namespace.
  3. Register the function in ops_bridge.cpp while keeping alphabetical order
  4. Create test model(s) in ONNX models directory. OpenVINO test infrastructure then converts prototxt files to ONNX models - you will use those models later in tests
  5. Add tests covering all use cases here
  6. Check Python xfailed tests to find a test marked as a xfailed for added functionality. If any exist - remove corresponding lines and try to verify by using cmdline "python -m pytest -k name_of_test". More details in adding operators to ONNX Frontend guide

Example Pull Requests

Resources

Contact points

@gkrivor

Ticket

No response

YaritaiKoto commented 9 months ago

.take

github-actions[bot] commented 9 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

p-wysocki commented 9 months ago

Hello @YaritaiKoto, is there anything we can help you with? Do you have any questions?

YaritaiKoto commented 9 months ago

Hi @p-wysocki, I'm doing well, it's just the end of the semester so I'm busy with uni assignments. I'll resume this the first thing in the next week.

YaritaiKoto commented 9 months ago

HI @p-wysocki and @gkrivor, I found this operator quite complicated to implement with existing OpenVINO operators. I think it's better to discuss my plan with you first, so I won't write garbage code and get some good feedback from more experienced developers.

First, for each image, reorganize each input column to shape $(C, K, K)$ using OpenVINO's Reshape op, where $C$ is the number of channels and $K$ is the size of the kernel. This step will convert a column back to a partial image.

Then, after obtaining those converted partial images, use the Pad operator to zero pad them to the original image's size $(C, H, W)$ and add them together to obtain the full original image. This step feels inefficient, maybe there could be better solutions?

There's another side problem. The ONNX's operator supports the dilations parameter, but I didn't find any existing operator or function to undo the dilation and get the original partial image. Any suggestions?

gkrivor commented 9 months ago

Hi! First part sounds as a plan. Question about dilations - need to dig, or @p-wysocki and @mitruska may have any suggestions...

p-wysocki commented 9 months ago

@mitruska could you please take a look? Do you think we have any way to reverse dilations?

mitruska commented 9 months ago

Hello @YaritaiKoto thank you for your research within this topic!

Then, after obtaining those converted partial images, use the Pad operator to zero pad them to the original image's size $(C, H, W)$ and add them together to obtain the full original image. This step feels inefficient, maybe there could be better solutions?

You can consider Concat operation to bind those parts together without padding.

@mitruska could you please take a look? Do you think we have any way to reverse dilations?

About the dilations, based on the onnx test col2im_dilations looks like that the mid values in the output are filled with zeros, so it could be achievable with some Scatter ops logic. Anyway I have no strict recommendation for that, but you can find some inspirations in the conversion logic of the Im2Col operation (from OV Pytorch FE): https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/pytorch/src/op/im2col.cpp#L30

Also dilations is optional attribute, with 1 as default, so you can try to enable the basic case as a first step:)

It's possible that decomposition of Col2Im will be too complicated to implement it by existing ov ops. Such conclusion of the research will be also valuable. Thanks!

YaritaiKoto commented 9 months ago

@mitruska Thanks for your valuable advice!

You can consider Concat operation to bind those parts together without padding.

I think what's preventing me from using concat is that the partial images that each column converts into, could overlap with each other. E.g., imagine performing convolution on a 5x5 image using a 3x3 kernel with stride 1. In that case, the kernels overlap with each other, therefore if we use concat to assemble the converted columns, it could produce an image larger than the original one.

About the dilations, based on the onnx test col2im_dilations looks like that the mid values in the output are filled with zeros, so it could be achievable with some Scatter ops logic. Anyway I have no strict recommendation for that, but you can find some inspirations in the conversion logic of the Im2Col operation (from OV Pytorch FE): https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/pytorch/src/op/im2col.cpp#L30

Also dilations is optional attribute, with 1 as default, so you can try to enable the basic case as a first step:)

It's possible that decomposition of Col2Im will be too complicated to implement it by existing ov ops. Such conclusion of the research will be also valuable. Thanks!

Thanks! I'll dig into the ONNX's tests and implementation. That's a very useful resource. The reason why im2col's implementation doesn't apply to col2im is that, I think, im2col's picking values from a larger matrix to a smaller one, so it can use gather op. But col2im is adding values from a small matrix to a part of a big matrix, and that is hard to correspond to an existing op. My current opinion is that this op could be able to be represented by a series of ov ops, but it will require very complex transformations such as building a Loop op to add padded matrices. Most importantly, if we use the padding solution to add matrices, this might also cause terrible performance. Maybe the better solution is to add an ov op in the backend instead of trying to decompose it into existing ops.

p-wysocki commented 8 months ago

Hello @YaritaiKoto, please let us know if you have any questions or if you want to discuss the solution. :)

I am happy to announce that we have created a channel dedicated to Good First Issues support on our Intel DevHub Discord server! Join it to receive support, engage in discussions, ask questions and talk to OpenVINO developers.

YaritaiKoto commented 8 months ago

Hi @p-wysocki, thank you for the information, I've joined the server!

As for this operator, I think the issue remains the same - it's complicated and inefficient to convert the col2im operator to existing OV operators and implementing it as a backend operator might be a better solution - in that way, we have control over the actual input data. You can read my last comment for more information.

mlukasze commented 8 months ago

you are right about col2im and it is on the roadmap. however, whatever relies on this op can be implemented anyhow. when col2im will be fully implemented as core operator all its dependencies will benefit from it, but in the meantime we will confirm functional correctness of them.

YaritaiKoto commented 7 months ago

In that case, shall we still implement a version of col2im with a composition of existing OV operators? Since I can't come up with a way of simply un-doing dilations and reconstructing the image, maybe we have to construct a subgraph and use the Loop operator to do those works, which could be very memory and compute inefficient. If the operator will be implemented in the near future, maybe we can just wait for that and simply substitute the ONNX op with the OV op.

DaBaoDIY commented 5 months ago

WLB#+.take

github-actions[bot] commented 5 months ago

Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.

YaritaiKoto commented 5 months ago

Hi maintainers, I'm no longer working on this issue, so you can reassign this issue to other contributors if needed :)

mlukasze commented 5 months ago

Thank you @YaritaiKoto for your time and interest! We are looking forward for further cooperation with you :)

@DaBaoDIY - the task is yours :)

kshitij01042002 commented 5 months ago

Hi @mlukasze if the current contributor isn't working on the issue, can I take this up?

mlukasze commented 5 months ago

hey @DaBaoDIY - do you want to work on this, or you have no time at this point?

p-wysocki commented 5 months ago

Perhaps we should wait for native support for Col2Im in OpenVINO Core to avoid difficulties @YaritaiKoto discovered.

OV Core Col2Im specification: https://github.com/openvinotoolkit/openvino/pull/23947

@YaritaiKoto I'd be thankful for your review, since you're already knowledgeable about the topic :)

p-wysocki commented 2 months ago

A related PR extending CPU Plugin with Col2Im is in review: https://github.com/openvinotoolkit/openvino/pull/24931. As soon as it's merged the task can be picked up again and use Col2Im natively.

gkrivor commented 1 month ago

.take

github-actions[bot] commented 1 month ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.