openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Good First Issue]: Align behavior of ONNX Frontend operator BatchNormalization-6, 9, 14, 15 with original framework #20554

Closed: gkrivor closed this issue 7 months ago

gkrivor commented 1 year ago

Context

Neural networks are graphs consisting of nodes called operators. Each operator corresponds to a mathematical function, usually described in the framework's documentation or in an AI standard such as ONNX. The OpenVINO ONNX Frontend is the component responsible for working with ONNX graphs, and it must implement the various ONNX operators in order to consume ONNX models. This task requires aligning the OpenVINO ONNX Frontend with the original framework implementations of BatchNormalization for the following opsets: opset 6, opset 9, opset 14, opset 15. Necessary help will be provided by the ONNX Frontend team.

What needs to be done?

Operator details can be found in ONNX Operators. More details can be found in the ONNX Changelog: opset 6, opset 9, opset 14, opset 15.

  1. The operator already has a common implementation in OpenVINO. First, review the documentation and prepare a table of differences between the versions. These could be, for instance, a missing attribute, or extended/reduced coverage of an existing attribute, etc.
  2. Copy the existing implementation here and align it with the original framework (extend or reduce the coverage of the common implementation). The modified copy should be placed in the opset where the behavior was introduced, or in opset 1 if it implements the oldest behavior. Example of a multi-opset operation (see the sketch after this list).
  3. Register the function in ops_bridge.cpp, keeping the alphabetical order.
  4. Create test model(s) in the ONNX models directory. The OpenVINO test infrastructure then converts the prototxt files to ONNX models - you will use those models later in tests.
  5. Add tests covering all use cases here.
  6. Check the Python xfailed tests to find any test marked as xfailed for the added functionality. If any exist, remove the corresponding lines and verify with the command line "python -m pytest -k name_of_test". More details are in the adding operators to ONNX Frontend guide.
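
For orientation, the multi-opset layout from steps 2 and 3 looks roughly like the sketch below. The namespace layout and registration lines mirror the pattern used by existing multi-opset operators such as softmax.cpp; treat the exact names (the set_N namespaces, the registration macro) as illustrative and double-check them against the current sources.

// Sketch only: per-opset variants of BatchNormalization, one namespace per
// ONNX opset in which the operator's behavior changed.
namespace ov {
namespace frontend {
namespace onnx {
namespace op {
namespace set_1 {
// BatchNormalization-1/6: legacy attributes such as is_test and spatial.
ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node);
}  // namespace set_1
namespace set_7 {
// BatchNormalization-7/9: is_test removed, then spatial removed.
ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node);
}  // namespace set_7
namespace set_14 {
// BatchNormalization-14/15: training_mode attribute added.
ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node);
}  // namespace set_14
}  // namespace op
}  // namespace onnx
}  // namespace frontend
}  // namespace ov

// In ops_bridge.cpp (kept in alphabetical order); a registration at version N
// handles every opset from N until the next registered version. The exact
// macro name may differ in the current sources:
// REGISTER_OPERATOR("BatchNormalization", 1, batch_norm);
// REGISTER_OPERATOR("BatchNormalization", 7, batch_norm);
// REGISTER_OPERATOR("BatchNormalization", 14, batch_norm);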

Example Pull Requests

No response

Resources

Contact points

@gkrivor

Ticket

No response

dev-seek commented 10 months ago

.take

github-actions[bot] commented 10 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

dev-seek commented 10 months ago

(screenshot of the error)

Why did I get this error, and what actions can I take to resolve it? Please guide me.

p-wysocki commented 10 months ago

Hello @dev-seek, does this happen when running git submodule update --init?

A user here says that it can be fixed by switching the submodule URLs from HTTPS to SSH. Overall, this looks to me like some limitation of your network - another user says that it might be due to a complaining proxy or a timeout.

dev-seek commented 10 months ago

Yep, it occurs when I run git submodule update --init, and SSH didn't work either. What to do now? Is there any other way?

dev-seek commented 10 months ago

Finally, issue resolved ✌

p-wysocki commented 10 months ago

Great! Could you share what the solution was?

dev-seek commented 10 months ago

It was just an issue with my college internet; using my own mobile data solved the problem 😅

dev-seek commented 10 months ago

(screenshot of the files in question)

I didn't completely get point 2 of "What needs to be done". Is it asking for changes in the files pointed to above, or am I wrong somewhere? What does it mean by copying the existing implementation? It's already there, and if it's asking me to make a new file, then what's the need for that? A little bit confused, I need some explanation please.

p-wysocki commented 9 months ago

Could you please take a look @gkrivor?

dev-seek commented 9 months ago

@p-wysocki @gkrivor sorry for the disturbance, but can you please clear up my doubts as soon as possible?

dev-seek commented 9 months ago

"This task requires alignment between OpenVINO ONNX Frontend and original framework implementations of BatchNormalization"

How do I find the original framework implementation? A little bit confused, please help.

Does it mean that, until now, we didn't have an implementation of BatchNormalization that aligns with the original framework implementations?

And also, "frameworks" here means PyTorch, TensorFlow, etc., doesn't it?

dev-seek commented 9 months ago

@p-wysocki can you please re-send the invite to that Discord server, as the old invite has expired?

p-wysocki commented 9 months ago

Of course. It has already been corrected in the template, but existing issues still contain the old link. The permanent link is https://discord.gg/7pVRxUwdWG

gkrivor commented 9 months ago

@dev-seek hi! I'll add a link to an issue with a similar discussion: https://github.com/openvinotoolkit/openvino/issues/20553

Yes, you need to touch both of the files. You should achieve something like this: https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/onnx/frontend/src/op/softmax.cpp, which implements the found differences.

Also, you will need to extend the tests for the exact implementations, like in this PR: https://github.com/openvinotoolkit/openvino/pull/21825

I mean, you need to add a prototxt file and a corresponding unit test to verify correctness.
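
For illustration, a unit test for one of the new opset variants could look roughly like the sketch below. The convert_model/TestCase helpers follow the pattern of existing tests in the ONNX frontend test suite; the model file name and all tensor values are hypothetical placeholders and would need to be computed against the original framework (e.g. with onnxruntime).

// Sketch of a unit test, modeled on existing tests in the ONNX frontend
// test suite. The file name and all values are hypothetical placeholders.
OPENVINO_TEST(${BACKEND_NAME}, onnx_model_batch_norm_opset14) {
    // The .prototxt test model is converted to .onnx by the test infrastructure.
    auto model = convert_model("batch_norm_opset14.onnx");

    auto test_case = ov::test::TestCase(model, s_device);
    test_case.add_input<float>({-1.f, 0.f, 1.f, 2.f, 3.f, 4.f});  // X, shape {1, 1, 2, 3}
    test_case.add_input<float>({1.f});                            // scale
    test_case.add_input<float>({0.f});                            // B (bias)
    test_case.add_input<float>({0.f});                            // input_mean
    test_case.add_input<float>({1.f});                            // input_var
    // Expected values are placeholders - compute them with the reference framework.
    test_case.add_expected_output<float>(Shape{1, 1, 2, 3},
                                         {-1.f, 0.f, 1.f, 2.f, 3.f, 4.f});
    test_case.run();
}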

p-wysocki commented 9 months ago

Hi @dev-seek, are you still working on that issue?

dev-seek commented 9 months ago

Yep, need some time.

p-wysocki commented 8 months ago

Hello @dev-seek, are there any updates?

dev-seek commented 8 months ago

Hey @p-wysocki, really sorry for the delay. Please make this issue open again... I won't get sufficient time for this, as I am busy with something else.

Bepitic commented 8 months ago

.take

github-actions[bot] commented 8 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

Bepitic commented 8 months ago

Here is a list of changes across the versions:

| Feature / Version | BatchNormalization-1 | BatchNormalization-6 | BatchNormalization-7 | BatchNormalization-9 | BatchNormalization-14 | BatchNormalization-15 |
| --- | --- | --- | --- | --- | --- | --- |
| Attributes | consumed_inputs, epsilon, is_test, momentum, spatial | epsilon, is_test, momentum, spatial | epsilon, momentum, spatial | epsilon, momentum | epsilon, momentum, training_mode | epsilon, momentum, training_mode |
| Outputs | Y, mean, var, saved_mean, saved_var (training); Y (test) | Y, mean, var, saved_mean, saved_var (training); Y (test) | Y, mean, var, saved_mean, saved_var (training); Y (test) | Y, mean, var, saved_mean, saved_var (training); Y (test) | Y, running_mean, running_var (training); Y (test) | Y, running_mean, running_var (training); Y (test) |
| Inputs | X, scale, B, mean, var | X, scale, B, mean, var | X, scale, B, mean, var | X, scale, B, mean, var | X, scale, B, input_mean, input_var | X, scale, B, input_mean, input_var |
| Type Constraints | T: tensor(float16), tensor(float), tensor(double) | T: tensor(float16), tensor(float), tensor(double) | T: tensor(float16), tensor(float), tensor(double) | T: tensor(float16), tensor(float), tensor(double) | T: tensor(float16), tensor(float), tensor(double), tensor(bfloat16); U: same as T | T: tensor(float16), tensor(float), tensor(double), tensor(bfloat16); T1: same as T; T2: same as T |
| Changes | Attribute consumed_inputs is a list of ints, kept for legacy purposes | Removed consumed_inputs | Removed is_test | Removed spatial | Added training_mode; changed the name of the type constraint (not the type itself) for mean and var; changed the names of the output variables in training | Changed the name of the type constraint (not the type itself) for mean and var, and also for scale and B (bias) |

Also, I have a question. Should I add logic to the code splitting training and inference, even though training is not implemented? Like:

ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node) {
    ov::OutputVector inputs{node.get_ov_inputs()};
    auto x = inputs.at(0);
    auto scale = inputs.at(1);
    auto bias = inputs.at(2);
    auto mean = inputs.at(3);
    auto var = inputs.at(4);

    double epsilon{node.get_attribute_value<double>("epsilon", 1e-5)};
    // momentum is only needed by the (not yet implemented) training branch
    double momentum{node.get_attribute_value<double>("momentum", 0.9)};
    int64_t training_mode{node.get_attribute_value<int64_t>("training_mode", 0)};  // 0 = inference, 1 = training
    if (training_mode != 0) {
        CHECK_VALID_NODE(node, false, "Training mode of BatchNormalization is not supported.");
        // return {std::make_shared<v5::BatchNormTraining>(x, scale, bias, mean, var, epsilon, momentum)}; // not implemented
    } else {
        CHECK_VALID_NODE(node, node.get_outputs_size() == 1, "Inference mode set in training_mode, but more than one output was requested");
        return {std::make_shared<v5::BatchNormInference>(x, scale, bias, mean, var, epsilon)};
    }
}

Or something without that much logic, like:

ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node) {
    ov::OutputVector inputs{node.get_ov_inputs()};
    auto x = inputs.at(0);
    auto scale = inputs.at(1);
    auto bias = inputs.at(2);
    auto mean = inputs.at(3);
    auto var = inputs.at(4);

    double epsilon{node.get_attribute_value<double>("epsilon", 1e-5)};
    // Attribute "spatial" is ignored, as we only support inference mode of
    // BatchNormalization

    CHECK_VALID_NODE(node, node.get_outputs_size() == 1, "Training mode of BatchNormalization is not supported.");

    return {std::make_shared<v5::BatchNormInference>(x, scale, bias, mean, var, epsilon)};
}

gkrivor commented 8 months ago

Hi @Bepitic,

I suggest extending the check in the second option: CHECK_VALID_NODE(node, training_mode == false && node.get_outputs_size() == 1, "Training mode of BatchNormalization is not supported.");
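
Putting that suggestion together with the second option, the whole function would look roughly like this (a sketch only; training_mode is read with the spec's default of 0, i.e. inference):

ov::OutputVector batch_norm(const ov::frontend::onnx::Node& node) {
    ov::OutputVector inputs{node.get_ov_inputs()};
    auto x = inputs.at(0);
    auto scale = inputs.at(1);
    auto bias = inputs.at(2);
    auto mean = inputs.at(3);
    auto var = inputs.at(4);

    double epsilon{node.get_attribute_value<double>("epsilon", 1e-5)};
    // Per the ONNX spec, training_mode defaults to 0 (inference).
    int64_t training_mode{node.get_attribute_value<int64_t>("training_mode", 0)};

    // Reject training mode and any request for the extra training outputs
    // in a single check, as suggested above.
    CHECK_VALID_NODE(node,
                     training_mode == 0 && node.get_outputs_size() == 1,
                     "Training mode of BatchNormalization is not supported.");

    return {std::make_shared<v5::BatchNormInference>(x, scale, bias, mean, var, epsilon)};
}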